Original Articles

The Effectiveness of Data Shuffling for Privacy-Preserving Data Mining Applications

Pages 3-17 | Published online: 07 Jul 2014
 

Abstract

Preserving the confidentiality of sensitive data while permitting knowledge discovery is an important goal in privacy-preserving data mining. This paper investigates the effectiveness of data shuffling for classification tree and regression analysis. We compare data shuffling with the tree-based data perturbation method, which was developed specifically for data mining. Results suggest that data shuffling provides higher levels of data security and preserves data mining knowledge more effectively than the tree-based data perturbation method.

Additional information

Notes on contributors

Han Li

Han Li is currently an assistant professor in the School of Business Administration at Minnesota State University Moorhead. She received her doctorate in Management Information Systems from Oklahoma State University. She has published in Decision Support Systems, Operations Research, European Journal of Information Systems, Journal of Computer Information Systems, Information Management & Computer Security, and Journal of Information Privacy and Security. Her current research interests include health IT, privacy and confidentiality, data and information security, and the adoption of information technology.

Krishnamurty Muralidhar

Krish Muralidhar is Gatton research professor at the School of Management, University of Kentucky. He received his PhD from Texas A&M University. His primary research interest is in data privacy and related areas. His research has appeared in journals such as ACM Transactions on Database Systems, Information Systems Research, Journal of Management, Management Science, and Operations Research.

Rathindra Sarathy

Rathindra Sarathy is Ardmore Chair and Professor of Information Systems in the Department of Management Science and Information Systems at Oklahoma State University. He received his PhD from Texas A&M University. His research interests include database confidentiality, distributed databases, and e-commerce. His work has appeared in journals such as ACM Transactions on Database Systems, Decision Sciences, Decision Support Systems, European Journal of Operational Research, Information Systems Research, Management Science, and Operations Research.
