Efficient hiding of confidential high-utility itemsets with minimal side effects

Jerry Chun-Wei Lin (corresponding author)
Fujian Provincial Key Laboratory of Big Data Mining and Applications, Fujian University of Technology, Fujian, China; School of Computer Science and Technology, Harbin Institute of Technology Shenzhen Graduate School, Shenzhen, China.

Tzung-Pei Hong
Department of Computer Science and Information Engineering, National University of Kaohsiung, Kaohsiung, Taiwan; Department of Computer Science and Engineering, National Sun Yat-sen University, Kaohsiung, Taiwan.

Philippe Fournier-Viger
School of Natural Sciences and Humanities, Harbin Institute of Technology Shenzhen Graduate School, Shenzhen, China.
http://orcid.org/0000-0002-7680-9899

Qiankun Liu
School of Computer Science and Technology, Harbin Institute of Technology Shenzhen Graduate School, Shenzhen, China.

Jia-Wei Wong
Department of Computer Science and Engineering, National Sun Yat-sen University, Kaohsiung, Taiwan.

Justin Zhan
Department of Computer Science, University of Nevada, Las Vegas, NV, USA.

Abstract

Privacy-preserving data mining (PPDM) is an emerging research problem that has become critical in recent decades. PPDM consists of hiding sensitive information so that it cannot be discovered by data mining algorithms. Several PPDM algorithms have been developed, most of them designed for hiding sensitive frequent itemsets or association rules. Hiding sensitive information in a database can have several side effects, such as hiding other non-sensitive information and introducing redundant information. Finding the set of itemsets or transactions to sanitise so that side effects are minimised is an NP-hard problem. In this paper, a genetic algorithm (GA) using transaction deletion is designed to hide sensitive high-utility itemsets for privacy-preserving utility mining (PPUM). A flexible fitness function with three adjustable weights is used to evaluate the goodness of each chromosome for hiding sensitive high-utility itemsets. To speed up the evolution process, the pre-large concept is adopted in the designed algorithm, which reduces the number of database scans required to verify the goodness of an evaluated chromosome. Substantial experiments are conducted to compare the designed GA approach (with and without the pre-large concept) against a GA-based approach relying on transaction insertion and a non-evolutionary algorithm, in terms of execution time, side effects, database integrity and utility integrity. Results demonstrate that the proposed algorithm hides sensitive high-utility itemsets with fewer side effects than previous studies, while preserving high database and utility integrity.
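The weighted fitness evaluation mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not name the three weighted terms; the sketch assumes they are the usual PPDM side effects (sensitive itemsets that remain discoverable, non-sensitive itemsets that are lost, and itemsets that newly appear). It also assumes that a chromosome is a set of transaction IDs to delete and that the minimum utility threshold is a ratio of the total database utility. The toy database, the function names and the brute-force miner are hypothetical and are not the authors' implementation.

```python
from itertools import combinations

# Hypothetical toy database: transaction ID -> {item: utility of the item in that transaction}.
DB = {
    1: {"a": 5, "b": 4},
    2: {"a": 5, "c": 3},
    3: {"b": 4, "c": 3},
    4: {"a": 10, "b": 2, "c": 3},
}


def high_utility_itemsets(db, ratio):
    """Brute-force miner (a stand-in for a real algorithm): return every itemset
    whose utility is at least ratio * total utility of the database."""
    total = sum(sum(t.values()) for t in db.values())
    threshold = ratio * total
    items = sorted({i for t in db.values() for i in t})
    result = set()
    for k in range(1, len(items) + 1):
        for combo in combinations(items, k):
            # Utility of an itemset: sum over the transactions that contain all its items.
            utility = sum(
                sum(t[i] for i in combo)
                for t in db.values()
                if all(i in t for i in combo)
            )
            if utility >= threshold:
                result.add(frozenset(combo))
    return result


def fitness(db, deleted, sensitive, ratio, weights=(1.0, 1.0, 1.0)):
    """Weighted count of the three assumed side effects; lower is better."""
    before = high_utility_itemsets(db, ratio)
    sanitised = {tid: t for tid, t in db.items() if tid not in deleted}
    after = high_utility_itemsets(sanitised, ratio)
    hiding_failures = sensitive & after        # sensitive itemsets still discoverable
    missing = (before - sensitive) - after     # non-sensitive itemsets lost
    artificial = after - before                # itemsets that newly became high-utility
    w1, w2, w3 = weights
    return w1 * len(hiding_failures) + w2 * len(missing) + w3 * len(artificial)


sensitive = {frozenset("ab")}
print(fitness(DB, set(), sensitive, 0.4, weights=(10, 1, 1)))  # nothing deleted
print(fitness(DB, {4}, sensitive, 0.4, weights=(10, 1, 1)))    # delete transaction 4
```

In this toy setting, deleting transaction 4 hides the sensitive itemset {a, b} at the cost of also losing {a, c}, so a large first weight makes that chromosome preferable to deleting nothing. Raising the second or third weight instead would penalise candidates that damage non-sensitive results.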

Notes

No potential conflict of interest was reported by the authors.

Additional information

Funding

This work was partially supported by the open fund of Fujian Provincial Key Laboratory of Big Data Mining and Applications (Fujian University of Technology); the National Natural Science Foundation of China (NSFC) under grant number 61503092; and the Tencent Project under grant number CCF-Tencent IAGR20160115.
