
International Journal of Computers and Applications Volume 44, 2022 - Issue 12: Artificial Intelligence for Sustainable Internet Research. Guest Editors: Dr. H. Anandakumar, Dr. Muhammad Sharif and Dr. Sri Devi Ravana


Subspace-based aggregation for enhancing utility, information measures, and cluster identification in privacy preserved data mining on high-dimensional continuous data

Shashidhar Virupaksha (a) Department of CSE, VFSTR (Deemed to be University), Guntur, India; (b) Department of CSE, Presidency University, Bengaluru, India. Correspondence: [email protected]

https://orcid.org/0000-0002-3116-541X

D. Venkatesulu (a) Department of CSE, VFSTR (Deemed to be University), Guntur, India

https://orcid.org/0000-0002-8053-0102

Pages 1130-1139 | Received 07 Aug 2019, Accepted 24 Oct 2019, Published online: 18 Nov 2019

https://doi.org/10.1080/1206212X.2019.1686211

ABSTRACT

Clustering is a data mining technique that has been used effectively for knowledge extraction over the last few decades. Privacy is a major concern when releasing data for clustering, and privacy-preserving data mining (PPDM) algorithms have been developed to address it. Aggregation is a popular PPDM technique. In recent years, however, certain applications have required data mining on high-dimensional data. Existing privacy preservation techniques perform aggregation in a univariate manner along each dimension, which degrades utility measures, information measures, and especially the retention of the original clusters. This paper proposes a new technique called subspace-based aggregation (SBA). SBA categorizes the dimensions into dense and non-dense subspaces based on the density of points, and aggregation is performed separately for the dense and non-dense subspaces. This approach helps to maximize utility measures, information measures, and retention of clusters. SBA is run on high-dimensional continuous datasets from the UCI Machine Learning Repository and compared with related methods such as SINGLE, SIMPLE, MDAV, and PPPCA. SBA provides an improvement of 66% in utility, 400% in cluster identification, and 5% in covariance and standard deviation.
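The contrast the abstract draws, between aggregating each dimension on its own and aggregating within subspaces, can be illustrated with a minimal Python sketch. This is not the authors' algorithm: the abstract does not state how density is measured or how records are grouped, so the histogram-based density score, the `bins` and `threshold` parameters, the ordering of records along the first principal direction, and all function names below are assumptions introduced purely for illustration.

```python
import numpy as np


def _groups(order, k):
    """Split an ordering of n record indices into groups of at least k records."""
    n_groups = max(1, len(order) // k)
    return np.array_split(order, n_groups)


def univariate_aggregate(X, k):
    """Baseline: aggregate every dimension independently.

    Each column is sorted, cut into groups of at least k values,
    and every value is replaced by its group mean.
    """
    X = np.asarray(X, dtype=float)
    out = np.empty_like(X)
    for j in range(X.shape[1]):
        order = np.argsort(X[:, j], kind="stable")
        for idx in _groups(order, k):
            out[idx, j] = X[idx, j].mean()
    return out


def split_dimensions(X, bins=10, threshold=0.25):
    """Label each dimension dense or non-dense.

    ASSUMPTION: a dimension counts as dense when its most populated
    histogram bin holds at least `threshold` of the points. The paper's
    actual density criterion is not given in the abstract.
    """
    X = np.asarray(X, dtype=float)
    n = X.shape[0]
    scores = np.array(
        [np.histogram(X[:, j], bins=bins)[0].max() / n for j in range(X.shape[1])]
    )
    dense = np.flatnonzero(scores >= threshold)
    non_dense = np.flatnonzero(scores < threshold)
    return dense, non_dense


def aggregate_subspace(X_sub, k):
    """Aggregate records jointly over all dimensions of one subspace.

    ASSUMPTION: records are ordered by their projection onto the first
    principal direction, then grouped and replaced by the group centroid.
    """
    X_sub = np.asarray(X_sub, dtype=float)
    centered = X_sub - X_sub.mean(axis=0)
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    order = np.argsort(centered @ vt[0], kind="stable")
    out = np.empty_like(X_sub)
    for idx in _groups(order, k):
        out[idx] = X_sub[idx].mean(axis=0)
    return out


def sba_sketch(X, k, bins=10, threshold=0.25):
    """Aggregate the dense and non-dense subspaces separately."""
    X = np.asarray(X, dtype=float)
    dense, non_dense = split_dimensions(X, bins=bins, threshold=threshold)
    out = X.copy()
    for cols in (dense, non_dense):
        if cols.size:
            out[:, cols] = aggregate_subspace(X[:, cols], k)
    return out, dense, non_dense
```

Because every record is replaced by the mean of its group, both functions preserve the column means of the original data. The difference is that `univariate_aggregate` groups records differently in every column, which can break the joint structure that clustering depends on, whereas `sba_sketch` keeps each record in a single group within a subspace.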

KEYWORDS:

Privacy preservation
Privacy preserved data mining
Data privacy

Disclosure statement

No potential conflict of interest was reported by the authors.

Additional information

Notes on contributors

Shashidhar Virupaksha

Shashidhar Virupaksha received his BTech degree in Computer Science Engineering from SASTRA University and his Masters in Computer Science Engineering from BIT Mesra. He worked at WIPRO Technologies on information security and data privacy projects, where he was involved in finding critical flaws in ESSO Software. He was Head of the Department of CSE at VLITS Guntur and presently works at Presidency University, Bengaluru. He is pursuing his PhD at VFSTR Deemed to be University, Guntur, Andhra Pradesh. He has publications in Springer and IEEE conferences.

D. Venkatesulu

D. Venkatesulu received his MTech degree (1988) from Andhra University, Visakhapatnam and his PhD (1999) from IIT Madras. He worked in the IT industry for 18 years and is currently Professor and Head of the Department of Computer Science and Engineering, VFSTR Deemed to be University, Guntur, Andhra Pradesh. His areas of interest include distributed systems, data mining, and wireless sensor networks.



