International Journal of Parallel, Emergent and Distributed Systems Volume 34, 2019 - Issue 6: Clouds for Scalable Big Data Processing

Original Articles

Parallel and distributed clustering framework for big spatial data mining

Malika Bendechache (corresponding author), Insight Centre for Data Analytics, University College Dublin, Dublin, Ireland

A-Kamel Tari, University A-Mira of Bejaia, Bejaia, Algeria

M-Tahar Kechadi, Insight Centre for Data Analytics, University College Dublin, Dublin, Ireland

Pages 671-689 | Received 10 Oct 2017, Accepted 23 Feb 2018, Published online: 16 Mar 2018

https://doi.org/10.1080/17445760.2018.1446210

References

Big data and analytics builds the foundation for cognitive; 2017. Available from: http://1www.idc.com/prodserv/4Pillars/bigdata
Han J, Pei J, Kamber M. Introduction. In: Data mining: concepts and techniques. Elsevier; 2011. p. 1–39.
Bacarella D. Distributed clustering algorithm for large scale clustering problems. 2013. Available from: urn:nbn:se:uu:diva-212089
Tsoumakas G, Vlahavas I. Distributed data mining. In: Encyclopedia of Data Warehousing and Mining; 2009.
Fu Y. Distributed data mining: an overview. In: Newsletter of the IEEE Technical Committee on Distributed Processing. Rolla, MO; 2001. p. 5.
Park BH, Kargupta H. Distributed data mining: algorithms, systems, and applications. Baltimore (MD): Citeseer; 2002.
Zeitouni K, Yeh L. Le data mining spatial et les bases de données spatiales. Revue internationale de géomatique. 1999;9:389–423.
Ghosh S. Distributed systems: an algorithmic approach. Boca Raton (FL): Chapman & Hall; 2014.
Aouad L, Le-Khac NA, Kechadi T. Image analysis platform for data management in the meteorological domain. In: 7th Industrial Conference in Data Mining Proceedings. Vol. 4597; Berlin Heidelberg: Springer; 2007. p. 120–134.
Wu X, Zhu X, Wu GQ, et al. Data mining with Big Data. IEEE Trans Knowl Data Eng. 2014;26:97–107.
Rokach L, Schclar A, Itach E. Ensemble methods for multi-label classification. Expert Syst Appl. 2014;41:7507–7523.
Bauer E, Kohavi R. An empirical comparison of voting classification algorithms: Bagging, boosting, and variants. Mach Learn. 1999;36:105–139.
Huang JW, Lin SC, Chen MS. Dpsp: distributed progressive sequential pattern mining on the cloud. Adv Knowl Discovery Data Min. 2010:27–34.
Yang XY, Liu Z, Fu Y. MapReduce as a programming model for association rules algorithm on Hadoop. In: 2010 3rd International Conference on Information Sciences and Interaction Sciences (ICIS). Chengdu, China; 2010. p. 99–102.
Lin X. Mr-apriori: association rules algorithm based on mapreduce. In: 2014 5th IEEE International Conference on Software Engineering and Service Science (ICSESS). Beijing, China; 2014. p. 141–144.
Hsieh LC, Wu GL, Hsu YM, et al. Online image search result grouping with mapreduce-based image clustering and graph construction for large-scale photos. J Visual Commun Image Representation. 2014;25:384–395.
He Y, Tan H, Luo W, et al. Mr-dbscan: a scalable mapreduce-based dbscan algorithm for heavily skewed data. Front Comput Sci. 2014;8:83–99.
Sun T, Shu C, Li F, et al. An efficient hierarchical clustering method for large datasets with map-reduce. In: 2009 International Conference on Parallel and Distributed Computing, Applications and Technologies. Hiroshima, Japan; 2009. p. 494–499.
Kim Y, Shim K, Kim MS, et al. Dbcure-mr: an efficient density-based clustering algorithm for large data using mapreduce. Inf Syst. 2014 Jun;42:15–35.
Bendechache M, Kechadi MT. Distributed clustering algorithm for spatial data mining. In: 2nd International Conference on Spatial Data Mining and Geographical Knowledge Services (ICSDM). Fuzhou, China: IEEE; 2015. p. 60–65.
Bendechache M, Le-Khac NA, Kechadi MT. Efficient large scale clustering based on data partitioning. In: International Conference on Data Science and Advanced Analytics (DSAA); IEEE; 2016. p. 612–621.
Bendechache M, Le-Khac NA, Kechadi MT. Performance evaluation of a distributed clustering approach for spatial datasets. In: 15th International Conference on Australasian Data Mining Conference (AusDM), CRPIT. Melbourne, Australia; 2017.
Lloyd S. Least squares quantization in pcm. IEEE Trans Inf Theory. 1982;28:129–137.
Kaufman L, Rousseeuw PJ. Partitioning around medoids (program pam). In: Finding groups in data: an introduction to cluster analysis. Wiley Online Library; 1990. p. 68–125.
Guha S, Rastogi R, Shim K. CURE: an efficient clustering algorithm for large databases. In: Tiwary A, Franklin M, editors. ACM Sigmod Record. Vol. 27. New York (NY): ACM; 1998. p. 73–84.
Zhang T, Ramakrishnan R, Livny M. BIRCH: an efficient data clustering method for very large databases. In: Widom J, editor. ACM Sigmod Record. Vol. 25. New York (NY): ACM; 1996. p. 103–114.
Ester M, Kriegel HP, Sander J, et al. A density-based algorithm for discovering clusters in large spatial databases with noise. In: Proceedings of the Second International Conference on Knowledge Discovery and Data Mining, KDD’96; Portland, Oregon: AAAI Press; 1996. p. 226–231.
Aouad L, Le-Khac NA, Kechadi T. Lightweight clustering technique for distributed data mining applications. In: Advances in data mining. Theoretical aspects and applications; Springer; 2007. p. 120–134.
Dhillon I, Modha D. A data-clustering algorithm on distributed memory multiprocessor. In: Large-scale parallel data mining, workshop on large-scale parallel KDD systems, SIGKDD. London, UK: Springer-Verlag; 1999. p. 245–260.
Garg A, Mangla A, Bhatnagar V, et al. Pbirch: a scalable parallel clustering algorithm for incremental data. In: 10th International Symposium on Database Engineering and Applications (IDEAS-06). Delhi, India; 2006. p. 315–316.
Geng H, Deng X, Ali H. A new clustering algorithm using message passing and its applications in analyzing microarray data. In: Proceedings. Fourth International Conference on Machine Learning and Applications. Los Angeles (CA): IEEE; 2005. 6 p.
Dhillon ID, Modha DS. A data-clustering algorithm on distributed memory multiprocessors. In: Zaki MJ, Ho C-T, editors. Large-scale parallel data mining. Berlin Heidelberg: Springer; 2000. p. 245–260.
Xu X, Jaeger J, Kriegel HP. A fast parallel clustering algorithm for large spatial databases. Data Min Knowl Discovery Arch. 1999;3:263–290.
Laloux JF, Le-Khac NA, Kechadi MT. Efficient distributed approach for density-based clustering. In: Enabling Technologies: Infrastructure for Collaborative Enterprises (WETICE), 20th IEEE International Workshops. Paris, France; 2011. p. 145–150.
Le Khac NA, Aouad LM, Kechadi MT. Knowledge map layer for distributed data mining. J ISAST Trans Intell Syst. 2008;1:98–107.
Roddick JF, Hornsby K, Spiliopoulou M. An updated bibliography of temporal, spatial, and spatio-temporal data mining research. In: Roddick JF, Hornsby K, editors. Temporal, spatial, and spatio-temporal data mining. Berlin, Heidelberg: Springer; 2001. p. 147–163.
Kivinen J, Mannila H. The power of sampling in knowledge discovery. In: Proceedings of the thirteenth ACM SIGACT-SIGMOD-SIGART symposium on Principles of database systems. Minneapolis (MN): ACM; 1994. p. 77–85.
Compieta P, Di Martino S, Bertolotto M, et al. Exploratory spatio-temporal data mining and visualization. J Visual Lang Comput. 2007;18:255–279.
He Y, Tan H, Luo W, et al. Mr-dbscan: an efficient parallel density-based clustering algorithm using mapreduce. In: 17th International Conference on Parallel and Distributed Systems. Tainan, Taiwan: IEEE; 2011. p. 473–480.
Bendechache M, Le-Khac NA, Kechadi MT. Hierarchical aggregation approach for distributed clustering of spatial datasets. In: 16th International Conference on Data Mining Workshops (ICDMW). Barcelona, Spain: IEEE; 2016. p. 1098–1103.
Chaudhuri A, Chaudhuri B, Parui S. A novel approach to computation of the shape of a dot pattern and extraction of its perceptual border. Comput Vision Image Understanding. 1997;68:257–275.
Melkemi M, Djebali M. Computing the shape of a planar points set. Pattern Recognit. 2000;33:1423–1436.
Duckham M, Kulik L, Worboys M, et al. Efficient generation of simple polygons for characterizing the shape of a set of points in the plane. Vol. 41. New York (NY): Elsevier Science Inc.; 2008. p. 3224–3236.
Zhang T, Ramakrishnan R, Livny M. Birch: an efficient data clustering method for very large databases. In: SIGMOD-96 Proceedings of the 1996 ACM SIGMOD international conference on Management of data. Vol. 25; New York, NY, USA: ACM; 1996. p. 103–114.
Guha S, Rastogi R, Shim K. CURE: an efficient clustering algorithm for large databases. In: Guha S, Rastogi R, Shim K, editors. Information systems. Vol. 26. Oxford: Elsevier Science Ltd.; 2001. p. 35–58.
