455
Views
22
CrossRef citations to date
0
Altmetric
Original Articles

Parallel and distributed clustering framework for big spatial data mining

, &
Pages 671-689 | Received 10 Oct 2017, Accepted 23 Feb 2018, Published online: 16 Mar 2018
 

ABSTRACT

Clustering techniques are very attractive for identifying and extracting patterns of interests from datasets. However, their application to very large spatial datasets presents numerous challenges such as high-dimensionality, heterogeneity, and high complexity of some algorithms. Distributed clustering techniques constitute a very good alternative to the Big Data challenges (e.g., Volume, Variety, Veracity, and Velocity). In this paper, we developed and implemented a Dynamic Parallel and Distributed clustering (DPDC) approach that can analyse Big Data within a reasonable response time and produce accurate results, by using existing and current computing and storage infrastructure, such as cloud computing. The DPDC approach consists of two phases. The first phase is fully parallel and it generates local clusters and the second phase aggregates the local results to obtain global clusters. The aggregation phase is designed in such a way that the final clusters are compact and accurate while the overall process is efficient in time and memory allocation. DPDC was thoroughly tested and compared to well-known clustering algorithms BIRCH and CURE. The results show that the approach not only produces high-quality results but also scales up very well by taking advantage of the Hadoop MapReduce paradigm or any distributed system.

GRAPHICAL ABSTRACT

Disclosure statement

No potential conflict of interest was reported by the authors.

Additional information

Funding

The research work is conducted in the Insight Centre for Data Analytics, which is supported by Science Foundation Ireland [grant number SFI/12/RC/2289].

Log in via your institution

Log in to Taylor & Francis Online

PDF download + Online access

  • 48 hours access to article PDF & online version
  • Article PDF can be downloaded
  • Article PDF can be printed
USD 61.00 Add to cart

Issue Purchase

  • 30 days online access to complete issue
  • Article PDFs can be downloaded
  • Article PDFs can be printed
USD 763.00 Add to cart

* Local tax will be added as applicable

Related Research

People also read lists articles that other readers of this article have read.

Recommended articles lists articles that we recommend and is powered by our AI driven recommendation engine.

Cited by lists all citing articles based on Crossref citations.
Articles with the Crossref icon will open in a new tab.