Abstract
Clustering analysis groups similar data according to the distances between data points, often as a step toward further analysis. With big data, however, traditional clustering algorithms are limited by computational time, storage, and memory. Several distributed clustering algorithms have been proposed to address this problem, but previous studies have mainly considered homogeneous data or crisp clustering. In this study, we propose the fuzzy clustering with hierarchical structure (FCHS) algorithm, which is based on fuzzy c-means and the dendrogram, to handle soft clustering over heterogeneous databases. Empirical cases indicate that the results of the FCHS algorithm are consistent with those of central soft clustering and that FCHS can be applied appropriately to heterogeneous databases.