ABSTRACT
We introduce a density-aided clustering method called Skeleton Clustering that can detect clusters of irregular shapes in multivariate and even high-dimensional data. To bypass the curse of dimensionality, we propose surrogate density measures that are less dependent on the dimension yet have intuitive geometric interpretations. The clustering framework constructs a concise representation of the given data as an intermediate step and can be thought of as a combination of prototype methods, density-based clustering, and hierarchical clustering. We show by theoretical analysis and empirical studies that skeleton clustering leads to reliable clusters in multivariate and high-dimensional scenarios. Supplementary materials for this article are available online.
Supplementary Materials
The supplementary materials contain additional theoretical results, proofs, and additional empirical results. An R implementation is also included.
Disclosure Statement
The authors report there are no competing interests to declare.
Funding
Yen-Chi Chen is supported by NSF DMS-195278, 2112907, 2141808, and NIH U24-AG072122. Zeyu Wei is supported by NSF DMS-2112907.