Research Article

A new interpoint distance-based clustering algorithm using kernel density estimation

Received 24 Feb 2022, Accepted 04 Feb 2023, Published online: 15 Feb 2023

