Journal of Applied Statistics Volume 51, 2024 - Issue 4

TreeKDE: clustering multivariate data based on decision tree and using one-dimensional kernel density estimation

D. Scaldelai, Colegiado de Matemática, Universidade Estadual do Paraná–UNESPAR, Campo Mourão, Brazil. Correspondence: [email protected]
https://orcid.org/0000-0003-2988-2716

L. C. Matioli, Departamento de Matemática, Universidade Federal do Paraná–UFPR, Curitiba, Brazil
https://orcid.org/0000-0002-6506-3550

S. R. Santos, Colegiado de Matemática, Universidade Estadual do Paraná–UNESPAR, Campo Mourão, Brazil
https://orcid.org/0000-0002-0509-9738

Pages 740-758 | Received 09 Nov 2021, Accepted 12 Dec 2022, Published online: 22 Dec 2022

https://doi.org/10.1080/02664763.2022.2159339

References

R. Agrawal, J. Gehrke, D. Gunopulos, and P. Raghavan, Automatic subspace clustering of high dimensional data, Data Min. Knowl. Discov. 11 (2005), pp. 5–33. doi: 10.1007/s10618-005-1396-1.
M. Ankerst, M.M. Breunig, H.P. Kriegel, and J. Sander, OPTICS, ACM SIGMOD Rec. 28 (1999), pp. 49–60. doi: 10.1145/304181.304187.
A. Asuncion and D. Newman, UCI Machine Learning Repository (2007). Available at http://archive.ics.uci.edu/ml.
J.E. Chacón and T. Duong, Multivariate Kernel Smoothing and Its Applications, CRC Press, 2018.
S.K. Chinnamgari, R Machine Learning Projects: Implement Supervised, Unsupervised, and Reinforcement Learning Techniques Using R 3.5, Packt Publishing Ltd, 2019.
D. Defays, An efficient algorithm for a complete link method, Comput. J. 20 (1977), pp. 364–366. doi: 10.1093/comjnl/20.4.364.
M. Ester, H.P. Kriegel, J. Sander, and X. Xu, A density-based algorithm for discovering clusters in large spatial databases with noise, in KDD, Vol. 96, 1996, pp. 226–231.
G. Gan, C. Ma, and J. Wu, Data Clustering: Theory, Algorithms, and Applications, SIAM, 2020.
A. Gramacki, Nonparametric Kernel Density Estimation and Its Computational Aspects, Springer, 2018. doi: 10.1007/978-3-319-71688-6.
J. Han, M. Kamber, and J. Pei, Data Mining Concepts and Techniques, 3rd ed., Morgan Kaufmann, Waltham-USA, 2011.
T. Hastie, R. Tibshirani, and J. Friedman, The Elements of Statistical Learning: Data Mining, Inference, and Prediction, Springer Science & Business Media, 2009.
H. Huang, C. Ding, D. Luo, and T. Li, Simultaneous tensor subspace selection and clustering: The equivalence of high order SVD and k-means clustering, in Proceedings of the 14th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, 2008, pp. 327–335. doi: 10.1145/1401890.1401933.
L. Hubert and P. Arabie, Comparing partitions, J. Classif. 2 (1985), pp. 193–218.
S. Itani, F. Lecron, and P. Fortemps, A one-class classification decision tree based on kernel density estimation, Appl. Soft Comput. 91 (2020), pp. 106250.
A. Kassambara, Practical Guide to Cluster Analysis in R: Unsupervised Machine Learning, Vol. 1, STHDA, 2017.
L. Kaufman and P. Rousseeuw, Clustering large data sets, in Pattern Recognition in Practice II (1986), E. S. Gelsema and L. N. Kanal, eds., North-Holland, 1986, pp. 425–437.
L. Kaufman and P.J. Rousseeuw, Clustering by means of medoids, in Statistical Data Analysis Based on the l1 Norm, Y. Dodge, ed., 1987, pp. 405–416.
L. Kaufman and P.J. Rousseeuw, Finding Groups in Data: An Introduction to Cluster Analysis, 9th ed., John Wiley & Sons, 1990. doi: 10.1002/9780470316801.
M. Kretowski, Evolutionary Decision Trees in Large-Scale Data Mining, Springer, 2019.
P. Kulczycki and M. Charytanowicz, A complete gradient clustering algorithm formed with kernel estimators, Int. J. Appl. Math. Comput. Sci. 20 (2010), pp. 123–134. doi: 10.2478/v10006-010-0009-3.
B. Liu, Y. Xia, and P.S. Yu, Clustering through decision tree construction, in Proceedings of the Ninth International Conference on Information and Knowledge Management, 2000, pp. 20–29.
S. Łukasik, Parallel computing of kernel density estimates with MPI, in International Conference on Computational Science, Springer, 2007, pp. 726–733.
J. MacQueen, Some methods for classification and analysis of multivariate observations, in Proceedings of the Fifth Berkeley Symposium on Mathematical Statistics and Probability, Vol. 1, Oakland, CA, USA, 1967, pp. 281–297.
O.Z. Maimon and L. Rokach, Data Mining with Decision Trees: Theory and Applications, Vol. 81, World Scientific, 2014.
L.C. Matioli, S. Santos, M. Kleina, and E.A. Leite, A new algorithm for clustering based on kernel density estimation, J. Appl. Stat. 45 (2017), pp. 347–366. doi: 10.1080/02664763.2016.1277191.
G. Menardi and A. Azzalini, An advancement in clustering via nonparametric density estimation, Stat. Comput. 24 (2014), pp. 753–767. doi: 10.1007/s11222-013-9400-x.
R Core Team, R: A Language and Environment for Statistical Computing, R Foundation for Statistical Computing, Vienna, Austria, 2019. Available at https://www.R-project.org/.
P. Ram and A.G. Gray, Density estimation trees, in Proceedings of the 17th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, 2011, pp. 627–635.
A. Rodriguez and A. Laio, Clustering by fast search and find of density peaks, Science 344 (2014), pp. 1492–1496. doi: 10.1126/science.1242072.
P.J. Rousseeuw, Silhouettes: A graphical aid to the interpretation and validation of cluster analysis, J. Comput. Appl. Math. 20 (1987), pp. 53–65. doi: 10.1016/0377-0427(87)90125-7.
D. Scaldelai, L.C. Matioli, S.R. Santos, and M. Kleina, MulticlusterKDE: A new algorithm for clustering based on multivariate kernel density estimation, J. Appl. Stat. 49 (2022), pp. 98–121. doi: 10.1080/02664763.2020.1799958.
D.W. Scott, Multivariate Density Estimation: Theory, Practice, and Visualization, 2nd ed., John Wiley & Sons, 2015.
L. Scrucca, M. Fop, T.B. Murphy, and A.E. Raftery, mclust 5: Clustering, classification and density estimation using Gaussian finite mixture models, R J. 8 (2016), pp. 289–317. doi: 10.32614/RJ-2016-021.
R. Sibson, Slink: An optimally efficient algorithm for the single-link cluster method, Comput. J. 16 (1973), pp. 30–34. doi: 10.1093/comjnl/16.1.30.
B.W. Silverman, Density Estimation for Statistics and Data Analysis, Vol. 26, CRC Press, 1986.
P. Smyth, A. Gray, and U.M. Fayyad, Retrofitting decision tree classifiers using kernel density estimation, in Machine Learning Proceedings 1995, Elsevier, 1995, pp. 506–514.
W.W. Sun and L. Li, Dynamic tensor clustering, J. Am. Stat. Assoc. 114 (2019), pp. 1894–1907. doi: 10.1080/01621459.2018.1527701.
M.P. Wand and M.C. Jones, Kernel Smoothing, Chapman and Hall/CRC, 1994.
W. Wang, J. Yang, and R. Muntz, STING: A statistical information grid approach to spatial data mining, in VLDB, Vol. 97, 1997, pp. 186–195.
I.H. Witten and E. Frank, Data mining: Practical machine learning tools and techniques with Java implementations, ACM SIGMOD Rec. 31 (2002), pp. 76–77.
J. Wu, Z. Lin, and H. Zha, Essential tensor learning for multi-view spectral clustering, IEEE Trans. Image Process. 28 (2019), pp. 5910–5922. doi: 10.1109/TIP.2019.2916740.
J. Xie, H. Gao, W. Xie, X. Liu, and P.W. Grant, Robust clustering by detecting density peaks and assigning points based on fuzzy weighted k-nearest neighbors, Inf. Sci. 354 (2016), pp. 19–40. doi: 10.1016/j.ins.2016.03.011.
