Research Article

On impurity functions in decision trees

Received 02 Aug 2022, Accepted 06 Feb 2024, Published online: 04 Mar 2024

