ABSTRACT
The Regression Tree Method is not yet a mainstream method in Education, despite being a traditional approach in Machine Learning. We argue that it should become mainstream in Education because, in our view, it is the most suitable method for analysing the complex datasets that are very common in the field. Large-scale governmental educational databases are one example, in particular those in which the data: (1) comprise a large number and variety of variables; (2) include many categorical variables with many categories; (3) exhibit many non-linear relationships among variables; and (4) are guided or supported by management goals rather than by a specific theory. In this paper we present the rationale of the method, focusing on the Classification And Regression Trees (CART) algorithm. We also apply this algorithm to a complex large-scale educational dataset, the microdata of the National Examination for Secondary Education (Exame Nacional do Ensino Médio [ENEM]). Our general goal is to disseminate the use of the Regression Tree Method in Education, particularly for complex datasets, with emphasis on the substantive and interpretative aspects of the method.
Acknowledgements
Cristiano Mauro Assis Gomes: Productivity Fellowship, CNPq Brazil. The authors wish to thank Professor Ivan Bezerra Allaman for his help in preparing some of the figures according to the journal's standards.
Disclosure Statement
No potential conflict of interest was reported by the authors.
ORCID
Cristiano Mauro Assis Gomes http://orcid.org/0000-0003-3939-5807
Enio Jelihovschi http://orcid.org/0000-0002-7286-1198