Original Articles

Machine learning classifiers do not improve the prediction of academic risk: Evidence from Australia

Pages 228-246 | Published online: 17 Apr 2020
 

Abstract

Machine learning (ML) methods tend to outperform traditional statistical models at prediction. In the prediction of academic achievement, however, ML models have not shown substantial improvement over logistic regression. So far, these studies have focused almost entirely on college achievement, due to the availability of administrative datasets, and have used relatively small samples by ML standards. In this article, we apply popular ML models to a large dataset (n = 1.2 million) containing primary and middle school performance on a standardized test given annually to Australian students. We show that ML models do not outperform logistic regression for detecting students who will perform in the “below standard” band of achievement upon sitting their next test, even in a large-n setting.

Acknowledgments

Thanks must be given to the Australian Curriculum, Assessment and Reporting Authority for the provision of the data utilized by this study. The authors would like to thank Foivos Diakogiannis and Airong Zhang for their helpful comments and suggestions.

Notes

1 See Shingari, Kumar, and Khetan (2017) for a recent review.

2 See nap.edu.au

3 Samples of these tests may be found at https://www.nap.edu.au/naplan/the-tests.

4 nap.edu.au/results-and-reports/how-to-interpret/score-equivalence-tables

5 In addition to these classifiers, other common choices include k-nearest neighbors and support vector machines. However, these classifiers do not scale well with the number of predictors and the sample size, respectively, and so were omitted from this study.

6 The mixing parameter, α, may also be tuned. However, this option is not included in the glmnet package. We re-estimated the classifiers with α=0.1 and α=0.9 with no change to the results.

7 We performed the analysis with the two previous NAPLAN achievement variables as numerical scores rather than dummies for “at standard” or “below standard” achievement with no change to the results.

8 Employment status of the parents is time varying, but is only collected at the time of enrolment.
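
To make the mixing parameter in note 6 concrete, the following is a minimal sketch of the elastic-net penalty in the parameterization documented for glmnet, where α = 1 gives the lasso and α = 0 gives ridge. The function name `elastic_net_penalty`, the value of λ, and the coefficient vector are illustrative assumptions and are not taken from the article.

```python
def elastic_net_penalty(beta, lam, alpha):
    """Elastic-net penalty: lam * ((1 - alpha) / 2 * ||beta||_2^2 + alpha * ||beta||_1).

    alpha = 1 reduces to the lasso penalty, alpha = 0 to the ridge penalty.
    """
    l1 = sum(abs(b) for b in beta)      # L1 norm of the coefficients
    l2_sq = sum(b * b for b in beta)    # squared L2 norm of the coefficients
    return lam * ((1.0 - alpha) / 2.0 * l2_sq + alpha * l1)


# Hypothetical coefficient vector, for illustration only.
beta = [0.5, -1.0, 0.0, 2.0]

# The two mixing values mentioned in note 6.
for alpha in (0.1, 0.9):
    print(alpha, elastic_net_penalty(beta, lam=1.0, alpha=alpha))
```

Values of α near 0 weight the squared (ridge) term more heavily, while values near 1 weight the absolute-value (lasso) term, which is what drives coefficients exactly to zero.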

Additional information

Funding

This research is supported by an Australian Government Research Training Program (RTP) Scholarship.
