ABSTRACT
This study aims to identify the main determinants of student performance in reading and maths across eight European Union countries (i.e. Austria, Croatia, Germany, Hungary, Italy, Portugal, Slovakia, and Slovenia). Using student-level data from the OECD-PISA 2018 survey and applying efficient variable selection algorithms, we show that the number of books at home or a variable combining the type and location of school represent the most important predictors of students’ performance in all the analysed countries, while other school characteristics are rarely relevant. Econometric results show that students attending vocational schools perform significantly worse than those in general schools. Differences between students attending schools in big cities and those in small cities are never statistically significant except in Portugal. Through the Gelbach decomposition method, which measures the relative importance of observable characteristics in explaining a gap, we show that the differences in test scores between big and small cities depend on the schools’ characteristics, while the differences between general and vocational schools are mainly explained by the families’ social status. The results appear robust to a hierarchical model approach.
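As an illustration of the Gelbach decomposition mentioned in the abstract, the following is a minimal sketch on simulated data, not the estimation carried out in the study. It relies on the standard in-sample identity that the change in the coefficient of interest between a base and a full OLS model equals the sum, over the added covariates, of each covariate's full-model coefficient times its coefficient from an auxiliary regression on the variable of interest. All names (`gelbach_decomposition`, `vocational`, `ses`, `school_quality`) and all numbers are hypothetical.

```python
import numpy as np


def ols(X, y):
    """Return OLS coefficients of y on X (X already contains a constant)."""
    return np.linalg.lstsq(X, y, rcond=None)[0]


def gelbach_decomposition(x1, X2, y):
    """Decompose the change in the coefficient on x1 when the covariates
    in X2 (an n-by-K array) are added to a regression of y on x1.

    Returns the base coefficient, the full-model coefficient, and one
    contribution per column of X2. The contributions sum to base - full.
    """
    const = np.ones(len(y))
    X_base = np.column_stack([const, x1])
    X_full = np.column_stack([X_base, X2])

    b_base = ols(X_base, y)   # [intercept, coefficient on x1]
    b_full = ols(X_full, y)   # [intercept, coefficient on x1, coefficients on X2]
    beta2 = b_full[2:]

    # Auxiliary regressions: each column of X2 on the constant and x1.
    gamma = ols(X_base, X2)   # shape (2, K); row 1 holds the slopes on x1

    contributions = gamma[1, :] * beta2
    return b_base[1], b_full[1], contributions


# Simulated (hypothetical) data: a score gap between two school types that
# is partly explained by family status and partly by a school characteristic.
rng = np.random.default_rng(0)
n = 2000
vocational = rng.integers(0, 2, n).astype(float)
ses = -0.8 * vocational + rng.normal(size=n)
school_quality = -0.3 * vocational + rng.normal(size=n)
score = 500 - 5 * vocational + 20 * ses + 8 * school_quality + rng.normal(scale=30, size=n)

base, full, contrib = gelbach_decomposition(
    vocational, np.column_stack([ses, school_quality]), score
)
explained_gap = base - full
shares = contrib / explained_gap  # relative importance of each covariate
```

Unlike sequentially adding covariates, this decomposition does not depend on the order in which the covariates are listed, which is why it is suited to ranking their relative importance in explaining a gap.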
Disclosure statement
No potential conflict of interest was reported by the author(s).
Supplementary Material
Supplemental data for this article can be accessed online at https://doi.org/10.1080/00036846.2023.2170968
Data availability statement
Data are freely available here: https://www.oecd.org/pisa/.
Notes
1 For more details see: https://www.oecd.org/pisa/.
2 Starting from the sample containing all the European Union countries, we excluded those countries whose school systems begin tracking after the age of 15 (source: https://eacea.ec.europa.eu/national-policies/eurydice/national-description_en) and those in which the maths or reading performance of students still in lower secondary education is statistically different from that of students already in upper secondary school.
3 To be clear, the selection of variables does not depend on the magnitude of their coefficients but on the relative importance they have in explaining the heterogeneity of the dependent variable.
4 To perform the best subsets variable selection, we used the gvselect Stata command created by Charles Lindsey and Simon Sheather.
5 The highest parental occupational status is measured by the HISEI variable. For a deeper explanation of its construction see: https://www.oecd-ilibrary.org/sites/0a428b07-en/index.html?itemId=/content/component/0a428b07-en.
6 The PISA survey dataset also provides a composite index summarising all the items owned by a household. The number of books at home is one of the items composing this index, so the correlation between the two variables is strong. Moreover, once this composite index is included in the basket of predictors, the efficient predictor selection algorithms include it in the best-fitting models in one country only (Italy), while the number of books at home is preferred in all other cases. More details are available from the authors upon request.
7 Among the variables provided by the PISA survey dataset that are somewhat related to students’ performance, we exclude here having repeated at least one school year, the percentage of full professors, the percentage of qualified professors, the percentage of government expenditure, and the percentage of student fees, because of their large share of missing values.
8 Note that, while in this application of the adopted methodology all best-fitting models are nested as the number of predictors increases, this need not hold in other applications.
9 It should be considered, however, that the two variables related to the presence of digital devices at home are expected to be somewhat correlated. For this reason, as an additional analysis, we replicated the econometric analysis, including in the model specification an interaction term between the ‘E-book/tablet at home’ variable and the ‘Computer at home’ one. Table A2 in the Appendix shows that the coefficient of the ‘E-book/tablet at home’ variable is not biased by the correlation with the ‘Computer at home’ variable, but the former ceases to be significant in most cases once the interaction term is included. The co-presence of a computer and a tablet (or similar) at home appears to have a significant and positive effect on students’ scores in Slovenia (reading) and Hungary (mathematics) only.
10 The effect of the ‘guidance’ variable appears counterintuitive here, as this kind of service should improve rather than worsen students’ performance. Nonetheless, it is possible that these services are activated only in those schools where students’ performance is known to be low. This common policy strategy would help to explain (at least partially) why students’ performance is lower on average in the presence of guidance services.
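To make the best subsets variable selection referred to in Notes 3, 4 and 8 more concrete, the sketch below runs an exhaustive search on simulated data. It is only an illustration under stated assumptions: it is not the gvselect command used in the study, the function names (`rss`, `best_subsets`) and predictor names are hypothetical, and exhaustive enumeration is feasible only for a small number of candidate predictors. For each model size, the subset retained is the one that leaves the smallest residual sum of squares, i.e. the one that explains most of the variation in the dependent variable, irrespective of the magnitude of the coefficients.

```python
from itertools import combinations

import numpy as np


def rss(X, y):
    """Residual sum of squares from an OLS fit of y on X plus a constant."""
    Xc = np.column_stack([np.ones(len(y)), X])
    beta = np.linalg.lstsq(Xc, y, rcond=None)[0]
    resid = y - Xc @ beta
    return float(resid @ resid)


def best_subsets(X, y, names):
    """For each model size k, return the names of the k predictors
    whose OLS fit has the lowest residual sum of squares."""
    p = X.shape[1]
    best = {}
    for k in range(1, p + 1):
        winner = min(
            combinations(range(p), k),
            key=lambda idx: rss(X[:, list(idx)], y),
        )
        best[k] = tuple(names[j] for j in winner)
    return best


# Simulated (hypothetical) data: two informative predictors, two pure noise.
rng = np.random.default_rng(1)
n = 500
names = ["books", "school_type", "noise_a", "noise_b"]
X = rng.normal(size=(n, 4))
y = 3.0 * X[:, 0] + 1.0 * X[:, 1] + rng.normal(size=n)

best = best_subsets(X, y, names)

# Check whether the best models are nested as the model size grows
# (the property discussed in Note 8; it need not hold in general).
nested = all(set(best[k]) <= set(best[k + 1]) for k in range(1, len(names)))
```

In this simulated example the best models happen to be nested, but with correlated predictors the best model of size k + 1 may drop a variable that appeared in the best model of size k.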