ABSTRACT
The OECD’s Programme for International Student Assessment (PISA) has become one of the key studies for evidence-based education policymaking across the globe. PISA has, however, received considerable methodological criticism, including of how the test scores are created. The aim of this paper is to investigate the so-called ‘conditioning model’, where background variables are used to derive student achievement scores, and the impact it has upon the PISA results. This includes varying the background variables used within the conditioning model and analysing the impact of doing so upon countries’ relative positions in the PISA rankings. Our key finding is that the exact specification of the conditioning model matters; cross-country comparisons of PISA scores can change depending upon the statistical methodology used.
Acknowledgments
This project has received funding from the European Union’s Horizon 2020 research and innovation programme under the Marie Skłodowska Curie grant agreement no. 765400.
Disclosure statement
No potential conflict of interest was reported by the author(s).
Supplementary material
Supplemental data for this article can be accessed online at https://doi.org/10.1080/0969594X.2022.2118665
Notes
1. Two of the mathematics item clusters exist in an ‘easy’ and a ‘standard’ version (clusters 6 and 7). Countries with a low expected performance can administer the easy versions instead of the standard versions. This leads to 13 booklets per country in either the easy or standard version, with an overlap of six booklets.
2. The common sample consisted of 500 randomly selected students from each country, except for Liechtenstein (OECD, 2014a, p. 233).
3. As a result, the first and final part of the procedure described above will not be directly replicated in this paper. Rather, the officially published numbers (e.g. values of item difficulties) will be used instead.
4. In the MI literature, it is widely suggested that (in the presence of missing data) the relationship between a variable and the outcome of interest will be attenuated unless that variable is included in the imputation model. This idea is also applied within the conditioning modelling literature, with it being claimed that the relationship between students’ background characteristics and their achievement will be attenuated unless that variable is included in the conditioning model.
5. For the estimation of an IRT model, some assumptions need to be made. There are different approaches to enable the estimation. The approach involving the specification of a density for the latent variables is called the ‘marginal approach’ and is used in PISA.
6. By recoding, we mean altering and transforming the format of the variable without changing the meaning or value of the variables (e.g. contrast/dummy-coding of variables). By pre-processing, we mean altering and transforming the values of the variables (e.g. computing a new questionnaire index by averaging multiple variables or using principal components).
7. The contrast coding for booklets was further tweaked so that the information for students who only answered questions in two domains is based on information from all booklets that have items in a domain (OECD, 2014a, p. 157). Furthermore, the regression coefficients for booklets which covered two of three domains were set to zero for the third domain in the latent regression.
8. The exact details for all recoding can be found in Annexe B in the technical report (OECD, 2014a, pp. 421–431).
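The distinction drawn in note 6 between recoding and pre-processing can be illustrated with a minimal Python sketch. The variable names (`school_type`, `items`), the helper functions and the data are hypothetical and serve only as illustration; they are not taken from the PISA data or from the official procedure. Dummy-coding changes only the format of a variable, whereas averaging several questionnaire items produces new values.

```python
from statistics import mean


def dummy_code(values, reference):
    """Recoding: turn a categorical variable into 0/1 indicators.

    The reference category is omitted, so it is represented by all zeros.
    The information carried by the variable is unchanged.
    """
    categories = sorted(set(values) - {reference})
    return [{f"is_{c}": int(v == c) for c in categories} for v in values]


def average_index(item_responses):
    """Pre-processing: collapse several items into one index per student.

    The resulting values are new; the original item responses cannot be
    recovered from the index.
    """
    return [mean(row) for row in item_responses]


# Hypothetical background data for four students.
school_type = ["public", "private", "public", "other"]
items = [[1, 2, 3], [4, 4, 4], [2, 2, 5], [1, 1, 1]]

recoded = dummy_code(school_type, reference="public")
index = average_index(items)
```

Here `recoded` still identifies each student's school type exactly, while `index` summarises three item responses in a single number, so students with different response patterns can share the same index value.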
Additional information
Funding
Notes on contributors
Laura Raffaella Zieger
Laura Zieger completed her PhD at UCL.
J. Jerrim
John Jerrim is a Professor at UCL.
J. Anders
Jake Anders is an Associate Professor at UCL.
N. Shure
Nikki Shure is an Associate Professor at UCL.