The blind side: Exploring item variance in PISA 2018 cognitive domains

Kseniia Marcq & Johan Braeken
Pages 332-360 | Received 10 Dec 2021, Accepted 14 Jun 2022, Published online: 17 Jul 2022
 

ABSTRACT

Communication of International Large-Scale Assessment (ILSA) results is dominated by reporting average country achievement scores that conceal individual differences between pupils, schools, and items. Educational research primarily focuses on examining differences between pupils and schools, while differences between items are overlooked. Using a variance components model on the Programme for International Student Assessment (PISA) 2018 cognitive domains of reading, mathematics, and science literacy, we estimated how much of the response variation can be attributed to differences between pupils, schools, and items. The results show that uniformly across domains and countries, it mattered more for the correctness of an item response which items were responded to by a pupil (27–35%) than which pupil responded to these items (10–12%) or which school the pupil attended (5–7%). Given the findings, we argue that differences between items in ILSAs constitute a source of substantial untapped potential for secondary research.
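To make the idea of attributing response variation to pupils, schools, and items concrete, the following minimal sketch shows how variance shares are commonly computed once variance components have been estimated. It assumes a logistic mixed model with random intercepts for items, pupils, and schools, and uses the latent-response convention in which the residual variance is fixed at π²/3. Both the model form and that convention are assumptions made here for illustration; the abstract does not specify them. The function name `variance_shares` and the input values are hypothetical and are not estimates from the study.

```python
import math

# Residual variance of the standard logistic distribution.
# Assumption: latent-response formulation of a logistic mixed model.
LOGISTIC_RESIDUAL_VAR = math.pi ** 2 / 3


def variance_shares(var_item, var_pupil, var_school):
    """Return each component's share of the total latent response variance."""
    total = var_item + var_pupil + var_school + LOGISTIC_RESIDUAL_VAR
    return {
        "item": var_item / total,
        "pupil": var_pupil / total,
        "school": var_school / total,
        "residual": LOGISTIC_RESIDUAL_VAR / total,
    }


# Made-up variance components, for illustration only (not results from the study).
shares = variance_shares(var_item=2.0, var_pupil=0.7, var_school=0.4)
for name, share in shares.items():
    print(f"{name}: {share:.1%}")
```

Under this convention the shares sum to one, so the portion not attributed to items, pupils, or schools is the residual response variation.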

Disclosure statement

No potential conflict of interest was reported by the author(s).

Additional information

Notes on contributors

Kseniia Marcq

Kseniia Marcq received her master’s degree in Measurement, Assessment and Evaluation from the University of Oslo, Norway. She is currently a doctoral research fellow at the Centre for Educational Measurement at the University of Oslo, Norway. Her research uses exploratory and meta-analytical approaches to uncover untapped potential in the data of international large-scale assessments.

Johan Braeken

Johan Braeken is a professor of psychometrics at the Centre for Educational Measurement at the University of Oslo, Norway. His research interests are in latent variable modelling, modern test design including adaptive testing, and the information value and data quality in large-scale assessments.