Abstract
The objective of the present study was to investigate item-, examinee-, and country-level correlates of rapid guessing (RG) in the context of the 2018 PISA science assessment. Analyses of data from 267,148 examinees across 71 countries showed that over 50% of examinees engaged in RG on an average of one in 10 items. Descriptive differences were noted between countries in the mean number of RG responses per examinee, with discrepancies as large as 500%. Country-level differences in the odds of engaging in RG were associated with mean performance and regional membership. Furthermore, based on a two-level cross-classified hierarchical linear model, both item- and examinee-level correlates were found to moderate the likelihood of RG. Specifically, the inclusion of items with multimedia content was associated with a decrease in RG. A number of demographic and attitudinal examinee-level variables were also significant moderators, including sex, linguistic background, SES, and self-rated reading comprehension, mastery motivation, and fear of failure. The findings from this study imply that select subgroup comparisons within and across nations may be biased by differential test-taking effort. To mitigate RG in international assessments, future test developers may look to leverage technology-enhanced items.
Author contribution
The first author conceived of the presented idea, identified the datasets, conducted descriptive analyses, and drafted the majority of the article. The second author ran the hierarchical model analyses and assisted with writing the Method section. All authors interpreted findings and conducted critical revisions of the article throughout the review process. Final approval of the version to be published was made by all authors.
Notes
1 Although multiple approaches have been utilized to operationalize potential low test-taking effort, such as self-reported diligence, multivariate outlier analysis, and response pattern inconsistencies (see Meade & Craig, Citation2012), this study utilized rapid guessing because it is the most commonly employed proxy for multiple-choice cognitive assessments, which are the focus of this study (see Silm et al., Citation2020).
2 In 2020, PISA uploaded an additional file, the cognitive items total time/visits data file, which includes a testing time variable based on the total number of item visits (i.e., examinees were allowed to visit an item multiple times). However, analyses demonstrated that in 93% of cases (n = 47,278,588 item visits), examinees visited items only once. Further analyses demonstrated nearly identical median response time per item distributions when calculated based on examinees’ last visit versus their total visits for a given item (see https://www.oecd.org/pisa/data/pisa2018technicalreport/PISA2018-TechReport-Annex-K.pdf). Thus, our utilization of response times based on the last item visit likely had minimal influence on the presented results.
3 Although Wise and Ma (2012) proposed capping the threshold at an upper bound of 10 seconds when 10% of the mean item response time exceeded 10 seconds, no such upper bound was utilized in this study. Rapid responses were simply defined as those faster than 10% of the mean item response time, computed separately for each country.
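The threshold rule in this note can be sketched as follows. This is a minimal illustration of the normative threshold (10% of an item's mean response time, computed per country) rather than the authors' actual code; the function names and data layout are assumptions for the example.

```python
def rg_threshold(response_times):
    """Rapid-guessing threshold for one item within one country:
    10% of the item's mean response time (no upper bound applied)."""
    return 0.10 * (sum(response_times) / len(response_times))

def flag_rapid_guesses(times_by_item):
    """For each item, flag responses faster than its RG threshold.

    times_by_item: dict mapping item ID -> list of response times (seconds)
    Returns: dict mapping item ID -> list of booleans (True = rapid guess)
    """
    flags = {}
    for item, times in times_by_item.items():
        threshold = rg_threshold(times)
        flags[item] = [t < threshold for t in times]
    return flags

# Example: mean response time is 15 s, so the threshold is 1.5 s,
# and only the 1.0 s response is flagged as a rapid guess.
flags = flag_rapid_guesses({"q1": [1.0, 10.0, 20.0, 29.0]})
print(flags["q1"])  # [True, False, False, False]
```

Applied to the PISA data, this rule would be run once per country so that each country's item thresholds reflect its own response time distribution.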