Abstract
Introduction: Examiner-based variance can affect test-taker outcomes. The aim of this study was to investigate the examiner-based effect of differential rater functioning over time (DRIFT).
Methods: Average station-level scores from five administrations of the same version of a high-stakes 12-station OSCE were analyzed for the presence of DRIFT.
Results: Test-takers who were scored earlier appeared to receive a score advantage, while those who were scored later appeared to receive neither a score advantage nor a disadvantage due to the DRIFT behavior. A specific form of DRIFT, primacy (the assignment of progressively harsher scores), was present in one of the 228 examiner scoring opportunities investigated in this study. In other words, less than 1% of the examiner scoring that took place displayed significant levels of DRIFT scoring behavior.
Discussion and Conclusions: The noted score advantage influenced the test outcomes of only one examinee who performed close to the cut-score on all other stations. Prior publications report broader effects of DRIFT, but the current assessment context, particularly access to examiner training, may have had a modulating effect in the present study.
Disclosure statement
The authors report no conflict of interest. The authors alone are responsible for the content and writing of this article.
Glossary
Differential rater functioning over time or DRIFT: A specific order effect whereby examiners tasked with assigning scores redefine the rubrics as they progress through the assessment.
Additional information
Notes on contributors
Karen Coetzee
Karen Coetzee, MSc, is a psychometrician at Touchstone Institute, an organization of competency evaluation experts in Toronto, Canada.
Sandra Monteiro
Sandra Monteiro, PhD, is an assistant professor in the Department of Health Research Methods, Evidence and Impact at McMaster University in Hamilton, Ontario, and director of Research and Analysis at Touchstone Institute.