Abstract
Introduction: The borderline regression method (BRM) is considered problematic for standard setting in small-cohort OSCEs (e.g. n < 50), with institutions often relying instead on item-centred standard-setting approaches, which can be resource-intensive and lack defensibility in performance tests.
Methods: Through an analysis of post-hoc station- and test-level metrics, we investigate the application of BRM in three different small-cohort OSCE contexts: the exam for international medical graduates wishing to practise in the UK, senior sequential undergraduate exams, and physician associate exams in a large UK medical school.
Results: We find that BRM provides robust metrics, and concomitantly defensible cut-scores, in the majority of stations (5%, 14%, and 12% of stations were problematic, respectively, across our three contexts). Where problems occur, it is generally because the relationship between global grades and checklist scores is too weak to give confidence in the standard that BRM sets in those stations.
Conclusion: This work challenges previous assumptions about the application of BRM in small test cohorts. Where there is sufficient spread of candidate ability, BRM will generally provide defensible standards, assuming careful design of station-level scoring instruments. Where BRM standard-setting problems do occur, however, we prefer substituting extant station cut-scores.
Ethical approval
The University of Leeds and the GMC gave permission for the anonymized data to be used for research in this paper. The co-chairs of the University of Leeds School of Medicine ethics committee confirmed to the authors that formal ethics approval for this study was not required as it involved the use of routinely collected student assessment data which were fully anonymized prior to analysis.
Acknowledgments
We thank the GMC for providing access to the anonymized data used as part of this study. We also thank our friend and colleague, the late John Patterson, for prompting the initial research on small cohorts in 2012 which eventually led to this work.
Disclosure statement
The authors report no conflicts of interest. The authors alone are responsible for the content and writing of the article.
Glossary
Borderline regression method: This is an examinee-centred method of standard setting often used in OSCEs. At the station level, candidates are scored in two ways independently – one score is based on a checklist or set of domain scores, and the other is a global grading of performance (e.g. fail, borderline, pass, good grade). Scores are regressed on grades, and the cut-score in the station is set at the checklist/domain score corresponding to the borderline grade (Pell et al. 2010). The overall test cut-score is set at the aggregate of the station cut-scores. One advantage of the borderline regression method is that it uses all scores from the assessment (e.g. not just those at the borderline), and that these scores are based on judgment of the actual performance of candidates – compare with item-centred standard-setting methods (e.g. Angoff), where item difficulty is judged in advance of the administration of the assessment.
Pell G, Fuller R, Homer M, Roberts T. 2010. How to measure the quality of the OSCE: a review of metrics – AMEE Guide No. 49. Med Teach. 32(10):802–811.
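The station-level calculation described above can be sketched in a few lines of code. This is an illustrative sketch only, not the authors' implementation: the grade coding (fail = 0, borderline = 1, pass = 2, good = 3), the function name, and the data are all assumptions made for the example.

```python
import numpy as np

def brm_station_cut(scores, grades, borderline_grade=1):
    """Borderline regression sketch: fit a linear regression of
    checklist/domain scores on numerically coded global grades, then
    return the predicted score at the borderline grade as the cut-score."""
    slope, intercept = np.polyfit(grades, scores, deg=1)
    return slope * borderline_grade + intercept

# Fabricated example data: grades coded fail=0, borderline=1, pass=2, good=3
grades = np.array([0, 1, 1, 2, 2, 2, 3, 3])
scores = np.array([8, 11, 12, 15, 16, 17, 19, 20])  # checklist scores

station_cut = brm_station_cut(scores, grades)  # ≈ 11.8 for this data
```

The overall test cut-score would then be the aggregate (e.g. the sum) of these station-level cut-scores across all stations.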
Additional information
Notes on contributors
Matt Homer
Matt Homer, BSc, MSc, PhD, PGCE, CStat, is an Associate Professor in the Schools of Education and Medicine at the University of Leeds. Within medical education, he has a research interest in assessment design, standard-setting methodologies and psychometric analysis. He also advises the UK General Medical Council on a range of assessment issues.
Richard Fuller
Richard Fuller, MA, MBChB, FRCP, is a Consultant Geriatrician/Stroke Physician and Vice-Dean of the School of Medicine at the University of Liverpool. His current research focuses on the application of intelligent assessment design in campus and workplace-based assessment formats, assessor behaviours, mobile technology delivered assessment and the impact of sequential testing methodologies.
Jennifer Hallam
Jennifer Hallam, BSc, PG Dip, MSc, PhD, is an educational psychometrician in the School of Medicine, University of Leeds. Her current interests include the strategic development of assessment and feedback strategies, specifically for performance-based assessments. She also has several national medical education roles, which include being on the Board of Directors for the Association for the Study of Medical Education (ASME).
Godfrey Pell
Godfrey Pell, BEng, MSc, CStat, is a principal research fellow emeritus at the Leeds Institute of Medical Education, who has a strong background in management. His research focuses on quality within the OSCE, including theoretical and practical applications. He acts as an assessment consultant to a number of medical schools.