ABSTRACT
Linear logistic test models (LLTMs), which combine item response theory with linear regression, offer an elegant method for learning about item characteristics in complex content areas. This study used LLTMs to model response data from single-best-answer multiple-choice questions on two medical subspecialty certification examinations, Critical Care Medicine and Pediatric Anesthesiology, from multiple years. Six item characteristics significantly predicted item difficulty in one or both exams: word count, proportion of complex words, number of options (3 vs. 4), inclusion of an image, nature of the question task (identifying risks, diagnostic test, management), and inclusion of an application context. The significant predictors and their associated coefficient estimates differed between the two exams, suggesting possible domain differences. This study highlights the possibilities and challenges of using LLTMs to identify item characteristics for complex assessments. The results may help inform or expedite item writing and reviewing processes.
Disclosure statement
Support was provided solely from institutional and/or departmental sources. Ann E. Harman, Huaping Sun, and Emily K. Toutkoushian are staff members of the American Board of Anesthesiology (ABA). Mark T. Keegan is an ABA Director and receives an honorarium for his participation in ABA activities.