Educational Case Reports

A Multi-institutional Study of the Feasibility and Reliability of the Implementation of Constructed Response Exam Questions

Pages 609-622 | Received 14 Sep 2021, Accepted 27 Jul 2022, Published online: 20 Aug 2022

References

  • Bird JB, Olvet DM, Willey JM, Brenner J. Patients don’t come with multiple choice options: essay-based assessment in UME. Med Educ Online. 2019;24(1):1649959. doi:10.1080/10872981.2019.1649959.
  • Hauer KE, Boscardin C, Brenner JM, van Schaik SM, Papp KK. Twelve tips for assessing medical knowledge with open-ended questions: Designing constructed response examinations in medical education. Med Teach. 2020;42(8):880–885. doi:10.1080/0142159X.2019.1629404.
  • Wlodarczyk S, Muller-Juge V, Hauer KE, Tong MS, Ransohoff A, Boscardin C. Assessment to optimize learning strategies: a qualitative study of student and faculty perceptions. Teach Learn Med. 2021;33(1):1–21.
  • Piaget J. Structuralism. New York: Basic Books; 1970.
  • Piaget J. The Equilibration of Cognitive Structures. Chicago: University of Chicago Press; 1977.
  • Vygotsky LS. Mind in Society: The Development of Higher Psychological Processes. Cambridge, MA: Harvard University Press; 1978.
  • Karpicke JD, Roediger HL 3rd. The critical importance of retrieval for learning. Science. 2008;319(5865):966–968. doi:10.1126/science.1152408.
  • Newble DI, Jaeger K. The effect of assessments and examinations on the learning of medical students. Med Educ. 1983;17(3):165–171. doi:10.1111/j.1365-2923.1983.tb00657.x.
  • Cilliers FJ, Schuwirth LW, Herman N, Adendorff HJ, van der Vleuten CP. A model of the pre-assessment learning effects of summative assessment in medical education. Adv Health Sci Educ Theory Pract. 2012;17(1):39–53. doi:10.1007/s10459-011-9292-5.
  • Scouller K. The influence of assessment method on students’ learning approaches: Multiple choice question examination versus assignment essay. Higher Education. 1998;35(4):453–472. doi:10.1023/A:1003196224280.
  • Stanger-Hall KF. Multiple-choice exams: an obstacle for higher-level thinking in introductory science classes. CBE-Life Sci Educ. 2012;11(3):294–306.
  • Larsen DP, Butler AC, Roediger HL 3rd. Test-enhanced learning in medical education. Med Educ. 2008;42(10):959–966. doi:10.1111/j.1365-2923.2008.03124.x.
  • McConnell MM, St-Onge C, Young ME. The benefits of testing for learning on later performance. Adv Health Sci Educ Theory Pract. 2015;20(2):305–320. doi:10.1007/s10459-014-9529-1.
  • McDaniel MA, Roediger HL 3rd, McDermott KB. Generalising test-enhanced learning from the laboratory to the classroom. Psychon Bull Rev. 2007;14(2):200–206. doi:10.3758/bf03194052.
  • Wood T. Assessment not only drives learning, it may also help learning. Med Educ. 2009;43(1):5–6. doi:10.1111/j.1365-2923.2008.03237.x.
  • Black P, Wiliam D. Developing the theory of formative assessment. Educ Asse Eval Acc. 2009;21(1):5–31. doi:10.1007/s11092-008-9068-5.
  • Hubbard JK, Potts MA, Couch BA. How question types reveal student thinking: An experimental comparison of multiple-true-false and free-response formats. CBE-Life Sci Educ. 2017;16(2):ar26. doi:10.1187/cbe.16-12-0339.
  • Boscardin CK, Earnest G, Hauer KE. Predicting performance on clerkship examinations and USMLE Step 1: What is the value of open-ended question examination? Acad Med. 2020;95(11S Association of American Medical Colleges Learn Serve Lead: Proceedings of the 59th Annual Research in Medical Education Presentations):S109–S113. doi:10.1097/ACM.0000000000003629.
  • Sam AH, Field SM, Collares CF, et al. Very-short-answer questions: reliability, discrimination and acceptability. Med Educ. 2018;52(4):447–455. doi:10.1111/medu.13504.
  • Lukhele R, Thissen D, Wainer H. On the relative value of multiple-choice, constructed response, and examinee-selected items on two achievement tests. J Educational Measurement. 1994;31(3):234–250. doi:10.1111/j.1745-3984.1994.tb00445.x.
  • Wainer H, Thissen D. Combining multiple-choice and constructed response test scores: toward a Marxist Theory of test construction. Appl Measurement in Educ. 1993;6(2):103–118. doi:10.1207/s15324818ame0602_1.
  • Case SM, Swanson DB. Extended-matching items: a practical alternative to free-response questions. Teach Learn Med. 1993;5(2):107–115. doi:10.1080/10401339309539601.
  • Bierer SB, Taylor CA, Dannefer EF. Evaluation of essay questions used to assess medical students’ application and integration of basic and clinical science knowledge. Teach Learn Med. 2009;21(4):344–350. doi:10.1080/10401330903230980.
  • Palmer EJ, Duggan P, Devitt PG, Russell R. The modified essay question: its exit from the exit examination? Med Teach. 2010;32(7):e300–e307. doi:10.3109/0142159X.2010.488705.
  • Khan JS, Mukhtar O, Tabasum S, et al. Relationship of awards in multiple choice questions and structured answer questions in the undergraduate years and their effectiveness in evaluation. J Ayub Med Coll Abbottabad. 2010;22(2):191–195.
  • Wood EJ. What are extended matching sets questions? Bioscience Educ. 2003;1(1):1–8. doi:10.3108/beej.2003.01010002.
  • Hodges B. Assessment in the post-psychometric era: learning to love the subjective and collective. Med Teach. 2013;35(7):564–568. doi:10.3109/0142159X.2013.789134.
  • Feletti GI, Smith EK. Modified essay questions: are they worth the effort? Med Educ. 1986;20(2):126–132. doi:10.1111/j.1365-2923.1986.tb01059.x.
  • Hift RJ. Should essays and other "open-ended"-type questions retain a place in written summative assessment in clinical medicine? BMC Med Educ. 2014;14:249. doi:10.1186/s12909-014-0249-2.
  • Nunnally JC, Bernstein IH. Psychometric Theory. 3rd ed. New York: McGraw-Hill; 1994.
  • van der Vleuten CP, Schuwirth LW. Assessing professional competence: from methods to programmes. Med Educ. 2005;39(3):309–317. doi:10.1111/j.1365-2929.2005.02094.x.
  • Dobkin BH. Progressive staging of pilot studies to improve phase III trials for motor interventions. Neurorehabil Neural Repair. 2009;23(3):197–206. doi:10.1177/1545968309331863.
  • Schuwirth LW, van der Vleuten CP. Different written assessment methods: what can be said about their strengths and weaknesses? Med Educ. 2004;38(9):974–979. doi:10.1111/j.1365-2929.2004.01916.x.
  • Wakeford RE, Roberts S. Short answer questions in an undergraduate qualifying examination: a study of examiner variability. Med Educ. 1984;18(3):168–173. doi:10.1111/j.1365-2923.1984.tb00999.x.
  • Williams R, Sanford J, Stratford PW, Newman A. Grading written essays: a reliability study. Phys Ther. 1991;71(9):679–686. doi:10.1093/ptj/71.9.679.
  • Newstead S, Dennis I. The reliability of exam marking in psychology: examiners examined. Psychologist. 1994;7:216–219.
  • Norcini JJ, Shea JA. The effect of level of expertise on answer key development. Acad Med. 1990;65(9 Suppl):S15–S16.
  • Smith B, Sinclair H, Simpson J, Van Teijlingen E, Bond C, Taylor R. What is the role of double-marking? Evidence from an undergraduate medical course. Educ Primary Care. 2002;13:497–503.
  • Yune SJ, Lee SY, Im SJ, Kam BS, Baek SY. Holistic rubric vs. analytic rubric for measuring clinical performance levels in medical students. BMC Med Educ. 2018;18(1):124. doi:10.1186/s12909-018-1228-9.
  • Ferguson KJ. Beyond multiple-choice questions: using case-based learning patient questions to assess clinical reasoning. Med Educ. 2006;40(11):1143. doi:10.1111/j.1365-2929.2006.02592.x.
  • Hussein MA, Hassan H, Nassef M. Automated language essay scoring systems: a literature review. PeerJ Comput Sci. 2019;5:e208. doi:10.7717/peerj-cs.208.
  • Ginzburg S, Brenner J, Willey J. Integration: a strategy for turning knowledge into action. Med Sci Educ. 2015;25(4):533–543. doi:10.1007/s40670-015-0174-y.
  • Thomas PA, Wilson-Delfosse AL, Mehta N, Papp KK, Bierer SB, Isaacson JH. Case Western Reserve University School of Medicine, Including the Cleveland Clinic Lerner College of Medicine. Acad Med. 2020;95(9S A Snapshot of Medical Student Education in the United States and Canada: Reports From 145 Schools):S396–S401. doi:10.1097/ACM.0000000000003411.
  • Lucey CR, Hauer K, O’Sullivan P, Poncelet A, Souza KH, Davis J. University of California, San Francisco School of Medicine. Acad Med. 2020;95(9S A Snapshot of Medical Student Education in the United States and Canada: Reports From 145 Schools):S70–S73. doi:10.1097/ACM.0000000000003469.
  • Kruidering M, Wlodarczyk S, Boscardin C, van Schaik S, Fulton T. Bringing your exam questions to Bloom: writing effective open-ended questions to test higher-level thinking. Denver, CO: Western Group on Educational Affairs (WGEA); 2018.
  • Jonsson A, Svingby G. The use of scoring rubrics: reliability, validity, and educational consequences. Educ Res Rev. 2007;2(2):130–144. doi:10.1016/j.edurev.2007.05.002.
  • Maxwell JA, Miller BA. Categorizing and connecting strategies in qualitative data analysis. In: Hesse-Biber SN, Leavy P, eds. Handbook of Emergent Methods. New York: The Guilford Press; 2008:461–476.
  • Hsieh HF, Shannon SE. Three approaches to qualitative content analysis. Qual Health Res. 2005;15(9):1277–1288. doi:10.1177/1049732305276687.
  • Bowen DJ, Kreuter M, Spring B, et al. How we design feasibility studies. Am J Prev Med. 2009;36(5):452–457. doi:10.1016/j.amepre.2009.02.002.
  • Frambach JM, van der Vleuten CP, Durning SJ. AM last page. Quality criteria in qualitative and quantitative research. Acad Med. 2013;88(4):552.
  • Hanson JL, Balmer DF, Giardino AP. Qualitative research methods for medical educators. Acad Pediatr. 2011;11(5):375–386. doi:10.1016/j.acap.2011.05.001.
  • Cohen J. Weighted kappa: nominal scale agreement provision for scaled disagreement or partial credit. Psychol Bull. 1968;70(4):213–220. doi:10.1037/h0026256.
  • Bujang MA, Baharum N. Guidelines of the minimum sample size requirements for Cohen’s Kappa. Epidemiol Biostat Public Health. 2017;14(2):e12267-1–e12267-10.
  • Landis JR, Koch GG. The measurement of observer agreement for categorical data. Biometrics. 1977;33(1):159–174.
  • Brauer DG, Ferguson KJ. The integrated curriculum in medical education: AMEE Guide No. 96. Med Teach. 2015;37(4):312–322. doi:10.3109/0142159X.2014.970998.
  • Achike FI. The challenges of integration in an innovative modern medical curriculum. Med Sci Educ. 2016;26(1):153–158. doi:10.1007/s40670-015-0206-7.
  • Dufresne RJ, Leonard WJ, Gerace WJ. Making sense of students’ answers to multiple-choice questions. The Physics Teacher. 2002;40(3):174–180. doi:10.1119/1.1466554.
  • Rudolph MJ, Daugherty KK, Ray ME, Shuford VP, Lebovitz L, DiVall MV. Best practices related to examination item construction and post-hoc review. Am J Pharm Educ. 2019;83(7):7204. doi:10.5688/ajpe7204.
  • Hogan TP, Murphy G. Recommendations for preparing and scoring constructed-response items: what the experts say. Appl Measurement in Education. 2007;20(4):427–441. doi:10.1080/08957340701580736.
  • Shilo G. Formulating good open-ended questions in assessment. Educ Res Quarterly. 2015;38(4):3–30.
  • Norcini J, Anderson B, Bollela V, et al. Criteria for good assessment: consensus statement and recommendations from the Ottawa 2010 Conference. Med Teach. 2011;33(3):206–214. doi:10.3109/0142159X.2011.551559.
  • Kogan JR, Hauer KE. Sparking change: How a shift to step 1 pass/fail scoring could promote the educational and catalytic effects of assessment in medical education. Acad Med. 2020;95(9):1315–1317. doi:10.1097/ACM.0000000000003515.
  • Slater SC, Boulet JR. Predicting holistic ratings of written performance assessments from analytic scoring. Adv Health Sci Educ Theory Pract. 2001;6(2):103–119.
  • Veal LR, Hudson SA. Direct and indirect measures for large-scale evaluation of writing. Res Teaching of English. 1983;17(3):290–296.
  • Goulden NR. Relationship of analytic and holistic methods to raters’ scores for speeches. J Res Devel Educ. 1994;27(2):73–82.
  • Jescovitch LN, Scott EE, Cerchiara JA, et al. Deconstruction of holistic rubrics into analytic rubrics for large-scale assessments of students’ reasoning of complex science concepts. Practical Assessment, Res Eval. 2019;24(1):7.
  • van Loon KA, Driessen EW, Teunissen PW, Scheele F. Experiences with EPAs, potential benefits and pitfalls. Med Teach. 2014;36(8):698–702. doi:10.3109/0142159X.2014.909588.
  • Young C. Initiating self-assessment strategies in novice physiotherapy students: a method case study. Assessment & Eval Higher Educ. 2013;38(8):998–1011. doi:10.1080/02602938.2013.771255.
  • Tomas C, Whitt E, Lavelle-Hill R, Severn K. Modeling holistic marks with analytic rubrics. Front Educ. 2019;4:89. doi:10.3389/feduc.2019.00089.
  • Berger AJ, Gillespie CC, Tewksbury LR, et al. Assessment of medical student clinical reasoning by "lay" vs physician raters: inter-rater reliability using a scoring guide in a multidisciplinary objective structured clinical examination. Am J Surg. 2012;203(1):81–86. doi:10.1016/j.amjsurg.2011.08.003.
  • Yudkowsky R, Hyderi A, Holden J, et al. Can nonclinician raters be trained to assess clinical reasoning in postencounter patient notes? Acad Med. 2019;94(11S Association of American Medical Colleges Learn Serve Lead: Proceedings of the 58th Annual Research in Medical Education Sessions):S21–S27. doi:10.1097/ACM.0000000000002904.
  • Russ S, Hull L, Rout S, Vincent C, Darzi A, Sevdalis N. Observational teamwork assessment for surgery: feasibility of clinical and nonclinical assessor calibration with short-term training. Ann Surg. 2012;255(4):804–809. doi:10.1097/SLA.0b013e31824a9a02.
  • Ahmad SE, Farina GA, Fornari A, Pearlman RE, Friedman K, Olvet DM. Student perception of case-based teaching by near-peers and faculty during the internal medicine clerkship: A noninferiority study. J Med Educ Curric Dev. 2021;8:23821205211020762. doi:10.1177/23821205211020762.
  • Loda T, Erschens R, Nikendei C, Zipfel S, Herrmann-Werner A. Qualitative analysis of cognitive and social congruence in peer-assisted learning - The perspectives of medical students, student tutors and lecturers. Med Educ Online. 2020;25(1):1801306.
  • Shumway JM, Harden RM, Association for Medical Education in Europe. AMEE Guide No. 25: The assessment of learning outcomes for the competent and reflective physician. Med Teach. 2003;25(6):569–584. doi:10.1080/0142159032000151907.
  • Van der Vleuten CP. The assessment of professional competence: developments, research and practical implications. Adv Health Sci Educ Theory Pract. 1996;1(1):41–67. doi:10.1007/BF00596229.
