
Construct validity of multiple mini interviews – Investigating the role of stations, skills, and raters using Bayesian G-theory

Simon M. Breil, Boris Forthmann, Anike Hertel-Waszak, Helmut Ahrens, Britta Brouwer, Eva Schönefeld, Bernhard Marschall & Mitja D. Back
Pages 164-171 | Published online: 08 Oct 2019

Abstract

Background: One popular procedure in the medical student selection process is the multiple mini-interview (MMI), which is designed to assess social skills (e.g., empathy) by means of brief interview and role-play stations. However, it remains unclear whether MMIs reliably measure the desired social skills or rather general performance differences that do not depend on specific social skills. Here, we provide a detailed investigation into the construct validity of MMIs, including the identification and quantification of performance facets (social skill-specific performance, station-specific performance, general performance) and their relations with other selection measures.

Methods: We used data from three MMI samples (N = 376 applicants, 144 raters) that included six interview and role-play stations and multiple assessed social skills.

Results: Bayesian generalizability analyses show that the largest amount of reliable MMI variance was accounted for by station-specific and general performance differences between applicants. Furthermore, there were low or no correlations with other selection measures.

Discussion: Our findings suggest that MMI ratings are less social skill-specific than originally conceptualized and are driven more by general performance differences (across and within stations). Future research should focus on the development of skill-specific MMI stations and on behavioral analyses of the extent to which performance differences are based on desirable skills versus undesirable aspects.

Disclosure statement

The authors report no declarations of interest. We embrace the values of openness and transparency in science (www.researchtransparency.org); we have therefore published all raw data necessary to reproduce the reported results and provide scripts for all data analyses reported in this manuscript at https://osf.io/d8qmu/

Glossary

Multiple mini-interviews: A selection procedure often used for admission into medical school. Based on the principles of the objective structured clinical examination (OSCE), an MMI involves applicants rotating through a series of short stations, each designed to assess one or more personal skills. Each station typically consists of a task, a role-play, a series of questions, or an unstructured discussion of a topic. Stations are observed by trained interviewers and assessed on pre-defined rating dimensions.

Adapted and extended from: Rees EL, Hawarden AW, Dent G, Hays R, Bates J, Hassell AB. 2016. Evidence regarding the utility of multiple mini-interview (MMI) for selection to undergraduate health programs: A BEME systematic review: BEME Guide No. 37. Med Teach 38(5):443–455.

Notes

1 This was converted from German high-school grades. A GPA of 3.6 corresponds to a very good German high-school diploma grade of 1.4.
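The note does not state which conversion rule was applied. Purely as an illustrative sketch, the modified Bavarian formula (an assumption, not necessarily the authors' method) reproduces this mapping:

```r
# Illustration only: the modified Bavarian formula is one common grade
# conversion; the note does not state which conversion the authors used.
# german_grade = 1 + 3 * (max_gpa - gpa) / (max_gpa - min_passing_gpa)
convert_gpa <- function(gpa, max_gpa = 4.0, min_passing_gpa = 1.0) {
  1 + 3 * (max_gpa - gpa) / (max_gpa - min_passing_gpa)
}
convert_gpa(3.6)  # 1.4, matching the value given in the note
```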

2 We used data from these three samples because they included a large number of identical stations (six). Preceding and subsequent samples used different stations and station-dimension combinations, so it was not feasible to merge these datasets. With an N of 376, we had sufficient power (.84) to detect small effect sizes (r = .15).
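As a rough check (not the authors' reported computation), the stated power can be approximated with the pwr package in R; the two-sided test and alpha = .05 are assumptions:

```r
# Approximate power to detect a small correlation (r = .15) with N = 376.
# A two-sided test and alpha = .05 are assumptions not stated in the note.
library(pwr)

pwr.r.test(n = 376, r = 0.15, sig.level = 0.05, alternative = "two.sided")
# power comes out at roughly .83-.84, consistent with the value reported above
```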

3 Four stations were not included because one station was not identical across all three samples, two stations focused on a non-social skill (i.e., practical handling), and one station was built solely around the dimension ethical reasoning, which made it impossible to disentangle effects of stations from effects of dimensions.

4 The ratings of emotional understanding were obtained via two items (concerning the emotional understanding of persons A and B after watching a video of a dyadic interaction).

5 In contrast to Jackson et al., we did not include item effects because three of our four dimensions included only one item (thus confounding the effects of items and dimensions). Furthermore, we used the default priors of brms (i.e., improper flat priors), which influence the results as little as possible.
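For illustration, a minimal brms sketch of such a crossed random-effects (G-theory) model without item effects is shown below. The variable names (rating, applicant, station, dimension, rater) and the exact set of interaction terms are assumptions based on the performance facets described in the abstract, not the authors' actual specification:

```r
# Minimal sketch of a Bayesian G-theory (crossed random-effects) model in brms,
# keeping the package's default priors as described in the note. Variable names
# are placeholders; the authors' model may differ in its random-effects terms.
library(brms)

fit <- brm(
  rating ~ 1 +
    (1 | applicant) +               # general performance differences
    (1 | station) + (1 | dimension) + (1 | rater) +
    (1 | applicant:station) +       # station-specific performance
    (1 | applicant:dimension),      # social-skill-specific performance
  data = mmi_data,                  # long format: one row per single rating
  family = gaussian(),
  chains = 4, cores = 4
)
```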

6 Please note that the amount of (reliable) variance attributable to different components varied depending on the aggregation of ratings (e.g., aggregating ratings to one overall score, aggregating to dimensional scores, aggregating to exercise scores; Putka and Hoffman 2013). For this analysis, we focused on the decomposition of non-aggregated post-exercise dimension ratings. When aggregating to dimensional or overall scores, applicant main effects surpassed applicant-exercise interaction effects. However, applicant-dimension interaction effects were negligible, even when aggregating into dimensional scores. For all variance components of aggregated scores, see https://osf.io/d8qmu/
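Continuing the hypothetical sketch above, relative variance components for the non-aggregated ratings can be read off the posterior standard deviations of such a model; the column names follow brms's sd_<group>__Intercept convention and sigma denotes the residual term:

```r
# Turn posterior SDs from the sketch model into proportions of total variance
# (a simple decomposition of non-aggregated ratings; not necessarily the exact
# quantities reported in the article).
library(posterior)

draws <- as.data.frame(as_draws_df(fit))
sd_cols <- grep("^sd_|^sigma$", names(draws), value = TRUE)

var_draws  <- draws[, sd_cols]^2                # SDs -> variance components
prop_draws <- var_draws / rowSums(var_draws)    # proportions of total variance
round(colMeans(prop_draws), 3)                  # posterior mean proportions
```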

Additional information

Funding

This work was supported by the Bundesministerium für Bildung und Forschung [Federal Ministry of Education and Research, Germany; project number: 01GK1801A].

Notes on contributors

Simon M. Breil

Simon M. Breil, M. Sc., is a researcher at the Department of Psychology, Psychological Assessment and Personality Psychology at the University of Münster.

Boris Forthmann

Boris Forthmann, PhD, is a researcher at the Department of Psychology in Education, Assessment and Evaluation in Schools, at the University of Münster.

Anike Hertel-Waszak

Anike Hertel-Waszak, PhD, is a researcher at the Institute for Education and Academic Affairs at the Medical Department of the University of Münster.

Helmut Ahrens

Helmut Ahrens, PhD, is a researcher at the Institute for Education and Academic Affairs at the Medical Department of the University of Münster.

Britta Brouwer

Britta Brouwer, PhD, is a researcher at the Institute for Education and Academic Affairs at the Medical Department of the University of Münster.

Eva Schönefeld

Eva Schönefeld, PhD, is a researcher at the Institute for Education and Academic Affairs at the Medical Department of the University of Münster.

Bernhard Marschall

Bernhard Marschall, PhD, is a professor at the Institute for Education and Academic Affairs at the Medical Department of the University of Münster.

Mitja D. Back

Mitja D. Back is a professor of psychological assessment and personality psychology at the Department of Psychology of the University of Münster.
