
ChatGPT-4: An assessment of an upgraded artificial intelligence chatbot in the United States Medical Licensing Examination

Andrew Mihalache, Ryan S. Huang, Marko M. Popovic & Rajeev H. Muni
Pages 366-372 | Published online: 15 Oct 2023

Abstract

Purpose

ChatGPT-4 is an upgraded version of an artificial intelligence chatbot. The performance of ChatGPT-4 on the United States Medical Licensing Examination (USMLE) has not been independently characterized. We aimed to assess the performance of ChatGPT-4 at responding to USMLE Step 1, Step 2CK, and Step 3 practice questions.

Method

Practice multiple-choice questions for the USMLE Step 1, Step 2CK, and Step 3 were compiled. Of 376 available questions, 319 (85%) were analyzed by ChatGPT-4 on March 21st, 2023. Our primary outcome was the performance of ChatGPT-4 for the practice USMLE Step 1, Step 2CK, and Step 3 examinations, measured as the proportion of multiple-choice questions answered correctly. Our secondary outcomes were the mean length of questions and responses provided by ChatGPT-4.
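As an arithmetic illustration, the primary outcome reduces to the proportion of questions answered correctly per examination. A minimal sketch in Python (not the authors' code; the counts are taken from the Results below):

    # Correct answers / questions attempted, per USMLE step, as reported
    # in the Results section of this abstract.
    results = {"Step 1": (82, 93), "Step 2CK": (91, 106), "Step 3": (108, 120)}

    for step, (correct, total) in results.items():
        print(f"{step}: {correct}/{total} = {correct / total:.0%}")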

Results

ChatGPT-4 responded to 319 text-based multiple-choice questions from USMLE practice test material. ChatGPT-4 answered 82 of 93 (88%) questions correctly on USMLE Step 1, 91 of 106 (86%) on Step 2CK, and 108 of 120 (90%) on Step 3. ChatGPT-4 provided explanations for all questions. ChatGPT-4 spent an average of 30.8 ± 11.8 s per question on practice questions for USMLE Step 1, 23.0 ± 9.4 s per question for Step 2CK, and 23.1 ± 8.3 s per question for Step 3. The mean length of practice USMLE multiple-choice questions answered correctly and incorrectly by ChatGPT-4 was similar (difference = 17.48 characters, SE = 59.75, 95% CI = [-100.09, 135.04], t = 0.29, p = 0.77). The mean length of ChatGPT-4’s correct responses to practice questions was significantly shorter than that of its incorrect responses (difference = 79.58 characters, SE = 35.42, 95% CI = [9.89, 149.28], t = 2.25, p = 0.03).
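For illustration only, a response-length comparison like the one above can be run as a standard two-sample t-test. The sketch below is not the authors' analysis code: the lengths are synthetic placeholders (the real data are available on request), only the group sizes (281 correct and 38 incorrect of 319 questions) follow from the counts reported above, and a pooled-variance Student's t-test is assumed since the abstract does not state which variant was used.

    import numpy as np
    from scipy import stats

    rng = np.random.default_rng(0)
    # Hypothetical response lengths in characters; 281 correct and 38
    # incorrect answers match the counts reported above, but the means and
    # spreads here are placeholders, not the study data.
    correct = rng.normal(loc=1200, scale=300, size=281)
    incorrect = rng.normal(loc=1280, scale=300, size=38)

    # Pooled-variance (Student's) two-sample t-test; pass equal_var=False
    # instead for Welch's test if unequal variances are assumed.
    t_stat, p_value = stats.ttest_ind(incorrect, correct, equal_var=True)
    print(f"difference = {incorrect.mean() - correct.mean():.2f} characters, "
          f"t = {t_stat:.2f}, p = {p_value:.3f}")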

Conclusions

ChatGPT-4 answered a remarkably high proportion of USMLE practice questions correctly. It performed substantially better on these questions than previous versions of the same AI chatbot.

Disclosure statement

The views expressed herein are those of the authors and do not necessarily reflect the position of the Federation of State Medical Boards or National Board of Medical Examiners. Information reported in this manuscript has not been previously presented at a conference. Data were collected from the artificial intelligence chatbot ChatGPT developed by OpenAI. As corresponding author, Rajeev H. Muni had full access to all the data in the study and takes responsibility for the integrity of the data and the accuracy of the data analysis.

Data availability statement

The data that support the findings of this study may be requested at [email protected], with support from the principal investigator RHM.

Additional information

Funding

MMP: Financial support (to institution) – PSI Foundation, Fighting Blindness Canada. RHM: Consultant – Alcon, Apellis, AbbVie, Bayer, Bausch Health, Roche; Financial Support (to institution) – Alcon, AbbVie, Bayer, Novartis, Roche.

Notes on contributors

Andrew Mihalache

Andrew Mihalache is an MD candidate at the Temerty Faculty of Medicine, University of Toronto, in Toronto, Ontario.

Ryan S. Huang

Ryan S. Huang is an MD candidate at the Temerty Faculty of Medicine, University of Toronto, in Toronto, Ontario.

Marko M. Popovic

Marko M. Popovic is the Chief Ophthalmology Resident in the Department of Ophthalmology and Vision Sciences at the University of Toronto and has completed a Master of Public Health at the Harvard T.H. Chan School of Public Health.

Rajeev H. Muni

Rajeev H. Muni is a staff vitreoretinal surgeon at St. Michael’s Hospital in Toronto, Ontario, and Associate Professor and Vice-Chair of Clinical Research in the Department of Ophthalmology and Vision Sciences at the University of Toronto.
