Research Article

ChatGPT in medical school: how successful is AI in progress testing?

Hendrik Friederichs, Medical School OWL, Bielefeld University, Bielefeld, Germany (corresponding author)

https://orcid.org/0000-0001-9671-5235

Wolf Jonas Friederichs, Faculty of Mechanical Engineering, RWTH Aachen University, Aachen, Germany

https://orcid.org/0000-0003-1733-7788

Maren März, Charité – Universitätsmedizin Berlin, Kooperationspartner der Freien Universität Berlin, Humboldt-Universität zu Berlin, Progress Test Medizin, Charitéplatz 1, Berlin, Germany

https://orcid.org/0000-0002-2661-5076

ABSTRACT

Background

As generative artificial intelligence (AI), ChatGPT provides easy access to a wide range of information, including factual knowledge in the field of medicine. Given that knowledge acquisition is a basic determinant of physicians’ performance, teaching and testing different levels of medical knowledge is a central task of medical schools. To measure the factual knowledge level of the ChatGPT responses, we compared the performance of ChatGPT with that of medical students in a progress test.

Methods

A total of 400 multiple-choice questions (MCQs) from the progress test in German-speaking countries were entered into ChatGPT’s user interface to obtain the percentage of correctly answered questions. We then calculated the correlations between the correctness of ChatGPT’s responses and its response time, the word count of its responses, and the difficulty of each progress test question.
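The two quantities described in the Methods can be sketched in a few lines of Python. This is only an illustration, not the authors' analysis code: the abstract reports "rho" without naming the coefficient, so a rank (Spearman-type) correlation is assumed here, and the function names and the six-question toy data set are hypothetical.

```python
from statistics import mean

def average_ranks(values):
    """Return 1-based ranks, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks

def pearson(x, y):
    """Pearson correlation of two equally long sequences."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / (vx * vy) ** 0.5

def rank_correlation(x, y):
    """Assumed coefficient: Pearson correlation of the average ranks."""
    return pearson(average_ranks(x), average_ranks(y))

# Hypothetical toy data, NOT taken from the study:
# 1 = ChatGPT answered correctly, 0 = incorrectly.
correct = [1, 0, 1, 1, 0, 1]
# Hypothetical difficulty index per question.
difficulty = [0.9, 0.3, 0.7, 0.8, 0.4, 0.6]

percent_correct = 100 * sum(correct) / len(correct)
rho = rank_correlation(correct, difficulty)
print(f"{percent_correct:.1f}% correct, rho = {rho:.2f}")
```

The same `rank_correlation` call would apply unchanged to response time or word count in place of `difficulty`.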

Results

Of the 395 responses evaluated, 65.5% were correct. On average, ChatGPT required 22.8 s (SD 17.5) for a complete response, which contained 36.2 (SD 28.1) words. Neither response time nor word count correlated with the accuracy of the ChatGPT response (time: rho = −0.08, 95% CI [−0.18, 0.02], t(393) = −1.55, p = 0.121; word count: rho = −0.03, 95% CI [−0.13, 0.07], t(393) = −0.54, p = 0.592). The difficulty index of the MCQs correlated significantly with the accuracy of the ChatGPT response (rho = 0.16, 95% CI [0.06, 0.25], t(393) = 3.19, p = 0.002).
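For readers who want to see how a correlation coefficient, its t statistic, and its confidence interval relate, the following sketch applies two standard textbook approximations: t = rho · sqrt((n − 2) / (1 − rho²)) with n − 2 degrees of freedom, and a Fisher z-transformed 95% interval. The abstract does not state which procedure the authors used, so both formulas are assumptions, and the function names are hypothetical. Because the published rho values are rounded to two decimals, the recomputed t values agree with the reported ones only approximately.

```python
import math

def t_from_rho(rho: float, n: int) -> float:
    """t statistic for a correlation coefficient, with n - 2 degrees of freedom."""
    return rho * math.sqrt((n - 2) / (1.0 - rho * rho))

def fisher_ci(rho: float, n: int, z_crit: float = 1.959964):
    """Approximate 95% confidence interval via the Fisher z-transformation."""
    z = math.atanh(rho)
    half_width = z_crit / math.sqrt(n - 3)
    return math.tanh(z - half_width), math.tanh(z + half_width)

n = 395  # number of evaluated responses reported in the Results

# Rounded rho values as reported in the Results.
for label, rho in [("time", -0.08), ("word count", -0.03), ("difficulty", 0.16)]:
    low, high = fisher_ci(rho, n)
    print(f"{label}: t({n - 2}) = {t_from_rho(rho, n):.2f}, "
          f"95% CI [{low:.2f}, {high:.2f}]")
```

With n = 395 the degrees of freedom are 393, which matches the t(393) notation in the Results.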

Conclusion

ChatGPT correctly answered two-thirds of all MCQs at the German state licensing exam level in Progress Test Medicine and outperformed almost all medical students in years 1–3. Its performance is comparable to that of medical students in the second half of their studies.

Acknowledgments

The authors wish to thank Iván Roselló Atanet of the AG Progress Test Medizin for providing progress test data.

Disclosure statement

No potential conflict of interest was reported by the authors.

Author contributions

HF designed the study and participated in its data collection, data analysis, and coordination. WJF participated in data collection and data analysis. MM participated in the conception, coordination, and design of the study. All authors interpreted the results, drafted the manuscript, and approved the final version. All authors are accountable for all aspects of the work.

Ethics approval and consent to participate

Not applicable

Additional information

Funding

The author(s) reported there is no funding associated with the work featured in this article.
