Medical Teacher Volume 46, 2024 - Issue 3

ChatGPT-4: An assessment of an upgraded artificial intelligence chatbot in the United States Medical Licensing Examination

Andrew Mihalache, Temerty Faculty of Medicine, University of Toronto, Toronto, Ontario, Canada

Ryan S. Huang, Temerty Faculty of Medicine, University of Toronto, Toronto, Ontario, Canada

Marko M. Popovic, Department of Ophthalmology and Vision Sciences, University of Toronto, Toronto, Ontario, Canada

Rajeev H. Muni, Department of Ophthalmology and Vision Sciences, University of Toronto, Toronto, Ontario, Canada; Department of Ophthalmology, St. Michael’s Hospital/Unity Health Toronto, Toronto, Ontario, Canada. Correspondence: [email protected]

Pages 366-372 | Published online: 15 Oct 2023

https://doi.org/10.1080/0142159X.2023.2249588
References

Altman D, Machin D, Bryant T, Gardner M. 2000. Statistics with confidence 2nd ed. [Internet]. [accessed 2023 Jan 21]. http://books.google.com/books?hl=en&lr=&id=Rfdg1MFx7mcC&oi=fnd&pg=PR11&dq=Statistics+With+Confidence&ots=7N6OkQBqd7&sig=JaOmvHLTmTt8P1aIBysMBMU5kes.
Altman DG. 1990. Practical statistics for medical research. Pract Stat Med Res [Internet]. [accessed 2023 Jan 21]. https://www.medcalc.org/calc/comparison_of_means.php.
Aydın Ö, Karaarslan E. 2022. OpenAI ChatGPT generated literature review: digital twin in healthcare. SSRN J. 2:22–31. doi:10.2139/ssrn.4308687.
Azaria A. ChatGPT usage and limitations. doi:10.13140/RG.2.2.26616.11526.
Biswas S. 2023. ChatGPT and the future of medical writing. Radiology [Internet]. 307(2):e223312. doi:10.1148/radiol.223312.
Cahan P, Treutlein B. 2023. A conversation with ChatGPT on the role of computational systems biology in stem cell research. Stem Cell Rep [Internet]. 18(1):1–2. doi:10.1016/j.stemcr.2022.12.009.
Cai LZ, Shaheen A, Jin A, Fukui R, Yi JS, Yannuzzi N, Alabiad C. 2023. Performance of generative large language models on ophthalmology board style questions. Am J Ophthalmol. 254:141–149. doi:10.1016/j.ajo.2023.05.024.
Campbell I. 2007. Chi-squared and Fisher-Irwin tests of two-by-two tables with small sample recommendations. Stat Med. 26(19):3661–3675. doi:10.1002/sim.2832.
ChatGPT Generative Pre-trained Transformer, Zhavoronkov A. 2022. Rapamycin in the context of Pascal’s Wager: generative pre-trained transformer perspective. Oncoscience [Internet]. 9:82–84. doi:10.18632/oncoscience.571.
Cohen ER, Goldstein JL, Schroedl CJ, Parlapiano N, McGaghie WC, Wayne DB. 2020. Are USMLE scores valid measures for chief resident selection? J Grad Med Educ. 12(4):441–446. doi:10.4300/JGME-D-19-00782.1.
Comparison of proportions calculator. 2023. [accessed 2023 Jan 21]. https://www.medcalc.net/statisticaltests/comparison_of_proportions.php.
Else H. 2023. Abstracts written by ChatGPT fool scientists. Nature [Internet]. 613(7944):423. [accessed 2023 Jan 22]. doi:10.1038/d41586-023-00056-7.
Gao CA, Howard FM, Markov NS, Dyer EC, Ramesh S, Luo Y, Pearson AT. 2022. Comparing scientific abstracts generated by ChatGPT to real abstracts with detectors and blinded human reviewers. npj Digit. [Internet]. [accessed 2023 Aug 17]. doi:10.1101/2022.12.23.521610.
Giannos P. 2023. Evaluating the limits of AI in medical specialisation: ChatGPT’s performance on the UK Neurology Specialty Certificate Examination. BMJ Neurol Open. 5(1):e000451. doi:10.1136/bmjno-2023-000451.
Gilson A, Safranek C, Huang T, Socrates V, CL, Taylor RA, Chartash D. 2022. How does ChatGPT perform on the medical licensing exams? The implications of large language models for medical education and knowledge assessment. medRxiv [Internet]. [accessed 2023 Jan 18]. doi:10.1101/2022.12.23.22283901.
GPT-4. [accessed 2023 Mar 20]. https://openai.com/product/gpt-4.
Hu R, Fan KY, Pandey P, Hu Z, Yau O, Teng M, Wang P, Li A, Ashraf M, Singla R. 2022. Insights from teaching artificial intelligence to medical students in Canada. Commun Med. 2(1):1–5. doi:10.1038/s43856-022-00125-4.
Jeblick K, Schachtner B, Dexl J, Mittermeier A, Stüber AT, Topalis J, Weber T, Wesp P, Sabel B, Ricke J, et al. 2022. ChatGPT makes medicine easy to swallow: an exploratory case study on simplified radiology reports [Internet]. [accessed 2023 Jan 22]. doi:10.48550/arxiv.2212.14882.
Katz DM, Bommarito MJ, Gao S, Arredondo P. 2023. GPT-4 passes the bar exam. SSRN J. doi:10.2139/ssrn.4389233.
Kirkwood BB, Sterne J. 2003. Essential medical statistics [Internet]. p. 395–412. [accessed 2023 Jan 21]. http://scholar.google.com/scholar?hl=en&btnG=Search&q=intitle:Essential+Medical+Statistics#0.
Kung TH, Cheatham M, Medenilla A, Sillos C, De Leon L, Elepaño C, Madriaga M, Aggabao R, Diaz-Candido G, Maningo J, et al. 2023. Performance of ChatGPT on USMLE: potential for AI-assisted medical education using large language models. PLOS Digit Health. 2(2):e0000198. doi:10.1371/JOURNAL.PDIG.0000198.
Liévin V, Hother CE, Winther O. 2022. Can large language models reason about medical questions? [Internet]. [accessed 2023 Mar 22]. http://arxiv.org/abs/2207.08143.
MedCalc. 2023. MedCalc’s comparison of means calculator [Internet]. [accessed 2023 Jan 21]. https://www.medcalc.org/calc/comparison_of_means.php.
Mihalache A, Huang RS, Popovic MM, Muni RH. 2023. Performance of an upgraded artificial intelligence Chatbot for ophthalmic knowledge assessment. JAMA Ophthalmol. 141:798–800. doi:10.1001/JAMAOPHTHALMOL.2023.2754.
Mihalache A, Popovic MM, Muni RH. 2023. Performance of an artificial intelligence Chatbot in an ophthalmic knowledge assessment. JAMA Ophthalmol. 141(6):589–597. doi:10.1001/jamaophthalmol.2023.1144.
Milmo D. 2023. ChatGPT reaches 100 million users two months after launch. The Guardian [Internet]. [accessed 2023 Mar 20]. https://www.theguardian.com/technology/2023/feb/02/chatgpt-100-million-users-open-ai-fastest-growing-app.
O’Connor S, ChatGPT. 2023. Open artificial intelligence platforms in nursing education: tools for academic progress or abuse? Nurse Educ Pract. 66:103537. doi:10.1016/j.nepr.2022.103537.
OpenAI. 2023. GPT-4 technical report [Internet]. [accessed 2023 Mar 20]. http://arxiv.org/abs/2303.08774.
Passby L, Jenko N, Wernham A. 2023. Performance of ChatGPT on dermatology specialty certificate examination multiple choice questions. Clin Exp Dermatol. doi:10.1093/ced/llad197.
Richardson JTE. 2011. The analysis of 2 × 2 contingency tables-Yet again. Stat Med. 30(8):890. doi:10.1002/sim.4116.
Sanderson K. 2023. GPT-4 is here: what scientists think. Nature [Internet]. 615(7954):773. doi:10.1038/d41586-023-00816-5.
Statistics SS. One-way ANOVA calculator, including Tukey HSD [Internet]. https://www.socscistatistics.com/tests/anova/default2.aspx.
Step 1 Exam Content | USMLE. [accessed 2023 Mar 22]. https://www.usmle.org/step-exams/step-1/step-1-exam-content.
Step 2 CK Exam Content | USMLE. [accessed 2023 Mar 22]. https://www.usmle.org/step-exams/step-2-ck/step-2-ck-exam-content.
Step 2 Clinical Knowledge (CK) SAMPLE TEST QUESTIONS. 2023.
Step 3 Exam Content | USMLE. [accessed 2023 Mar 22]. https://www.usmle.org/step-exams/step-3/step-3-exam-content.
Step 3 Sample Questions August 2022. 2022.
Stokel-Walker C. 2022. AI bot ChatGPT writes smart essays—should academics worry? Nature. doi:10.1038/d41586-022-04397-7.
Susnjak T. 2022. ChatGPT: the end of online exam integrity? [Internet]. [accessed 2023 Jan 20]. doi:10.48550/arxiv.2212.09292.
USMLE Step 1 Sample Test Questions. 2022.
Zhai X. 2022. ChatGPT user experience: implications for education. SSRN J. [accessed 2023 Jan 20]. doi:10.2139/ssrn.4312418.
