4,847
Views
2
CrossRef citations to date
0
Altmetric
Research Article

Presenting artificial intelligence, deep learning, and machine learning studies to clinicians and healthcare stakeholders: an introductory reference with a guideline and a Clinical AI Research (CAIR) checklist proposal

, , , , , , & show all

  • Adamson A S, Smith A. Machine learning and health care disparities in dermatology. JAMA Dermatol 2018; 154(11): 1247.
  • Anderson P, Fernando B, Johnson M, Gould S. SPICE: Semantic Propositional Image Caption Evaluation. arXiv:160708822 [cs] [Internet] 2016 Jul 29 [cited 2020 Nov 30]. Available from: http://arxiv.org/abs/1607.08822
  • Badgeley M A, Zech J R, Oakden-Rayner L, Glicksberg B S, Liu M, Gale W, et al. Deep learning predicts hip fracture using confounding patient and healthcare variables. NPJ Digit Med 2019; 2(1): 1–10.
  • Bandos A I, Obuchowski N A. Evaluation of diagnostic accuracy in free-response detection-localization tasks using ROC tools. Stat Methods Med Res 2019; 28(6): 1808–25.
  • Banerjee S, Lavie A. METEOR: an automatic metric for mt evaluation with improved correlation with human judgments. In: Proceedings of the ACL Workshop on Intrinsic and Extrinsic Evaluation Measures for Machine Translation and/or Summarization [Internet]. Ann Arbor, MI: Association for Computational Linguistics; 2005 [cited 2020 Nov 30]. p. 65–72. Available from: https://www.aclweb.org/anthology/W05-0909
  • Beil M, Proft I, van Heerden D, Sviri S, van Heerden P V. Ethical considerations about artificial intelligence for prognostication in intensive care. Intensive Care Med Exp 2019; 7(1): 70.
  • Chakraborty D P. A brief history of free-response receiver operating characteristic paradigm data analysis. Academic Radiology 2013; 20(7): 915–9.
  • Chicco D, Jurman G. The advantages of the Matthews correlation coefficient (MCC) over F1 score and accuracy in binary classification evaluation. BMC Genomics 2020; 21(1): 6.
  • Esteva A, Kuprel B, Novoa R A, Ko J, Swetter S M, Blau H M, et al. Dermatologist-level classification of skin cancer with deep neural networks. Nature 2017; 542(7639): 115.
  • Felländer-Tsai L. AI ethics, accountability, and sustainability: revisiting the Hippocratic oath. Acta Orthop 2020; 91(1): 1–2.
  • Fleming N. How artificial intelligence is changing drug discovery. Nature 2018; 557(7707): S55–S7.
  • Gale W, Oakden-Rayner L, Carneiro G, Bradley A P, Palmer L J. Producing radiologist-quality reports for interpretable artificial intelligence. arXiv:180600340 [cs] [Internet] 2018 Jun 1 [cited 2020 Nov 30]. Available from: http://arxiv.org/abs/1806.00340
  • Gulshan V, Peng L, Coram M, Stumpe M C, Wu D, Narayanaswamy A, et al. Development and validation of a deep learning algorithm for detection of diabetic retinopathy in retinal fundus photographs. JAMA 2016; 316(22): 2402.
  • Gulshan V, Rajan RP, Widner K, Wu D, Wubbels P, et al. Performance of a deep-learning algorithm vs manual grading for detecting diabetic retinopathy in India. JAMA Opthalmol 2019; 137(9): 987–93.
  • Hedlund J, Eklund A, Lundström C. Key insights in the AIDA community policy on sharing of clinical imaging data for research in Sweden. Scientific Data 2020; 7(1): 331.
  • Kamulegeya L H, Okello M, Bwanika J M, Musinguzi D, Lubega W, Rusoke D, et al. Using artificial intelligence on dermatology conditions in Uganda: a case for diversity in training data sets for machine learning. bioRxiv 2019 Oct 31; 826057.
  • Kougia V, Pavlopoulos J, Androutsopoulos I. A survey on biomedical image captioning. arXiv:190513302 [cs] [Internet] 2019 May 26 [cited 2020 Dec 1]. Available from: http://arxiv.org/abs/1905.13302
  • Lin C-Y. ROUGE: a package for automatic evaluation of summaries. In: Text Summarization Branches Out [Internet]. Barcelona, Spain: Association for Computational Linguistics; 2004 [cited 2020 Nov 30]. p. 74–81. Available from: https://www.aclweb.org/anthology/W04-1013
  • Liu X, Cruz Rivera S, Moher D, Calvert M J, Denniston A K. Reporting guidelines for clinical trial reports for interventions involving artificial intelligence: the CONSORT-AI extension. Nat Med 2020a; 26(9): 1364–74.
  • Liu X, Rivera S C, Faes L, Keane P A, Moher D, Calvert M, et al. CONSORT-AI and SPIRIT-AI: new reporting guidelines for clinical trials and trial protocols for artificial intelligence interventions. Invest Ophthalmol Vis Sci 2020b; 61(7): 1617–1617.
  • Lupton M. Some ethical and legal consequences of the application of artificial intelligence in the field of medicine. Trends Med [Internet] 2018 [cited 2020 Oct 17]; 18(4). Available from: https://www.oatext.com/some-ethical-and-legal-consequences-of-the-application-of-artificial-intelligence-in-the-field-of-medicine.php
  • Manning C. Artificial intelligence definitions [Internet] 2020 [cited 2020 Nov 26]. Available from: https://hai.stanford.edu/sites/default/files/2020-09/AI-Definitions-HAI.pdf
  • Michie D, Spiegelhalter D J, Taylor C C. Machine learning, neural and statistical classification 1994 [cited 2016 Dec 7]. Available from: http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.27.355
  • Mittelstadt B D, Allo P, Taddeo M, Wachter S, Floridi L. The ethics of algorithms: mapping the debate. Big Data & Society 2016; 3(2): 2053951716679679.
  • Nicolaes J, Raeymaeckers S, Robben D, Wilms G, Vandermeulen D, Libanati C, et al. Detection of vertebral fractures in CT using 3D convolutional neural networks. arXiv:191101816 [cs, eess] [Internet] 2019 Nov 5 [cited 2020 Oct 9]. Available from: http://arxiv.org/abs/1911.01816
  • Obuchowski N A, Lieber M L, Powell K A. Data analysis for detection and localization of multiple abnormalities with application to mammography. Acad Radiol 2000; 7(7): 516–25.
  • Olczak J, Emilson F, Razavian A, Antonsson T, Stark A, Gordon M. Ankle fracture classification using deep learning: automating detailed AO Foundation/Orthopedic Trauma Association (AO/OTA) 2018 malleolar fracture identification reaches a high degree of correct classification. Acta Orthop 2021; 92(1): 102–8.
  • Pandis N, Fedorowicz Z. The International EQUATOR Network: enhancing the quality and transparency of health care research. J Appl Oral Sci 2011; 19(5): 0.
  • Papineni K, Roukos S, Ward T, Zhu W-J. BLEU: a method for automatic evaluation of machine translation. In: Proceedings of the 40th Annual Meeting on Association for Computational Linguistics [Internet]. Stroudsburg, PA: Association for Computational Linguistics; 2002 [cited 2016 Nov 3]. p. 311-318. (ACL ’02). Available from: http://dx.doi.org/10.3115/1073083.1073135
  • Paul D, Sanap G, Shenoy S, Kalyane D, Kalia K, Tekade R K. Artificial intelligence in drug discovery and development. Drug Discovery Today [Internet] 2020 Oct 21 [cited 2020 Nov 26]. Available from: http://www.sciencedirect.com/science/article/pii/S1359644620304256
  • Pocevičiūtė M, Eilertsen G, Lundström C. Survey of XAI in digital pathology. In: Holzinger A, Goebel R, Mengel M, Müller H, editors. Artificial intelligence and machine learning for digital pathology. Lecture Notes in Computer Science, vol. 12090, 2020.
  • Qi Y, Zhao J, Shi Y, Zuo G, Zhang H, Long Y, et al. Ground truth annotated femoral X-ray image dataset and object detection based method for fracture types classification. IEEE Access 2020; 8: 189436–44.
  • Rivera S C, Liu X, Chan A-W, Denniston A K, Calvert M J, Ashrafian H, et al. Guidelines for clinical trial protocols for interventions involving artificial intelligence: the SPIRIT-AI extension. Lancet Digital Health 2020; 2(10): e549-60.
  • Rudin C. Stop explaining black box machine learning models for high stakes decisions and use interpretable models instead. Nature Machine Intelligence 2019; 1(5): 206–15.
  • Saito T, Rehmsmeier M. The precision-recall plot is more informative than the ROC plot when evaluating binary classifiers on imbalanced datasets. PLoS One 2015; 10(3): e0118432.
  • Sendak M P, Gao M, Brajer N, et al. Presenting machine learning model information to clinical end users with model facts labels. NPJ Digit Med 2020; 3(41). https://doi.org/https://doi.org/10.1038/s41746-020-0253-3
  • Shim E, Kim J Y, Yoon J P, Ki S-Y, Lho T, Kim Y, et al. Automated rotator cuff tear classification using 3D convolutional neural network. Scientific Reports 2020; 10(1): 15632.
  • Vedantam R, Zitnick C L, Parikh D. CIDEr: fonsensus-based image description evaluation. In: 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) 2015. p. 4566–75.