Original Research

Machine learning-based prediction of drug approvals using molecular, physicochemical, clinical trial, and patent-related features

Pages 1425-1441 | Received 14 Jul 2022, Accepted 28 Nov 2022, Published online: 13 Dec 2022
