References
- Rifaioglu AS, Atas H, Martin M, et al. Recent applications of deep learning and machine intelligence on in silico drug discovery: methods, tools and databases. Brief Bioinform. 2019;20(5):1878–1912.
- Doğan T, Güzelcan EA, Baumann M, et al. Protein domain-based prediction of compound–target interactions and experimental validation on LIM kinases. PLoS Comput Biol. 2021;17(11):e1009171.
- Doğan T, Atas H, Joshi V, et al. CROssBAR: comprehensive resource of biomedical relations with knowledge graph representations. Nucleic Acids Res. 2021;49(16):e96.
- Rifaioglu AS, Nalbat E, Atalay MV, et al. DEEPScreen: high performance drug-target interaction prediction with convolutional neural networks using 2-D structural compound representations. Chem Sci. 2020;11(9):2531–2557.
- Rifaioglu AS, Cetin-Atalay R, Cansen-Kahraman D, et al. MDeePred: novel multi-channel protein featurization for deep learning based binding affinity prediction in drug discovery. Bioinformatics. 2020;37(5):693–704.
- Yang K, Swanson K, Jin W, et al. Analyzing learned molecular representations for property prediction. J Chem Inf Model. 2019;59(8):3370–3388.
- Doğan T. HPO2GO: prediction of human phenotype ontology term associations for proteins using cross ontology annotation co-occurrences. PeerJ. 2018;6:e5298.
- Rifaioglu AS, Doğan T, Saraç Ö, et al. Large‐scale automated function prediction of protein sequences and an experimental case study validation on PTEN transcript variants. Proteins Struct Funct Bioinf. 2018;86(2):135–151.
- Doğan T, MacDougall A, Saidi R, et al. UniProt-DAAC: domain architecture alignment and classification, a new method for automatic functional annotation in UniProtKB. Bioinformatics. 2016;32(15):2264–2271.
- Malik L, Mejia A, Parsons H, et al. Predicting success in regulatory approval from phase I results. Cancer Chemother Pharmacol. 2014;74(5):1099–1103.
- DiMasi JA, Hermann JC, Twyman K, et al. A tool for predicting regulatory approval after phase II testing of new oncology compounds. Clin Pharmacol Ther. 2015;98(5):506–513.
- Gayvert KM, Madhukar NS, Elemento O. A data-driven approach to predicting successes and failures of clinical trials. Cell Chem Biol. 2016;23:1294–1301.
- Artemov AV, Putin E, Vanhaelen Q, et al. Integrated deep learned transcriptomic and structure-based predictor of clinical trials outcomes. 2016. Accessed on 14 July 2022. 21. Available from: https://www.biorxiv.org/content/10.1101/095653v2.
- Jardim DL, Groves ES, Breitfeld PP, et al. Factors associated with failure of oncology drugs in late-stage clinical development: a systematic review. Cancer Treat Rev. 2017;52:12–21.
- Lo AW, Siah KW, Wong CH. Machine learning with statistical imputation for predicting drug approvals. 2018. Accessed on 14 July 2022. 41. Available frrom https://papers.ssrn.com/sol3/papers.cfm?abstract_id=2973611.
- Feijoo F, Palopoli M, Bernstein J, et al. Key indicators of phase transition for clinical trials through machine learning. Drug Discovery Today. 2020; 25: 414–421.
- Zhavoronkov A, Kudrin R, Tutubalina E, et al. Multimodal AI engine for clinical trials outcome prediction: prospective case study summer 2020. 2020. Accessed on 14 July 2022. 13. Available from: https://www.researchgate.net/publication/342354346_Multimodal_AI_Engine_for_Clinical_Trials_Outcome_Prediction_Prospective_Case_Study_Summer_2020.
- Fu T, Huang K, Xiao C, et al. HINT: hierarchical interaction network for clinical-trial-outcome predictions. Patterns. 2022;3(4):1–12.
- Zarin DA, Tse T, Williams RJ, et al. The ClinicalTrials.gov results database—update and key issues. N Engl J Med. 2011;364(9):852–860.
- Wishart DS, Knox C, Guo AC, et al. DrugBank: a comprehensive resource for in silico drug discovery and exploration. Nucleic Acids Res. 2006;1:34. D668-72.16381955.
- Gaulton A, Hersey A, Nowotka M, et al. The ChEMBL database in 2017. Nucleic Acids Res. 2017;45(D1):D945–D954.
- Papadatos G, Davies M, Chambers J, et al. SureChEMBL: a large-scale, chemically annotated patent document database. Nucleic Acids Res. 2016;44(D1):D1220–8.
- PatentsView database [Internet]. United States Patent and Trademark Office (USPTO). cited 2019 Dec 2]. Available from 2019 Dec 2: https://www.patentsview.org/download/.
- Rogers D, Hahn M. Extended-connectivity fingerprints. J Chem Inf Model. 2010;50(5):742–754.
- Landrum G. RDKit: open-source cheminformatics. 2006. [cited 2022 May 22]. Available from http://www.rdkit.org.
- Valance EH. Understanding the Markush claim in chemical patents. J Chem Doc. 1961;1:87–92.
- Breiman L. Random Forests. Mach Learn. 2001;45(1):5–32.
- Probst P, Wright MN, Boulesteix A. Hyperparameters and tuning strategies for random forest. Wiley Interdisciplinary Reviews: Data Mining and Knowledge Discovery. 2019;9.
- Pedregosa F, Varoquaux G, Gramfort A, et al. Scikit-learn: machine learning in python. J Mach Learn Res. 2011 Oct;12:2825–2830.
- Brunoni AR, Tadini L, Fregni F. Changes in clinical trials methodology over time: a systematic review of six decades of research in psychopharmacology. PLoS One. 2010;5(3):e9479.
- Breiman L, Friedman J, Olshen R. Classification and regression trees. New York (NY): Chapman and Hall; 1984.
- Chicco D, Jurman G. The advantages of the Matthews correlation coefficient (MCC) over F1 score and accuracy in binary classification evaluation. BMC Genomics. 2020;21:1–13.
- Brodersen KH, Ong CN, Stephan KE, et al. The balanced accuracy and its posterior distribution. 20th International Conference on Pattern Recognition (ICPR); cited 2010 Aug 23-26; Istanbul, Turkey. p. 3121–3124.
- Behera B, Kumaravelan G, Kumar BP. Performance evaluation of deep learning algorithms in biomedical document classification. 11th international conference on advanced computing (ICoAC); cited 2019 Dec 18-20; Chennai, India. p. 220–224.
- Opitz J, Burst S. Macro f1 and macro f1. 2021. Accessed on 14 July 2022. 12. Available from: https://arxiv.org/abs/1911.03347.
- Lipinski CA, Lombardo F, Dominy BW, et al. Experimental and computational approaches to estimate solubility and permeability in drug discovery and development settings. Adv Drug Deliv Rev. 2001;46(1–3):3–26.
- Ghose AK, Viswanadhan VN, Wendoloski JJ. A knowledge-based approach in designing combinatorial or medicinal chemistry libraries for drug discovery. A qualitative and quantitative characterization of known drug databases. J Comb Chem. 1999;1(1):55–68.
- Ritchie TJ, Macdonald JF. The impact of aromatic ring count on compound developability – are too many aromatic rings a liability in drug design? Drug Discov Today. 2009;14(21–22):1011–1020.
- Veber DF, Johnson SR, Cheng HY, et al. Molecular properties that influence the oral bioavailability of drug candidates. J Med Chem. 2002;45(12):2615–2623.
- Vistoli G, Pedretti A, Testa B. Assessing drug-likeness–what are we missing? Drug Discov Today. 2008;13(7–8):285–294.
- Leeson PD, Springthorpe B. The influence of drug-like concepts on decision-making in medicinal chemistry. Nat Rev Drug Discov. 2007;26(11):881–890.
- Raymond MR. Missing data in evaluation research. Eval Health Prof. 1986;9(4):395–420.
- Tsikriktsis N. A review of techniques for treating missing data in om survey research. J Oper Manage. 2005;24(1):53–62.
- Tolles J, Meurer WJ. Logistic regression relating patient characteristics to outcomes. Jama. 2016;316(5):533–534.
- Siramshetty VB, Nickel J, Omieczynski C, et al. WITHDRAWN—a resource for withdrawn and discontinued drugs. Nucleic Acids Res. 2016;44(D1):D1080–D1086.
- Zhang X, Zhang Y, Ye X, et al. Overview of phase IV clinical trials for postmarket drug safety surveillance: a status report from the ClinicalTrials.gov registry. BMJ Open. 2016;6:e010643.