Development and rigorous validation of antimalarial predictive models using machine learning approaches

Pages 543-560 | Received 18 Apr 2019, Accepted 20 Jun 2019, Published online: 22 Jul 2019
