Opinion Piece

Artificial intelligence: a virtual chemist for natural product drug discovery

Pages 3826-3835 | Received 16 Jan 2023, Accepted 12 May 2023, Published online: 26 May 2023

References

  • Ahrens, E. K. F. (1988). Chemical structures (pp. 97–111). Springer.
  • Aronson, A. R., & Lang, F. M. (2010). An overview of MetaMap: Historical perspective and recent advances. Journal of the American Medical Informatics Association: JAMIA, 17(3), 229–236. https://doi.org/10.1136/jamia.2009.002733
  • Aswad, M., Rayan, M., Abu-Lafi, S., Falah, M., Raiyn, J., Abdallah, Z., & Rayan, A. (2018). Nature is the best source of anti-inflammatory drugs: Indexing natural products for their anti-inflammatory bioactivity. Inflammation Research: Official Journal of the European Histamine Research Society, 67(1), 67–75. https://doi.org/10.1007/s00011-017-1096-5
  • Badal, V. D., Kundrotas, P. J., & Vakser, I. A. (2015). Text mining for protein docking. PLoS Computational Biology, 11(12), e1004630. https://doi.org/10.1371/journal.pcbi.1004630
  • Badal, V. D., Kundrotas, P. J., & Vakser, I. A. (2021). Text mining for modeling of protein complexes enhanced by machine learning. Bioinformatics (Oxford, England), 37(4), 497–505. https://doi.org/10.1093/bioinformatics/btaa823
  • Brown, N., Ertl, P., Lewis, R., Luksch, T., Reker, D., & Schneider, N. (2020). Artificial intelligence in chemistry and drug design. Journal of Computer-Aided Molecular Design, 34(7), 709–715. https://doi.org/10.1007/s10822-020-00317-x
  • Capecchi, A., Probst, D., & Reymond, J. L. (2020). One molecular fingerprint to rule them all: Drugs, biomolecules, and the metabolome. Journal of Cheminformatics, 12, 43.
  • Carhart, R. E., Smith, D. H., & Venkataraghavan, R. (1985). Atom pairs as molecular features in structure-activity studies: Definition and applications. Journal of Chemical Information and Computer Sciences, 25(2), 64–73. https://doi.org/10.1021/ci00046a002
  • Christie, B. D., Leland, B. A., & Nourse, J. G. (1993). Structure searching in chemical databases by direct lookup methods. Journal of Chemical Information and Computer Sciences, 33(4), 545–547. https://doi.org/10.1021/ci00014a004
  • Corley, D. G., & Durley, R. C. (1994). Strategies for database dereplication of natural products. Journal of Natural Products, 57(11), 1484–1490. https://doi.org/10.1021/np50113a002
  • Durant, J. L., Leland, B. A., Henry, D. R., & Nourse, J. G. (2002). Reoptimization of MDL keys for use in drug discovery. Journal of Chemical Information and Computer Sciences, 42(6), 1273–1280. https://doi.org/10.1021/ci010132r
  • Egieyeh, S., Syce, J., Malan, S. F., & Christoffels, A. (2018). Predictive classifier models built from natural products with antimalarial bioactivity using machine learning approach. PloS One, 13(9), e0204644. https://doi.org/10.1371/journal.pone.0204644
  • Friedrich, L., Cingolani, G., Ko, Y.-H., Iaselli, M., Miciaccia, M., Perrone, M. G., Neukirch, K., Bobinger, V., Merk, D., Hofstetter, R. K., Werz, O., Koeberle, A., Scilimati, A., & Schneider, G. (2021). Learning from nature: From a marine natural product to synthetic cyclooxygenase-1 inhibitors by automated de novo design. Advanced Science, 8(16), 2100832. https://doi.org/10.1002/advs.202100832
  • Galvez-Llompart, M., Del Carmen Recio Iglesias, M., Gálvez, J., & García-Domenech, R. (2013). Novel potential agents for ulcerative colitis by molecular topology: Suppression of IL-6 production in Caco-2 and RAW 264.7 cell lines. Molecular Diversity, 17(3), 573–593. https://doi.org/10.1007/s11030-013-9458-6
  • Galvez-Llompart, M., Zanni, R., & García-Domenech, R. (2011). Modeling natural anti-inflammatory compounds by molecular topology. International Journal of Molecular Sciences, 12(12), 9481–9503. https://doi.org/10.3390/ijms12129481
  • García-Domenech, R., Zanni, R., Galvez-Llompart, M., & de Julián-Ortiz, J. V. (2013). Modeling anti-allergic natural compounds by molecular topology. Combinatorial Chemistry & High Throughput Screening, 16(8), 628–635. https://doi.org/10.2174/1386207311316080005
  • Giordanetto, F., & Kihlberg, J. (2014). Macrocyclic drugs and clinical candidates: What can medicinal chemists learn from their properties? Journal of Medicinal Chemistry, 57(2), 278–295. https://doi.org/10.1021/jm400887j
  • Heller, S. R., McNaught, A., Pletnev, I., Stein, S., & Tchekhovskoi, D. (2015). InChI, the IUPAC international chemical identifier. Journal of Cheminformatics, 7, 23.
  • Jeon, J., Kang, S., & Kim, H. U. (2021). Predicting biochemical and physiological effects of natural products from molecular structures using machine learning. Natural Product Reports, 38(11), 1954–1966. https://doi.org/10.1039/d1np00016k
  • Johnston, C. W., Skinnider, M. A., Wyatt, M. A., Li, X., Ranieri, M. R. M., Yang, L., Zechel, D. L., Ma, B., & Magarvey, N. A. (2015). An automated Genomes-to-Natural Products platform (GNP) for the discovery of modular natural products. Nature Communications, 6, 8421. https://doi.org/10.1038/ncomms9421
  • Kleandrova, V. V., & Speck-Planche, A. (2022). PTML modeling for pancreatic cancer research: In silico design of simultaneous multi-protein and multi-cell inhibitors. Biomedicines, 10(2), 491. https://doi.org/10.3390/biomedicines10020491
  • Kleandrova, V. V., Rojas-Vargas, J. A., Scotti, M. T., & Speck-Planche, A. (2022). PTML modeling for peptide discovery: In silico design of non-hemolytic peptides with antihypertensive activity. Molecular Diversity, 26(5), 2523–2534. https://doi.org/10.1007/s11030-021-10350-z
  • Kleandrova, V. V., Scotti, M. T., & Speck-Planche, A. (2021). Computational drug repurposing for antituberculosis therapy: Discovery of multi-strain inhibitors. Antibiotics (Basel), 10(8), 1005. https://doi.org/10.3390/antibiotics10081005
  • Kleandrova, V. V., Scotti, M. T., Scotti, L., & Speck-Planche, A. (2021). Multi-target drug discovery via PTML modeling: Applications to the design of virtual dual inhibitors of CDK4 and HER2. Current Topics in Medicinal Chemistry, 21(7), 661–675. https://doi.org/10.2174/1568026621666210119112845
  • Kleandrova, V. V., Scotti, M. T., Scotti, L., Nayarisseri, A., & Speck-Planche, A. (2020). Cell-based multi-target QSAR model for design of virtual versatile inhibitors of liver cancer cell lines. SAR and QSAR in Environmental Research, 31(11), 815–836. https://doi.org/10.1080/1062936X.2020.1818617
  • Krallinger, M., Leitner, F., & Valencia, A. (2010). In R. Matthiesen (Ed.), Bioinformatics methods in clinical research (pp. 341–382). Humana Press.
  • Krallinger, M., Rabal, O., Lourenço, A., Oyarzabal, J., & Valencia, A. (2017). Information retrieval and text mining technologies for chemistry. Chemical Reviews, 117(12), 7673–7761. https://doi.org/10.1021/acs.chemrev.6b00851
  • Krenn, M., Häse, F., Nigam, A., Friederich, P., & Aspuru-Guzik, A. (2019). Self-Referencing Embedded Strings (SELFIES): A 100% robust molecular string representation. arXiv:1905.13741.
  • Liu, Z., Huang, D., Zheng, S., Song, Y., Liu, B., Sun, J., Niu, Z., Gu, Q., Xu, J., & Xie, L. (2021). Deep learning enables discovery of highly potent anti-osteoporosis natural products. European Journal of Medicinal Chemistry, 210, 112982. https://doi.org/10.1016/j.ejmech.2020.112982
  • Masalha, M., Rayan, M., Adawi, A., Abdallah, Z., & Rayan, A. (2018). Capturing antibacterial natural products with in silico techniques. Molecular Medicine Reports, 18, 763–770.
  • May, B. H., Zhang, A., Lu, Y., Lu, C., & Xue, C. C. L. (2014). The systematic assessment of traditional evidence from the premodern Chinese medical literature: A text-mining approach. Journal of Alternative and Complementary Medicine (New York, N.Y.), 20(12), 937–942. https://doi.org/10.1089/acm.2013.0372
  • Merk, D., Grisoni, F., Friedrich, L., & Schneider, G. (2018). Tuning artificial intelligence on the de novo design of natural-product-inspired retinoid X receptor modulators. Communications Chemistry, 1(1), 68. https://doi.org/10.1038/s42004-018-0068-1
  • Merk, D., Grisoni, F., Friedrich, L., Gelzinyte, E., & Schneider, G. (2018). Scaffold hopping from synthetic RXR modulators by virtual screening and de novo design. MedChemComm, 9(8), 1289–1292. https://doi.org/10.1039/c8md00134k
  • Newman, D. J., & Cragg, G. M. (2020). Natural products as sources of new drugs over the nearly four decades from 01/1981 to 09/2019. Journal of Natural Products, 83(3), 770–803. https://doi.org/10.1021/acs.jnatprod.9b01285
  • Nocedo-Mena, D., Cornelio, C., Camacho-Corona, M. D. R., Garza-González, E., Waksman de Torres, N., Arrasate, S., Sotomayor, N., Lete, E., & González-Díaz, H. (2019). Modeling antibacterial activity with machine learning and fusion of chemical structure information with microorganism metabolic networks. Journal of Chemical Information and Modeling, 59(3), 1109–1120. https://doi.org/10.1021/acs.jcim.9b00034
  • Nugroho, A. E., & Morita, H. J. (2019). Computationally-assisted discovery and structure elucidation of natural products. Journal of Natural Medicines, 73(4), 687–695. https://doi.org/10.1007/s11418-019-01321-8
  • O'Boyle, N., & Dalke, A. (2018). Deep SMILES: An adaptation of SMILES for use in machine-learning of chemical structures. ChemRxiv, https://doi.org/10.26434/chemrxiv.7097960.v1
  • Öztürk, H., Özgür, A., Schwaller, P., Laino, T., & Ozkirimli, E. (2020). Exploring chemical space using natural language processing methodologies for drug discovery. Drug Discovery Today, 25(4), 689–705. https://doi.org/10.1016/j.drudis.2020.01.020
  • Papanikolaou, N., Pavlopoulos, G. A., Theodosiou, T., & Iliopoulos, I. (2015). Protein-protein interaction predictions using text mining methods. Methods (San Diego, Calif.), 74, 47–53. https://doi.org/10.1016/j.ymeth.2014.10.026
  • Pereira, F., Latino, D. A. R. S., & Gaudêncio, S. P. (2014). A chemoinformatics approach to the discovery of lead-like molecules from marine and microbial sources en route to antitumor and antibiotic drugs. Marine Drugs, 12(2), 757–778. https://doi.org/10.3390/md12020757
  • Pereira, F., Latino, D. A. R. S., & Gaudêncio, S. P. (2015). QSAR-assisted virtual screening of lead-like molecules from marine and microbial natural sources for antitumor and antibiotic drug discovery. Molecules (Basel, Switzerland), 20(3), 4848–4873. https://doi.org/10.3390/molecules20034848
  • Periwal, V., Bassler, S., Andrejev, S., Gabrielli, N., Patil, K. R., Typas, A., & Patil, K. R. (2022). Bioactivity assessment of natural compounds using machine learning models trained on target similarity between drugs. PLoS Computational Biology, 18(4), e1010029. https://doi.org/10.1371/journal.pcbi.1010029
  • Plisson, F., & Piggott, A. M. (2019). Predicting blood-brain barrier permeability of marine-derived kinase inhibitors using ensemble classifiers reveals potential hits for neurodegenerative disorders. Marine Drugs, 17(2), 81.
  • Rajan, K., Zielesny, A., & Steinbeck, C. (2020). DECIMER: Towards deep learning for chemical image recognition. Journal of Cheminformatics, 12, 65.
  • Rayan, A., Raiyn, J., & Falah, M. (2017). Nature is the best source of anticancer drugs: Indexing natural products for their anticancer bioactivity. PloS One, 12(11), e0187925. https://doi.org/10.1371/journal.pone.0187925
  • Rayan, M., Abdallah, Z., Abu-Lafi, S., Masalha, M., & Rayan, A. (2019). Indexing natural products for their antifungal activity by filters-based approach: Disclosure of discriminative properties. Current Computer-Aided Drug Design, 15(3), 235–242. https://doi.org/10.2174/1573409914666181017100532
  • Rebholz-Schuhmann, D., Oellrich, A., & Hoehndorf, R. (2012). Text-mining solutions for biomedical research: Enabling integrative biology. Nature Reviews. Genetics, 13(12), 829–839. https://doi.org/10.1038/nrg3337
  • Rodrigues, T. (2017). Harnessing the potential of natural products in drug discovery from a cheminformatics vantage point. Organic & Biomolecular Chemistry, 15(44), 9275–9282. https://doi.org/10.1039/c7ob02193c
  • Rogers, D., & Hahn, M. (2010). Extended-connectivity fingerprints. Journal of Chemical Information and Modeling, 50(5), 742–754. https://doi.org/10.1021/ci100050t
  • Rupp, M., Schroeter, T., Steri, R., Zettl, H., Proschak, E., Hansen, K., Rau, O., Schwarz, O., Müller-Kuhrt, L., Schubert-Zsilavecz, M., Müller, K. R., & Schneider, G. (2010). From machine learning to natural product derivatives that selectively activate transcription factor PPARgamma. ChemMedChem, 5(2), 191–194. https://doi.org/10.1002/cmdc.200900469
  • Rutz, A., Sorokina, M., Galgonek, J., Mietchen, D., Willighagen, E., Gaudry, A., Graham, J. G., Stephan, R., Page, R., Vondrášek, J., Steinbeck, C., Pauli, G. F., Wolfender, J. L., Bisson, J., & Allard, P. M. (2022). The LOTUS initiative for open knowledge management in natural products research. eLife, 11, e70780. https://doi.org/10.7554/eLife.70780
  • Saldívar-González, F. I., Aldas-Bulos, V. D., Medina-Franco, J. L., & Plisson, F. (2022). Natural product drug discovery in the artificial intelligence era. Chemical Science, 13(6), 1526–1546. https://doi.org/10.1039/D1SC04471K
  • Santana, R., Zuluaga, R., Gañán, P., Arrasate, S., Onieva, E., Montemore, M. M., & González-Díaz, H. (2020). PTML model for selection of nanoparticles, anticancer drugs, and vitamins in the design of drug–vitamin nanoparticle release systems for cancer cotherapy. Molecular Pharmaceutics, 17(7), 2612–2627. https://doi.org/10.1021/acs.molpharmaceut.0c00308
  • Schneider, P., Walters, W. P., Plowright, A. T., Sieroka, N., Listgarten, J., Goodnow, R. A., Fisher, J., Jansen, J. M., Duca, J. S., Rush, T. S., Zentgraf, M., Hill, J. E., Krutoholow, E., Kohler, M., Blaney, J., Funatsu, K., Luebkemann, C., & Schneider, G. (2020). Rethinking drug design in the artificial intelligence era. Nature Reviews. Drug Discovery, 19(5), 353–364. https://doi.org/10.1038/s41573-019-0050-3
  • Seo, M., Shin, H. K., Myung, Y., Hwang, S., & No, K. T. (2020). Development of Natural Compound Molecular Fingerprint (NC-MFP) with the Dictionary of Natural Products (DNP) for natural product-based drug development. Journal of Cheminformatics, 12, 6.
  • Shergis, J. L., Wu, L., May, B. H., Zhang, A. L., Guo, X., Lu, C., & Xue, C. C. (2015). Natural products for chronic cough: Text mining the East Asian historical literature for future therapeutics. Chronic Respiratory Disease, 12(3), 204–211. https://doi.org/10.1177/1479972315583043
  • Speck-Planche, A., & Kleandrova, V. V. (2022). Multi-condition QSAR model for the virtual design of chemicals with dual pan-antiviral and anti-cytokine storm profiles. ACS Omega, 7(36), 32119–32130. https://doi.org/10.1021/acsomega.2c03363
  • Speck-Planche, A., & Scotti, M. T. (2019). BET bromodomain inhibitors: Fragment-based in silico design using multi-target QSAR models. Molecular Diversity, 23(3), 555–572. https://doi.org/10.1007/s11030-018-9890-8
  • Speck-Planche, A., Kleandrova, V. V., & Scotti, M. T. (2021). In silico drug repurposing for anti-inflammatory therapy: Virtual search for dual inhibitors of caspase-1 and TNF-alpha. Biomolecules, 11(12), 1832. https://doi.org/10.3390/biom11121832
  • Speck-Planche, A., & Cordeiro, M. N. D. S. (2017). De novo computational design of compounds virtually displaying potent antibacterial activity and desirable in vitro ADMET profiles. Medicinal Chemistry Research, 26(10), 2345–2356. https://doi.org/10.1007/s00044-017-1936-4
  • Steinbeck, C., Han, Y., Kuhn, S., Horlacher, O., Luttmann, E., & Willighagen, E. (2003). The chemistry development kit (CDK): An open-source Java library for chemo- and bioinformatics. Journal of Chemical Information and Computer Sciences, 43(2), 493–500. https://doi.org/10.1021/ci025584y
  • Tenorio-Borroto, E., Castañedo, N., García-Mera, X., Rivadeneira, K., Chagoyán, J. C. V., Pliego, A. B., Munteanu, C. R., & González-Díaz, H. (2019). Perturbation theory machine learning modeling of immunotoxicity for drugs targeting inflammatory cytokines and study of the antimicrobial G1 using cytometric bead arrays. Chemical Research in Toxicology, 32(9), 1811–1823. https://doi.org/10.1021/acs.chemrestox.9b00154
  • Tsai, F. S. (2011). Text mining and visualisation of Protein-Protein Interactions. International Journal of Computational Biology and Drug Design, 4(3), 239–244. https://doi.org/10.1504/IJCBDD.2011.041412
  • Tshitoyan, V., Dagdelen, J., Weston, L., Dunn, A., Rong, Z., Kononova, O., Persson, K. A., Ceder, G., & Jain, A. (2019). Unsupervised word embeddings capture latent knowledge from materials science literature. Nature, 571(7763), 95–98. https://doi.org/10.1038/s41586-019-1335-8
  • Urista, D. V., Carrué, D. B., Otero, I., Arrasate, S., Quevedo-Tumailli, V. F., Gestal, M., González-Díaz, H., & Munteanu, C. R. (2020). Prediction of antimalarial drug-decorated nanoparticle delivery systems with random forest models. Biology (Basel), 9(8), 198. https://doi.org/10.3390/biology9080198
  • Vamathevan, J., Clark, D., Czodrowski, P., Dunham, I., Ferran, E., Lee, G., Li, B., Madabhushi, A., Shah, P., Spitzer, M., & Zhao, S. (2019). Applications of machine learning in drug discovery and development. Nature Reviews. Drug Discovery, 18(6), 463–477. https://doi.org/10.1038/s41573-019-0024-5
  • Walters, W. P., & Murcko, M. (2020). Assessing the impact of generative AI on medicinal chemistry. Nature Biotechnology, 38(2), 143–145. https://doi.org/10.1038/s41587-020-0418-2
  • Weininger, D. (1988). SMILES, a chemical language and information system. 1. Introduction to methodology and encoding rules. Journal of Chemical Information and Computer Sciences, 28(1), 31–36. https://doi.org/10.1021/ci00057a005
  • Yoo, S., Yang, H. C., Lee, S., Shin, J., Min, S., Lee, E., Song, M., & Lee, D. (2020). A deep learning-based approach for identifying the medicinal uses of plant-derived natural compounds. Frontiers in Pharmacology, 11, 584875. https://doi.org/10.3389/fphar.2020.584875
  • Zeidan, M., Rayan, M., Zeidan, N., Falah, M., & Rayan, A. (2017). Indexing natural products for their potential anti-diabetic activity: Filtering and mapping discriminative physicochemical properties. Molecules, 22(9), 1563. https://doi.org/10.3390/molecules22091563
  • Zhang, R., Li, X., Zhang, X., Qin, H., & Xiao, W. (2021). Machine learning approaches for elucidating the biological effects of natural products. Natural Product Reports, 38(2), 346–361. https://doi.org/10.1039/d0np00043d
  • Zhang, W., Pei, J., & Lai, L. (2017). Statistical analysis and prediction of covalent ligand targeted cysteine residues. Journal of Chemical Information and Modeling, 57(3), 403–412. https://doi.org/10.1021/acs.jcim.6b00491
  • Zhavoronkov, A., Ivanenkov, Y. A., Aliper, A., Veselov, M. S., Aladinskiy, V. A., Aladinskaya, A. V., Terentiev, V. A., Polykovskiy, D. A., Kuznetsov, M. D., Asadulaev, A., Volkov, Y., Zholus, A., Shayakhmetov, R. R., Zhebrak, A., Minaeva, L. I., Zagribelnyy, B. A., Lee, L. H., Soll, R., Madge, D., … Aspuru-Guzik, A. (2019). Deep learning enables rapid identification of potent DDR1 kinase inhibitors. Nature Biotechnology, 37(9), 1038–1040. https://doi.org/10.1038/s41587-019-0224-x
