Review

Knowledge graphs and their applications in drug discovery

Pages 1057-1069 | Received 10 Jan 2021, Accepted 26 Mar 2021, Published online: 12 Apr 2021

References

  • Total global pharmaceutical R&D spending 2012–2026. [cited 2021 Jul 03]. Available from: https://www.statista.com/statistics/309466/global-r-and-d-expenditure-for-pharmaceuticals
  • 2020 FDA drug approvals. [cited 2021 Jul 03]. Available from: https://www.nature.com/articles/d41573-021-00002-0
  • Ten years on: measuring the return from pharmaceutical innovation 2019. [cited 2021 Jul 03]. Available from: https://www2.deloitte.com/us/en/pages/life-sciences-and-health-care/articles/measuring-return-from-pharmaceutical-innovation.html
  • Collins FS, Morgan M, Patrinos A. The human genome project: lessons from large-scale biology. Science. 2003;300(5617):286–290.
  • 1000 Genomes Project Consortium. A map of human genome variation from population-scale sequencing. Nature. 2010;467(7319):1061.
  • Canela-Xandri O, Rawlik K, Tenesa A. An atlas of genetic associations in UK biobank. Nat Genet. 2018;50(11):1593–1599.
  • Leinonen R, Sugawara H, Shumway M, et al. The sequence read archive. Nucleic Acids Res. 2010;39(suppl 1):D19–D21.
  • Leinonen R, Akhtar R, Birney E, et al. The european nucleotide archive. Nucleic Acids Res. 2010;39(suppl 1):D28–D31.
  • Ponten F, Jirstrom K, Uhlen M. The human protein atlas—a tool for pathology. J Pathol. 2008;216(4):387–393.
  • GTEx Consortium. The genotype-tissue expression (gtex) pilot analysis: multitissue gene regulation in humans. Science. 2015;348(6235):648–660.
  • Stathias V, Turner J, Koleti A, et al. Lincs data portal 2.0: next generation access point for perturbation-response signatures. Nucleic Acids Res. 2020;48(D1):D431–D439.
  • Tomczak K, Czerwinska P, Wiznerowicz M. The cancer genome atlas (tcga): an immeasurable source of knowledge. Contemp Oncol. 2015;19(1A):A68.
  • Ghandi M, Huang FW, Jane-Valbuena J, et al. Next-generation characterization of the cancer cell line encyclopedia. Nature. 2019;569(7757):503–508.
  • Tsherniak A, Vazquez F, Montgomery PG, et al. Defining a cancer dependency map. Cell. 2017;170(3):564–576.
  • Chadwick LH. The NIH roadmap epigenomics program data resource. Epigenomics. 2012;4(3):317–324.
  • Kozomara A, Griffiths-Jones S. MiRBase: integrating microrna annotation and deep-sequencing data. Nucleic Acids Res. 2010;39(suppl 1):D152–D157.
  • Volders P-J, Helsens K, Wang X, et al. Lncipedia: a database for annotated human lncrna transcript sequences and structures. Nucleic Acids Res. 2013;41(D1):D246–D251.
  • Cui T, Zhang L, Huang Y, et al. MNDR v2.0: an updated resource of ncrna–disease associations in mammals. Nucleic Acids Res. 2018;46(D1):D371–D374.
  • Earm K, Earm YE. Integrative approach in the era of failing drug discovery and development. Integr Med Res. 2014;3(4):211–216.
  • Rago L, Santoso B. Drug regulation: history, present and future. Drug Benefit Risks. 2008;2:65–77.
  • Novartis CEO who wanted to bring tech into pharma now explains why it’s so hard. [cited 2020 Sep 30]. Available from: https://www.forbes.com/sites/davidshaywitz/2019/01/16/novartis-ceo-who-wanted-to-bring-tech-into-pharma-now-explains-why-its-so-hard
  • Wilkinson MD, Dumontier M, Aalbersberg IJ, et al. The fair guiding principles for scientific data management and stewardship. Sci Data. 2016;3(1):1–9.
  • Iams WT, Lovly CM. Molecular pathways: clinical applications and future direction of insulin-like growth factor-1 receptor pathway blockade. Clin Cancer Res. 2015;21(19):4270–4277.
  • Rossi A, Firmani D, Matinata A, et al. Knowledge graph embedding for link prediction: a comparative analysis. arXiv Preprint arXiv:2002.00819. 2020.
  • Zou X. A survey on application of knowledge graph. JPhCS. 2020;1487(1):012016.
  • Gao Y, Li Y-F, Lin Y, et al. Deep learning on knowledge graph for recommender system: a survey. arXiv Preprint arXiv:2004.00387. 2020.
  • Neo4j graph database. [cited 2021 Sep 12]. Available from: https://neo4j.com
  • Himmelstein DS, Lizee A, Hessler C, et al. Systematic integration of biomedical knowledge prioritizes drugs for repurposing. Elife. 2017;6:e26726.
  • Himmelstein DS, Baranzini SE. Heterogeneous network edge prediction: a data integration approach to prioritize disease-associated genes. PLoS Comput Biol. 2015;11(7):e1004259.
  • Breit A, Ott S, Agibetov A, et al. OpenBioLink: a benchmarking framework for large-scale biomedical link prediction. arXiv Preprint arXiv:1912.04616. 2019.
  • Womack F, McClelland J, Koslicki D. Leveraging distributed biomedical knowledge sources to discover novel uses for known drugs. bioRxiv. 2019;765305.
  • Percha B, Altman RB. A global network of biomedical relationships derived from text. Bioinformatics. 2018;34(15):2614–2624.
  • Ioannidis VN, Song X, Manchanda S, et al. Drkg-drug repurposing knowledge graph for COVID-19. arXiv. 2020.
  • Belleau F, Nolin M-A, Tourigny N, et al. Bio2rdf: towards a mashup to build bioinformatics knowledge systems. J Biomed Inform. 2008;41(5):706–716.
  • Chen B, Dong X, Jiao D, et al. Chem2bio2rdf: a semantic framework for linking and data mining chemogenomic and systems chemical biology data. BMC Bioinformatics. 2010;11(1):255.
  • Yue X, Wang Z, Huang J, et al. Graph embedding on biomedical networks: methods, applications and evaluations. Bioinformatics. 2020;36(4):1241–1251.
  • Gao F, Musial K, Cooper C, et al. Link prediction methods and their accuracy for different social networks and network metrics. Sci Programm. 2015;2015:1–13.
  • Cai H, Zheng VW, Chang K-C-C. A comprehensive survey of graph embedding: problems, techniques, and applications. IEEE Trans Knowledge Data Eng. 2018;30(9):1616–1637.
  • Xia X. Knowledge graph embedding methodologies. [cited 2020 Jul 03]. Available from: https://github.com/xinguoxia/KGE#methodologies
  • Hodos RA, Kidd BA, Khader S, et al. Computational approaches to drug repurposing and pharmacology. Wiley Interdiscip Rev Syst Biol Med. 2016;8(3):186.
  • Talevi A, Bellera CL. Challenges and opportunities with drug repurposing: finding strategies to find alternative uses of therapeutics. Expert Opin Drug Discov. 2020;15(4):397–401.
  • Wang L, Lei Y, Gao Y, et al. Association of finasteride with prostate cancer: a systematic review and meta-analysis. Medicine (Baltimore). 2020;99(15):e19486.
  • Jain P, Jain SK, Jain M. Harnessing drug repurposing for exploration of new diseases: an insight to strategies and case studies. Curr Mol Med. 2020;20. DOI: https://doi.org/10.2174/1566524020666200619125404
  • Ganzer CA, Jacobs AR, Iqbal F. Persistent sexual, emotional, and cognitive impairment post-finasteride: a survey of men reporting symptoms. Am J Men’s Health. 2015;9(3):222–228.
  • Poleksic A. Overcoming sparseness of biomedical networks to identify drug repositioning candidates. bioRxiv. 2020.
  • Sosa DN, Derry A, Guo M, et al. A literature-based knowledge graph embedding method for identifying drug repurposing opportunities in rare diseases. bioRxiv. 2019;727925.
  • Xu B, Liu Y, Yu S, et al. A network embedding model for pathogenic genes prediction by multi-path random walking on heterogeneous network. BMC Med Genomics. 2019;12(10):188.
  • Gaudelet T, Day B, Jamasb AR, et al. Utilising graph machine learning within drug discovery and development. arXiv Preprint arXiv:2012.05716. 2020.
  • Paliwal S, De Giorgio A, Neil D, et al. Preclinical validation of therapeutic targets predicted by tensor factorization on heterogeneous graphs. Sci Rep. 2020;10(1):1–19.
  • Amaral PP, Dinger ME, Mattick JS. Non-coding rnas in homeostasis, disease and stress responses: an evolutionary perspective. Brief Funct Genomics. 2013;12(3):254–278.
  • Ji B-Y, You Z-H, Cheng L, et al. Predicting mirna-disease association from heterogeneous information network with grarep embedding model. Sci Rep. 2020;10(1):1–12.
  • Zhou J-R, You Z-H, Cheng L, et al. Prediction of lncrna–disease associations via an embedding learning hope in heterogeneous information networks. Mol Ther Nucleic Acids. 2020;23:277–285.
  • Zheng Y, Peng H, Zhang X, et al. Old drug repositioning and new drug discovery through similarity learning from drug-target joint feature spaces. BMC Bioinformatics. 2019;20(23):605.
  • Luo Y, Zhao X, Zhou J, et al. A network integration approach for drug-target interaction prediction and computational drug repositioning from heterogeneous information. Nat Commun. 2017;8(1):1–13.
  • Lim H, Gray P, Xie L, et al. Improved genome-scale multi-target virtual screening via a novel collaborative filtering approach to cold-start problem. Sci Rep. 2016;6(1):1–11.
  • Ba-Alawi W, Soufan O, Essack M, et al. Daspfind: new efficient method to predict drug–target interactions. J Cheminform. 2016;8(1):15.
  • Mizutani S, Pauwels E, Stoven V, et al. Relating drug–protein interaction network with drug side effects. Bioinformatics. 2012;28(18):i522–i528.
  • Wan F, Hong L, Xiao A, et al. Neodti: neural integration of neighbor information from a heterogeneous network for discovering new drug–target interactions. Bioinformatics. 2019;35(1):104–111.
  • Huang K, Fu T, Xiao C, et al. Deeppurpose: a deep learning based drug repurposing toolkit. arXiv Preprint arXiv:2004.08919. 2020.
  • Wallach I, Dzamba M, Heifets A. Atomnet: a deep convolutional neural network for bioactivity prediction in structure-based drug discovery. arXiv Preprint arXiv:1510.02855. 2015.
  • Senior AW, Evans R, Jumper J, et al. Improved protein structure prediction using potentials from deep learning. Nature. 2020;577(7792):706–710.
  • Reese JT, Unni DR, Callahan TJ, et al. KG-COVID-19: a framework to produce customized knowledge graphs for covid-19 response. Patterns. 2020;2(1):100155.
  • Zhou Y, Hou Y, Shen J, et al. Network-based drug repurposing for novel coronavirus 2019-ncov/sars-cov 2. Cell Discov. 2020;6(1):1–18.
  • Wang LL, Lo K, Chandrasekhar Y, et al. Cord-19: the covid-19 open research dataset. ArXiv. 2020.
  • Hsieh K, Wang Y, Chen L, et al. Drug repurposing for covid-19 using graph neural network with genetic, mechanistic, and epidemiological validation. arXiv Preprint arXiv:2009.10931. 2020.
  • Gysi DM, Valle ID, Zitnik M, et al. Network medicine framework for identifying drug repurposing opportunities for covid-19. arXiv Preprint arXiv:2004.07229. 2020.
  • Gasmi A, Tippairote T, Mujawdiya PK, et al. Neurological involvements of sars-cov2 infection. Mol Neurobiol. 202
  • Stebbing J, Phelan A, Griffin I, et al. Covid-19: combining antiviral and anti-inflammatory treatments. Lancet Infect Dis. 2020;20(4):400–402.
  • Baricitinib receives emergency use authorization from the FDA for the treatment of hospitalized patients with COVID-19. [cited 2021 Jan 02]. Available from: https://investor.lilly.com/news-releases/news-release-details/baricitinib-receives-emergency-use-authorization-fda-treatment
  • Kuchaiev O, Rasajski M, Higham DJ, et al. Geometric de-noising of protein-protein interaction networks. PLoS Comput Biol. 2009;5(8):e1000454.
  • Xiao Z, Deng Y. Graph embedding-based novel protein interaction prediction via higher-order graph convolutional network. PloS One. 2020;15(9):e0238915.
  • Yang F, Fan K, Song D, et al. Graph-based prediction of protein-protein interactions with attributed signed graph embedding. BMC Bioinformatics. 2020;21(1):1–16.
  • Zitnik M, Agrawal M, Leskovec J. Modeling polypharmacy side effects with graph convolutional networks. Bioinformatics. 2018;34(13):i457–i466.
  • Lim H, Poleksic A, Xie L. Exploring landscape of drug-target-pathway-side effect associations. AMIA Summits Translat Sci Proceed. 2018:132–141.
  • Zhang W, Chen Y, Liu F, et al. Predicting potential drug-drug interactions by integrating chemical, biological, phenotypic and network data. BMC Bioinformatics. 2017;18(1):18.
  • Su C, Tong J, Zhu Y, et al. Network embedding in biomedical data science. Brief Bioinform. 2020;21(1):182–197.
  • Sangar V, Blankenberg DJ, Altman N, et al. Quantitative sequence-function relationships in proteins based on gene ontology. BMC Bioinformatics. 2007;8(1):294.
  • Grover A, Leskovec J. node2vec: scalable feature learning for networks. In: Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining; 2016; USA. p. 855–864.
  • Stark C, Breitkreutz B-J, Reguly T, et al. Biogrid: a general repository for interaction datasets. Nucleic Acids Res. 2006;34(suppl 1):D535–D539.
  • Kulmanov M, Khan MA, Hoehndorf R. Deepgo: predicting protein functions from sequence and interactions using a deep ontology aware classifier. Bioinformatics. 2018;34(4):660–668.
  • Nariai N, Kolaczyk ED, Kasif S. Probabilistic protein function prediction from heterogeneous genome-wide data. Plos One. 2007;2(3):e337.
  • Makrodimitris S, Van Ham RC, Reinders MJ. Automatic gene function prediction in the 2020’s. Genes (Basel). 2020;11(11):1264.
  • Goymer P. Why do we need hubs? Nat Rev Genet. 2008;9(9):651.
  • Chen S-J, Liao D-L, Chen C-H, et al. Construction and analysis of protein-protein interaction network of heroin use disorder. Sci Rep. 2019;9(1):1–9.
  • Dai W, Chang Q, Peng W, et al. Network embedding the protein–protein interaction network for human essential genes identification. Genes (Basel). 2020;11(2):153.
  • Lefranc F, Tabanca N, Kiss R. Assessing the anticancer effects associated with food products and/or nutraceuticals using in vitro and in vivo preclinical development-related pharmacological tests. In: Seminars in cancer biology. Vol. 46. Elsevier; 2017. p. 14–32.
  • Veselkov K, Gonzalez G, Aljifri S, et al. Hyperfoods: machine intelligent mapping of cancer-beating molecules in foods. Sci Rep. 2019;9(1):1–12.
  • Du J, Jia P, Dai Y, et al. Gene2vec: distributed representation of genes based on co-expression. BMC Genomics. 2019;20(1):7–15.
  • Goh K-I, Cusick ME, Valle D, et al. The human disease network. Proc Natl Acad Sci USA. 2007;104(21):8685–8690.
  • Cantini L, Medico E, Fortunato S, et al. Detection of gene communities in multi-networks reveals cancer drivers. Sci Rep. 2015;5(1):17386.
  • Zietz M, Himmelstein DS, Kloster K, et al. The probability of edge existence due to node degree: a baseline for network-based predictions. Manubot, Tech Rep. 2020.
  • Wishart DS, Knox C, Guo AC, et al. DrugBank: a knowledgebase for drugs, drug actions and drug targets. Nucleic Acids Res. 2008;36(suppl 1):D901–D906.
  • Avram S, Bologa CG, Holmes J, et al. DrugCentral 2021 supports drug discovery and repositioning. Nucleic Acids Res. 2021;49(D1):D1160–D1169.
  • Edwards AM, Isserlin R, Bader GD, et al. Too many roads not taken. Nature. 2011;470(7333):163–165.
  • Oprea TI, Bologa CG, Brunak S, et al. Unexplored therapeutic opportunities in the human genome. Nat Rev Drug Discov. 2018;17(5):317.
  • Hutchison CA, Chuang R-Y, Noskov VN, et al. Design and synthesis of a minimal bacterial genome. Science. 2016;351(6280):6280.
  • Feng R, Yang Y, Hu W, et al. Representation learning for scale-free networks. arXiv Preprint arXiv:1711.10755. 2017.
  • Kang B, Lijffijt J, Bie TD. Conditional network embeddings. arXiv Preprint arXiv:1805.07544. 2018.
  • Buyl M, De Bie T. Debayes: a bayesian method for debiasing network embeddings. arXiv Preprint arXiv:2002.11442. 2020.
  • Lerer A, Wu L, Shen J, et al. Pytorch-biggraph: a large-scale graph embedding system. arXiv Preprint arXiv:1903.12287. 2019.
  • Zheng D, Song X, Ma C, et al. Dgl-ke: training knowledge graph embeddings at scale. arXiv Preprint arXiv:2004.08532. 2020.
  • Hamilton WL, Bajaj P, Zitnik M, et al. Embedding logical queries on knowledge graphs. arXiv Preprint arXiv:1806.01445. 2018.
  • Lee B, Zhang S, Poleksic A, et al. Heterogeneous multi-layered network model for omics data integration and analysis. Front Genet. 2020;10:1381.
  • Lin XV, Socher R, Xiong C. Multi-hop knowledge graph reasoning with reward shaping. arXiv Preprint arXiv:1808.10568. 2018.
  • Bishop JM. Artificial intelligence is stupid and causal reasoning won’t fix it. arXiv Preprint arXiv:2008.07371. 2020.
  • Liu A, Trairatphisan P, Gjerga E, et al. From expression footprints to causal pathways: contextualizing large signaling networks with carnival. NPJ Syst Biol Appl. 2019;5(1):1–10.
  • Rivas-Barragan D, Mubeen S, Guim-Bernat F, et al. Drug2ways: reasoning over causal paths in biological networks for drug discovery. bioRxiv. 2020.
  • Vidal M, Cusick ME, Barabasi A-L. Interactome networks and human disease. Cell. 2011;144(6):986–998.
  • Broido AD, Clauset A. Scale-free networks are rare. Nat Commun. 2019;10(1):1–10.
  • Dorogovtsev S, Mendes J, Samukhin A. Generic scale of the “scale-free” growing networks. arXiv Preprint cond-mat/0011115. 2000.
  • Rohani N, Eslahchi C. Drug-drug interaction predicting by neural network using integrated similarity. Sci Rep. 2019;9(1):1–11.
  • Wouters OJ, McKee M, Luyten J. Estimated research and development investment needed to bring a new medicine to market, 2009–2018. Jama. 2020;323(9):844–853.
  • Mohs RC, Greig NH. Drug discovery and development: role of basic biological research. Alzheimers Dementia. 2017;3(4):651–657.
  • Xue S, Lu J, Zhang G. Cross-domain network representations. Pattern Recogn. 2019;94:135–148.
