Review

Machine Learning and Artificial Intelligence for the Prediction of Host–Pathogen Interactions: A Viral Case

Pages 3319-3326 | Published online: 20 Aug 2021

Abstract

Research on interactions between pathogens and their hosts is key to understanding the biology of infection. Starting at the level of individual molecules, these interactions define the behavior of infectious agents and the outcomes they elicit. Discovery of host–pathogen interactions (HPIs) conventionally involves a laborious stepwise research process. Yet amid the global pandemic, the need to accelerate discovery through novel computational methodologies has become ever more pressing. This review explores the challenges of HPI discovery and investigates the efforts currently undertaken to apply the latest machine learning (ML) and artificial intelligence (AI) methodologies to this field. This includes applications to molecular and genetic data, as well as image and language data. Furthermore, a number of breakthroughs and obstacles, along with the prospects of AI for HPI research, are discussed.

Introduction

The causative agents of infectious diseases come in a great variety of shapes, biochemistry, and genetic makeup. They originate from a variety of population reservoirs and only cross species barriers on occasion,Citation1,Citation2 due to subtle changes in the dynamic ecological equilibrium. In the face of climate change of as yet unseen proportions, shifts in ecological balances will inevitably cause the emergence of new infectious diseases.Citation3,Citation4 This makes further pandemics highly plausible. To tackle the outbreaks of the future, we must improve our understanding of infectious diseases caused by the interactions between a pathogen and a host. In this review we explore how novel techniques for computational analysis of these interactions may foster such understanding. We focus primarily on viruses; however, some aspects of interactions between viruses and their hosts may be generalized to other pathogens.

Viruses require host cells to replicate and spread their progeny. For this, they enter cells, replicate and egress in a stepwise process occurring through interactions between molecules of the host cell and molecules of the pathogen. Such interactions are commonly referred to as host–pathogen interactions (HPI).Citation5 HPIs mediate the exploitation of various host mechanisms by the pathogen. Typically, this occurs through direct interactions between the molecules of the pathogen and the molecules of the host cell (Figure 1A). For example, the SARS-CoV2 virus enters human host cells through the binding of its S protein to angiotensin-converting enzyme 2 (ACE-2) on the cell surface. Next, in brief, virus particles are endocytosed, fuse with endosomes and uncoat their genome, followed by primary protein translation and viral RNA synthesis. Subsequently, the virus forms replication factories, produces progeny building blocks, and assembles progeny, which then egress from the infected host cell.Citation6 On a mechanistic level this process involves dozens of HPIs, which give the virus its advantages and may be exploited as antiviral strategies. Additionally, these mechanisms vary greatly for other virusesCitation7 (Figure 1B) and even for other human coronaviruses (HCoV).

Figure 1 Schematic overview of host–pathogen interactions. (A) Simplified depiction of a pathogen (green hexagon in blue circle) surface protein (red circle) binding to a receptor (black Y-shape) on a host cell surface. (B) A generalized and simplified overview of the pathogen life-cycle stages involving interactions with the host cell.

For example, the S proteins of the predominant seasonal HCoV-OC43 (Betacoronavirus 1) and HCoV-229E (Alphacoronavirus)Citation8,Citation9 bind to 9-O-acetylsialic acids and aminopeptidase N, respectively, as primary receptors in human cells rather than ACE-2. Similarly, the primary receptor of HCoV-HKU1 is 9-O-acetylsialic acids rather than ACE-2.Citation10 In other words, human pathogenic viruses constantly change and adapt. Therefore, understanding the biological mechanisms of virus entry, replication, and egress may allow the development of general strategies for fighting infectious diseases. However, identifying HPIs on a mechanistic level is often a laborious manual experimental task. Furthermore, due to the ability of the virus to change and adapt, deciphering HPI mechanisms in a timely fashion is akin to chasing a moving target. In the currently ongoing SARS-CoV2 outbreak, rapidly identifying important mechanistic changes in the emerging virus variants is a matter of life and death. Perhaps the most promising tools humanity currently has for tackling laborious manual tasks are machine learning (ML) and artificial intelligence (AI). While experimental validation remains a must, these computational techniques may significantly narrow down the number of experiments required for HPI identification by predicting putative interaction partners. Such HPI prediction tasks include predicting host–pathogen protein–protein interactionsCitation11,Citation12 (e.g., using protein sequences and infection phenotypesCitation13) and predicting a putative host or receptor for a specific pathogen.Citation14

Historically, the field of AI originates from an attempt to create fundamental and applied basics for machines with “intelligent properties”.Citation15 ML, often considered an AI subfield, has significantly facilitated AI by providing a tangible toolset. Conversely, the traction some ML algorithms like artificial neural networks have gained in recent years may be attributed to a large extent to the effort to make progress in AI. Taken together, these methodologies refer to a group of computational techniques that enable computers to perform specific tasks without the use of explicit rules or instructions.Citation16 This is usually accomplished by creating ML models from real-world or computer-simulated data. Typically, an ML model is trained on a data set of engineered features (e.g., cell size, signal intensity, etc.) in a supervised or unsupervised manner. Conventionally, a supervised data set consists of the features (X) and targets (Y), where Y corresponds to an objectively known property (i.e., the ground truth). An approach like this is well suited to automating biological data processing, where formulating a detailed and finite set of rules denoting related events is difficult.Citation17–Citation19
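The supervised setting described above can be illustrated with a minimal sketch. The feature values, the two class labels and the nearest-centroid rule below are illustrative assumptions chosen for brevity; they are not taken from any of the cited studies, and real feature sets would normally be scaled before distances are compared.

```python
from statistics import mean

# Hypothetical engineered features per cell: (cell area, mean signal intensity).
X = [(210.0, 0.15), (190.0, 0.22), (205.0, 0.18),   # uninfected cells
     (320.0, 0.81), (340.0, 0.77), (310.0, 0.90)]   # infected cells
# Ground-truth targets Y: 0 = uninfected, 1 = infected.
Y = [0, 0, 0, 1, 1, 1]


def fit_centroids(features, targets):
    """Return the per-class mean feature vector, which serves as the model."""
    centroids = {}
    for label in set(targets):
        rows = [x for x, y in zip(features, targets) if y == label]
        centroids[label] = tuple(mean(column) for column in zip(*rows))
    return centroids


def predict(centroids, x):
    """Assign x to the class with the nearest centroid (squared Euclidean distance)."""
    def squared_distance(a, b):
        return sum((ai - bi) ** 2 for ai, bi in zip(a, b))
    return min(centroids, key=lambda label: squared_distance(centroids[label], x))


model = fit_centroids(X, Y)
prediction = predict(model, (330.0, 0.85))  # an unseen cell
```

The model is learned from labelled examples alone: no explicit rule such as "area above 300 means infected" is ever written down.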

Beyond classical ML, a subset of approaches named Deep Learning (DL) uses algorithms like deep artificial neural networks (DNN) to provide a methodology for pattern recognition with unprecedented accuracy.Citation20 This is achieved through a combination of approaches including representation learning, which allows automated feature generation,Citation21 as well as stacking multiple hidden layers (modules) of artificial neural networks. Combining various kinds of connections and linear algebra operations between individual neurons in a layer makes it possible to construct a broad variety of DNN layers including, for example, convolutional layers,Citation22 recurrent layers,Citation23 attention layers,Citation24 and sigmoid or softmax classification layers. Modern DNN architectures reach an expressive capacity of billions or even trillionsCitation25 of trainable parameters.Citation26 Notably, as a rule of thumb, the larger the expressive capacity of a DNN, the more data points are required to train it while avoiding overfitting. This constrains design choices in domain-specific fields like HPI analysis. The practicality of training such advanced architectures relies on modern parallel computing, which allows DL to represent the features of the input data (e.g., micrographs or CT scans) through non-linear transformations in a so-called latent vector space. Recently, adoption of these methodologies has dramatically increased in the field of Infection Biology. Applications of AI, ML and DL span molecular and genetic data (Figure 2A), microscopy (Figure 2B) and language data (Figure 2C). Here we will review specific examples of these applications and discuss the future outlook for ML and AI for the prediction of HPIs.
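As a small illustration of the classification layers mentioned above, the following sketch computes sigmoid and softmax outputs in plain Python. The input values (logits) are arbitrary assumptions; in a trained network they would be produced by the preceding layers.

```python
import math


def sigmoid(z):
    """Map a single logit to a probability in (0, 1), written to avoid overflow."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def softmax(logits):
    """Map a list of logits to class probabilities that sum to 1."""
    largest = max(logits)
    exps = [math.exp(z - largest) for z in logits]  # shift for numerical stability
    total = sum(exps)
    return [e / total for e in exps]


logits = [2.0, 0.5, -1.0]       # hypothetical scores for three classes
probabilities = softmax(logits)
binary_probability = sigmoid(0.0)
```

Sigmoid is typically used for a binary decision (e.g., infected or not), while softmax distributes probability over several mutually exclusive classes.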

Figure 2 Overview of machine learning and artificial intelligence applications for host–pathogen interaction research. (A) Schematic representation of machine learning applications for genetic and molecular data. (B) Schematic representation of machine learning applications for image data. (C) Schematic representation of machine learning applications for language data. Gray parentheses separate the respective downstream tasks.

Host–Pathogen Interactions Analysis from Genetic and Molecular Data

Perhaps one of the most direct applications of AI and ML for HPIs is to reveal patterns on the level of host and pathogen molecules and genes. Genetic or molecular information is typically represented as a sequence of single characters (Figure 2A). Small molecule information is often represented using the simplified molecular-input line-entry system (SMILES).Citation27 In such a setting ML can assist in RNA and DNA accessibility analysis, transcription analysis (reviewed in Citation28), prediction of protein–protein interactions,Citation11–Citation13,Citation29 as well as sequence-based host organism or receptor prediction.Citation14 Notably, in well-defined tasks simple ML algorithms like Random Forest (RF)Citation30 classification, Multilayer Perceptron (MLP) or kernel-based support vector machines (SVM)Citation31 perform remarkably well. For example, Karabulut et al show that on the task of adenovirus infection genus prediction a kernel-based SVM reaches an F1 score of 0.96 and an area under the receiver operating characteristics curve (AUC) of 0.89, with RF and MLP algorithms trailing remarkably close.Citation29 In such cases, more advanced algorithms like DL are unlikely to deliver a significant further improvement. However, when applied beyond the specific data set they were trained on, DL algorithms may deliver a boost in generalization.
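The two metrics quoted above can be computed as in the following sketch. The labels and classifier scores are made-up toy values, not data from the cited study, and the AUC is computed through its rank interpretation: the probability that a randomly chosen positive example receives a higher score than a randomly chosen negative one.

```python
def f1_score(y_true, y_pred):
    """Harmonic mean of precision and recall for the positive class (label 1)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def roc_auc(y_true, scores):
    """AUC as the fraction of positive/negative pairs ranked correctly (ties count 0.5)."""
    positives = [s for t, s in zip(y_true, scores) if t == 1]
    negatives = [s for t, s in zip(y_true, scores) if t == 0]
    if not positives or not negatives:
        raise ValueError("AUC needs at least one positive and one negative example")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in positives for n in negatives)
    return wins / (len(positives) * len(negatives))


# Toy ground truth and hypothetical classifier scores.
y_true = [1, 1, 1, 0, 0, 0]
scores = [0.9, 0.8, 0.4, 0.6, 0.3, 0.1]
y_pred = [1 if s >= 0.5 else 0 for s in scores]  # threshold chosen arbitrarily

f1 = f1_score(y_true, y_pred)
auc = roc_auc(y_true, scores)
```

F1 depends on the chosen decision threshold, whereas AUC summarizes ranking quality across all thresholds, which is why the two numbers are usually reported together.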

Identification of genetic variations in either host or pathogen genomes conferring higher pathogenicity may also be improved using DL.Citation28 Other examples of successful ML application for HPIs include base calling and SNP analysisCitation32 and clinical metagenomics.Citation33 Algorithms of choice most commonly include recurrent neural networks (RNN), for example the architecture known as long short-term memory (LSTM) RNN.Citation23 In some cases, more specialized convolutional neural networks (CNN) are employedCitation34 or even a consecutive combination of CNN and LSTM.Citation35 Traditionally, RNN (and sometimes CNN) architectures have dominated the algorithmic landscape for sequence-based analysis, demonstrating state-of-the-art performance. However, the recently introduced transformer architecture,Citation26 which currently dominates natural language ML, is slowly entering the field.Citation36 Yet the need for very large training data sets remains the biggest hurdle for transformers to overcome in the HPI domain.
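Sequence models such as those named above operate on numbers rather than letters, so a character sequence is first encoded numerically. The sketch below shows two common encodings, one-hot vectors and k-mer counts; the example sequences and the choice of k = 3 are arbitrary assumptions for illustration.

```python
from collections import Counter

NUCLEOTIDES = "ACGT"


def one_hot(sequence):
    """Encode a DNA string as rows of four 0/1 values; unknown bases (e.g., N) become all zeros."""
    return [[1 if base == n else 0 for n in NUCLEOTIDES] for base in sequence.upper()]


def kmer_counts(sequence, k=3):
    """Count all overlapping substrings of length k."""
    seq = sequence.upper()
    return Counter(seq[i:i + k] for i in range(len(seq) - k + 1))


encoded = one_hot("ACGTN")
counts = kmer_counts("ATGATGA", k=3)
```

One-hot matrices preserve position and are a typical input for CNN or RNN layers, while k-mer counts discard position and are more often paired with classical algorithms such as RF or SVM.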

Beyond simple pattern recognition, AI and ML algorithms trained on a large number of examples may be used to generate putative new molecules. This approach is showing promising results in the search for novel antivirals. Beck and colleagues, for example, identified prospective antivirals for the pandemic SARS-CoV2 by applying a DL model of drug-target interaction to a large number of commercially available antivirals. For this, they developed the Molecule Transformer-Drug Target Interaction architecture. Using this technique, the authors identified a human immunodeficiency virus drug as a potential candidate.Citation37

Host–Pathogen Interactions Analysis from Image Data

HPIs may be observed visually or using digital microscopy. On a subcellular level, microscopic imaging may be employed to capture image-based data of individual virus particle interactions with host-cell proteins.Citation38,Citation39 These data are typically obtained using fluorescence light microscopy at high magnification (e.g., 64×–100×), super-resolution microscopyCitation38,Citation40,Citation41 or electron microscopy.Citation42–Citation44 Various ML techniques may be employed to analyze subcellular HPI data, ranging from support vector machinesCitation45 to DL.Citation46–Citation50

Infection manifestation on a single-cell level is typically hallmarked by the onset of cytopathic effect (CPE).Citation51,Citation52 Synchronized with virus entry, uncoating, and replication through a virus genetic program, virus-induced CPE involves dramatic changes of cell morphology.Citation51,Citation52 These include, among others, cell rounding and swelling, emergence of focal patterns, cytoplasmic vacuolization,Citation53 pyknosis (nuclear condensation),Citation51 and syncytia formation,Citation54 and may be linked with pathogen-related cell death,Citation51 apoptosisCitation55 or motility.Citation56 The HPIs occurring on the molecular level drive CPE. CPE can be observed in cell culture using conventional light microscopy techniques without specific labeling at a moderate magnification (e.g., 5×–20×). Its manifestation often differs substantially between pathogens, cell types, and multiplicities of infection (MOI).Citation57

Downstream tasks (Figure 2B) for which ML is employed on such data typically include pathogen image segmentationCitation49 (often using a variant of the U-Net architectureCitation58), classification of HPI events or viruses from full or cropped fields of view,Citation47,Citation48,Citation50 pathogen object detectionCitation59 or infection detection.Citation60 Other examples include understanding structure and function relationships within pathogensCitation40 or time-lapse analysis.Citation45,Citation60

Being strongly related to the field of computer vision, HPI image analysis remains, thus far, dominated by CNN-based DL algorithms. Unsurprisingly, in pathogen image classification tasks both shallower and deeper CNNs outperform conventional RF and MLP algorithms in metrics like F1, often by more than 30–40%.Citation47,Citation60
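The building block that gives CNNs their advantage on images is the convolutional filter, which slides over the image and responds to a local pattern wherever it occurs. The sketch below applies one hand-written filter followed by a ReLU non-linearity; the tiny image and the filter weights are illustrative assumptions, whereas in a real CNN the weights are learned from data.

```python
def conv2d_valid(image, kernel):
    """Slide the kernel over the image without padding (cross-correlation, as in DL libraries)."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [[sum(image[r + i][c + j] * kernel[i][j]
                 for i in range(kh) for j in range(kw))
             for c in range(out_w)]
            for r in range(out_h)]


def relu(feature_map):
    """Set negative responses to zero."""
    return [[max(0.0, value) for value in row] for row in feature_map]


# A 4x4 toy micrograph: dark left half, bright right half.
image = [[0, 0, 1, 1],
         [0, 0, 1, 1],
         [0, 0, 1, 1],
         [0, 0, 1, 1]]
# Hypothetical 2x2 filter that responds to a dark-to-bright vertical edge.
kernel = [[-1, 1],
          [-1, 1]]

feature_map = relu(conv2d_valid(image, kernel))
```

The filter responds only in the middle column, where the edge lies. Because the same weights are reused at every position, the number of trainable parameters stays small compared with a fully connected layer on the same image.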

Host–Pathogen Interactions Analysis from the Language Data in the Scientific Publications

ML for natural language processing (NLP) has recently seen an incredible boost in performance through the introduction of transformer-based models.Citation26 Specifically, in their work Devlin et al reported that the BERT transformer model significantly outperformed a bidirectional LSTM (state-of-the-art at that time) on the General Language Understanding Evaluation (GLUE)Citation61 benchmark, with average GLUE scores of 82 and 71, respectively. Transformer models leverage large text corpora, such as BookCorpusCitation62 or the English Wikipedia data set, and high expressive capacity to define the new state-of-the-art performance on a plethora of NLP tasks. These tasks include text classification, named entity recognition (NER), semantic text similarity (STS), text summarization, question answering (QA), reading comprehension, knowledge discovery (KD) and mapping, and others (reviewed in Citation63). A further boost in performance in the novel transformer architectures is achieved through the multi-headed attention mechanism.Citation24 Building upon it through transfer learning (i.e., repurposing a pretrained model), the general-purpose deep bidirectional transformer model was fine-tuned by Lee and colleagues to the domain of biomedical research texts.Citation64 This work, in turn, sparked a surge in biomedical NLP research.
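The core operation of the attention mechanism mentioned above, scaled dot-product attention, can be sketched in a few lines. This is a single attention head without the learned projection matrices of a full transformer, and the query, key and value vectors are arbitrary toy values; multi-headed attention runs several such heads in parallel on different learned projections and concatenates their outputs.

```python
import math


def softmax(values):
    """Normalize scores to weights that sum to 1."""
    largest = max(values)
    exps = [math.exp(v - largest) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


def scaled_dot_product_attention(queries, keys, values):
    """For each query, return a weighted average of the values.

    Weights are the softmax of the query-key dot products divided by sqrt(d_k).
    """
    d_k = len(keys[0])
    outputs = []
    for q in queries:
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d_k) for k in keys]
        weights = softmax(scores)
        outputs.append([sum(w * v[j] for w, v in zip(weights, values))
                        for j in range(len(values[0]))])
    return outputs


# One query that matches the first key more closely than the second.
queries = [[1.0, 0.0]]
keys = [[1.0, 0.0], [0.0, 1.0]]
values = [[10.0, 0.0], [0.0, 10.0]]

attended = scaled_dot_product_attention(queries, keys, values)
```

The output is pulled towards the value whose key best matches the query, which is how each token in a sentence can draw information from the most relevant other tokens.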

With respect to HPI research, until a few years ago adoption of this novel methodology was limited by the availability of data sets large enough to warrant a successful domain adaptation. However, amid the global SARS-CoV2 pandemic Wang and colleagues constructed the so-called COVID-19 open research data set (CORD-19).Citation65 This data set, in turn, inspired a plethora of HPI analysis approaches using the language data in scientific publications. A great number of NLP downstream tasks have been attempted on this and similar data sets in the past 12–18 months alone (Figure 2C).

Specifically, Köksal et al proposed a search engine approach to finding protein-compound pairs in COVID-19 literature from the CORD-19 corpus.Citation66 Wang and co-authors have formulated a NER task data set that covers 75 detailed entity types.Citation67 These types include biomedical entities like genes, chemicals and diseases, as well as entity types related explicitly to SARS-CoV2 and COVID-19 research, including coronaviruses, viral proteins, materials, evolution, immune responses and substrates. To capture subtle domain-specific similarities in the CORD-19 data set, Guo et al formulated an STS data set.Citation68 For the KD task on the CORD-19 data set, Tam and co-authors proposed a transformer-based target-query method they named Transformer Query-Target Knowledge Discovery (TEND).Citation69 Reddy and colleagues proposed a QA task domain adaptation for COVID-19 related questionsCitation70 using a combination of CORD-19 and several COVID-19 QA data sets.Citation71–Citation73 Their data suggest that pretrained models may successfully be fine-tuned for the HPI domain, gaining 4–7% performance over the baseline. Notably, many of these recent approaches remain to be peer-reviewed, and the impact these techniques will have on SARS-CoV2 and HPI research remains to be demonstrated.
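To make the STS task concrete, the sketch below scores sentence pairs with cosine similarity over word counts. This is a deliberately simple lexical baseline, not the transformer-based approach used in the cited works, and the example sentences are invented for illustration.

```python
import math
from collections import Counter


def bag_of_words(text):
    """Lower-case the text and count whitespace-separated tokens."""
    return Counter(text.lower().split())


def cosine_similarity(a, b):
    """Cosine of the angle between two sparse count vectors."""
    dot = sum(a[token] * b[token] for token in a.keys() & b.keys())
    norm_a = math.sqrt(sum(count * count for count in a.values()))
    norm_b = math.sqrt(sum(count * count for count in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


sentence_1 = "spike protein binds the ACE2 receptor"
sentence_2 = "the spike protein binds ACE2"
sentence_3 = "patients reported mild fever"

similarity_related = cosine_similarity(bag_of_words(sentence_1), bag_of_words(sentence_2))
similarity_unrelated = cosine_similarity(bag_of_words(sentence_1), bag_of_words(sentence_3))
```

A lexical baseline like this fails when two sentences express the same finding in different words, which is exactly the kind of subtle domain-specific similarity that a dedicated STS data set is meant to help models learn.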

Conclusion

Prior to the surge of AI-community interest in the biomedical domain, ML techniques had been only sporadically applied to the analysis of interactions between pathogens and their hosts (reviewed in Citation74). However, in the wake of the SARS-CoV2 pandemic, the application of ML and AI techniques to HPI analysis is seeing ever-growing interest. Transcending the status of a niche application of novel AI algorithms, pathogen and HPI research has found itself in the spotlight of AI researchers' attention. This has manifested in a variety of fundamentally new approaches to analyzing the molecular, image-based and, more recently, language-based HPI data mentioned in this review. Admittedly, the knowledge gap between AI and HPI research remains great. This is especially evident in the case of language-based data. Yet, given the sheer effort from both fields to bridge this gap, we may see these efforts come to fruition in the not-so-distant future.

The lack of large HPI data sets remains one of the main hurdles for further penetration of AI and ML into the HPI field. However, recent years have seen a significant improvement in this respect. In the case of image-based data, the advent of resources like the BioImage Archive,Citation75 data-dedicated journals and social coding platforms has fostered the deposition of specialized image data sets, including those focused on HPI.Citation46,Citation76 The influence of CORD-19Citation65 on the adoption of NLP techniques in HPI research is self-evident. Perhaps it is access to open research data sets akin to CORD-19, together with image-based data repositories, that will become the missing piece for AI and ML to become a viral case in HPI research.

Abbreviations

ACE-2, angiotensin-converting enzyme 2; AI, artificial intelligence; AUC, area under receiver operating characteristics curve; COVID-19, coronavirus disease 2019; CORD-19, COVID-19 open research data set; CT, computed tomography; DL, deep learning; DNN, deep artificial neural networks; EM, electron microscopy; GLUE, General Language Understanding Evaluation; HCoV, human coronaviruses; HPI, host–pathogen interactions; KD, knowledge discovery; ML, machine learning; NLP, natural language processing; NER, named entity recognition; SARS-CoV2, severe acute respiratory syndrome coronavirus 2; STS, semantic text similarity; TEND, transformer query-target knowledge discovery; QA, question answering.

Disclosure

The author reports no conflicts of interest in this work.

References

  • Shortridge K. Pandemic influenza: a zoonosis? Semin Respir Infect. 1992;7:11–25. PMID: 1609163
  • Hahn BH, Shaw GM, De KM, Sharp PM. AIDS as a zoonosis: scientific and public health implications. Science. 2000;287(5453):607–614. doi:10.1126/science.287.5453.607. PMID: 10649986
  • Patz JA, Graczyk TK, Geller N, Vittor AY. Effects of environmental change on emerging parasitic diseases. Int J Parasitol. 2000;30(12–13):1395–1405. doi:10.1016/S0020-7519(00)00141-7. PMID: 11113264
  • Patz JA, Reisen WK. Immunology, climate change and vector-borne diseases. Trends Immunol. 2001;22(4):171–172. doi:10.1016/S1471-4906(01)01867-1. PMID: 11274908
  • Casadevall A, Pirofski LA. Host-pathogen interactions: basic concepts of microbial commensalism, colonization, infection, and disease. Infect Immun. 2000;68(12):6511–6518. doi:10.1128/IAI.68.12.6511-6518.2000. PMID: 11083759
  • V’kovski P, Kratzel A, Steiner S, Stalder H, Thiel V. Coronavirus biology and replication: implications for SARS-CoV-2. Nat Rev Microbiol. 2020;19:155–170. PMID: 33116300
  • Yamauchi Y, Helenius A. Virus entry at a glance. J Cell Sci. 2013;126(6):1289–1295. PMID: 23641066
  • Lau SK, Lee P, Tsang AK, et al. Molecular epidemiology of human coronavirus OC43 reveals evolution of different genotypes over time and recent emergence of a novel genotype due to natural recombination. J Virol. 2011;85(21):11325–11337. doi:10.1128/JVI.05512-11. PMID: 21849456
  • Gaunt ER, Hardie A, Claas EC, Simmonds P, Templeton KE. Epidemiology and clinical presentations of the four human coronaviruses 229E, HKU1, NL63, and OC43 detected over 3 years using a novel multiplex real-time PCR method. J Clin Microbiol. 2010;48(8):2940–2947. doi:10.1128/JCM.00636-10. PMID: 20554810
  • Guruprasad L. Human coronavirus spike protein-host receptor recognition. Prog Biophys Mol Biol. 2020.
  • Zheng N, Wang K, Zhan W, Deng L. Targeting virus-host protein interactions: feature extraction and machine learning approaches. Curr Drug Metab. 2019;20(3):177–184. doi:10.2174/1389200219666180829121038. PMID: 30156155
  • Cuesta-Astroz Y, Oliveira G. Computational and experimental approaches to predict host–parasite protein–protein interactions. In: Computational Cell Biology. New York, NY: Humana Press; 2018:153–173.
  • Liu-Wei W, Kafkas S, Chen J, Dimonaco NJ, Tegnér J, Hoehndorf R. DeepViral: prediction of novel virus-host interactions from protein sequences and infectious disease phenotypes. 2021.
  • Mock F, Viehweger A, Barth E, Marz M. VIDHOP, viral host prediction with Deep Learning. Bioinformatics. 2021;37(3):318–325. doi:10.1093/bioinformatics/btaa705. PMID: 32777818
  • Fisch DH, Yakimovich A, Clough B, et al. An Artificial Intelligence Workflow for Defining Host-Pathogen Interactions. bioRxiv. 2018:408450.
  • Mitchell TM. Machine learning. 1997.
  • Tarca AL, Carey VJ, Chen XW, Romero R, Drăghici S. Machine learning and its applications to biology. PLoS Comput Biol. 2007;3(6):e116. doi:10.1371/journal.pcbi.0030116. PMID: 17604446
  • Sommer C, Gerlich DW. Machine learning in cell biology–teaching computers to recognize phenotypes. J Cell Sci. 2013;126(24):5529–5539. PMID: 24259662
  • Sommer C, Straehle C, Kothe U, Hamprecht FA. ilastik: interactive learning and segmentation toolkit. Paper presented at: 2011 IEEE International Symposium on Biomedical Imaging: From Nano to Macro; 2011; IEEE.
  • LeCun Y, Bengio Y, Hinton G. Deep learning. Nature. 2015;521(7553):436. doi:10.1038/nature14539. PMID: 26017442
  • Bengio Y, Courville A, Vincent P. Representation learning: a review and new perspectives. IEEE Trans Pattern Anal Mach Intell. 2013;35(8):1798–1828. doi:10.1109/TPAMI.2013.50. PMID: 23787338
  • LeCun Y, Bengio Y. Convolutional networks for images, speech, and time series. In: Arbib MA, editor. The Handbook of Brain Theory and Neural Networks. MIT Press; 1995:3361.
  • Hochreiter S, Schmidhuber J. Long short-term memory. Neural Comput. 1997;9(8):1735–1780. doi:10.1162/neco.1997.9.8.1735. PMID: 9377276
  • Vaswani A, Shazeer N, Parmar N, et al. Attention is all you need. Adv Neural Inf Process Syst. 2017;30:5998–6008.
  • Fedus W, Zoph B, Shazeer N. Switch transformers: scaling to trillion parameter models with simple and efficient sparsity. arXiv Preprint arXiv:2101.03961. 2021.
  • Devlin J, Chang MW, Lee K, Toutanova K. BERT: Pre-training of deep bidirectional transformers for language understanding. arXiv Preprint arXiv:1810.04805. 2018.
  • Weininger D. SMILES, a chemical language and information system. 1. Introduction to methodology and encoding rules. J Chem Inf Comput Sci. 1988;28(1):31–36.
  • Zou J, Huss M, Abid A, Mohammadi P, Torkamani A, Telenti A. A primer on deep learning in genomics. Nat Genet. 2019;51(1):12–18. doi:10.1038/s41588-018-0295-5. PMID: 30478442
  • Karabulut OC, Karpuzcu BA, Türk E, Ibrahim AH, Süzek BE. ML-AdVInfect: a machine-learning based adenoviral infection predictor. Front Mol Biosci. 2021;8. doi:10.3389/fmolb.2021.647424
  • Ho TK. Random decision forests. Paper presented at: Proceedings of 3rd International Conference on Document Analysis and Recognition; 1995.
  • Crammer K, Singer Y. On the algorithmic implementation of multiclass kernel-based vector machines. J Mach Learn Res. 2001;2:265–292.
  • Poplin R, Chang P-C, Alexander D, et al. A universal SNP and small-indel variant caller using deep neural networks. Nat Biotechnol. 2018;36(10):983–987. doi:10.1038/nbt.4235. PMID: 30247488
  • Chiu CY, Miller SA. Clinical metagenomics. Nat Rev Genet. 2019;20(6):341. doi:10.1038/s41576-019-0113-7. PMID: 30918369
  • Tampuu A, Bzhalava Z, Dillner J, Vicente VR, Melcher U. ViraMiner: deep learning on raw DNA sequences for identifying viral genomes in human samples. PLoS One. 2019;14(9):e0222271. doi:10.1371/journal.pone.0222271. PMID: 31509583
  • Veltri D, Kamath U, Shehu A. Deep learning improves antimicrobial peptide recognition. Bioinformatics. 2018;34(16):2740–2747. doi:10.1093/bioinformatics/bty179. PMID: 29590297
  • Zhang Y, Lin J, Zhao L, Zeng X, Liu X. A novel antibacterial peptide recognition algorithm based on BERT. Brief Bioinform. 2021.
  • Beck BR, Shin B, Choi Y, Park S, Kang K. Predicting commercially available antiviral drugs that may act on the novel coronavirus (SARS-CoV-2) through a drug-target interaction deep learning model. Comput Struct Biotechnol J. 2020;18:784–790. doi:10.1016/j.csbj.2020.03.025. PMID: 32280433
  • Gray RD, Albrecht D, Beerli C, et al. Nanoscale polarization of the entry fusion complex of vaccinia virus drives efficient fusion. Nat Microbiol. 2019;4(10):1636–1644. doi:10.1038/s41564-019-0488-4. PMID: 31285583
  • Wang I, Burckhardt CJ, Yakimovich A, Greber UF. Imaging, tracking and computational analyses of virus entry and egress with the cytoskeleton. Viruses. 2018;10(4):166. doi:10.3390/v10040166
  • Gray RD, Beerli C, Pereira PM, et al. VirusMapper: open-source nanoscale mapping of viral architecture through super-resolution microscopy. Sci Rep. 2016;6:29132. doi:10.1038/srep29132. PMID: 27374400
  • Yakimovich A, Huttunen M, Samolej J, et al. Mimicry embedding for advanced neural network training of 3D biomedical micrographs. bioRxiv. 2019:820076.
  • Dales S, Siminovitch L. The development of vaccinia virus in Earle’s L strain cells as examined by electron microscopy. J Biophys Biochem Cytol. 1961;10(4):475–503. doi:10.1083/jcb.10.4.475. PMID: 13719413
  • Dales S, Eggers HJ, Tamm I, Palade GE. Electron microscopic study of the formation of poliovirus. Virology. 1965;26:379–389. doi:10.1016/0042-6822(65)90001-2. PMID: 14319710
  • Nii S, Morgan C, Rose HM. Electron microscopy of herpes simplex virus: II. Sequence of development. J Virol. 1968;2(5):517–536. doi:10.1128/jvi.2.5.517-536.1968. PMID: 4301317
  • Wang I-H, Burckhardt CJ, Yakimovich A, Morf MK, Greber UF. The nuclear export factor CRM1 controls juxta-nuclear microtubule-dependent virus transport. J Cell Sci. 2017;130(13):2185–2195. PMID: 28515232
  • Georgi F, Kuttler F, Murer L, et al. A high-content image-based drug screen of clinical compounds against cell transmission of adenovirus. Scientific Data. 2020;7(1):1–12. doi:10.1038/s41597-020-00604-0. PMID: 31896794
  • Fisch D, Yakimovich A, Clough B, et al. Defining host–pathogen interactions employing an artificial intelligence workflow. eLife. 2019;8:e40560. doi:10.7554/eLife.40560. PMID: 30744806
  • Nanni L, De Luca E, Facin ML, Maguolo G. Deep learning and handcrafted features for virus image classification. J Imag. 2020;6(12):143. doi:10.3390/jimaging6120143
  • Matuszewski DJ, Sintorn I-M. Reducing the u-net size for practical scenarios: virus recognition in electron microscopy images. Comput Methods Programs Biomed. 2019;178:31–39. doi:10.1016/j.cmpb.2019.05.026. PMID: 31416558
  • Zhang L, Yan WQ. Deep learning methods for virus identification from digital images. Paper presented at: 2020 35th International Conference on Image and Vision Computing New Zealand (IVCNZ); 2020.
  • Agol VI. Cytopathic effects: virus-modulated manifestations of innate immunity? Trends Microbiol. 2012;20(12):570–576. doi:10.1016/j.tim.2012.09.003. PMID: 23072900
  • Mocarski ES, Upton JW, Kaiser WJ. Viral infection and the evolution of caspase 8-regulated apoptotic and necrotic death pathways. Nat Rev Immunol. 2012;12(2):79. doi:10.1038/nri3131
  • Shubin AV, Demidyuk IV, Komissarov AA, Rafieva LM, Kostrov SV. Cytoplasmic vacuolization in cell death and survival. Oncotarget. 2016;7(34):55863. doi:10.18632/oncotarget.10150. PMID: 27331412
  • Suchman E, Blair C. Cytopathic effects of viruses protocols. 2007.
  • Shen Y, Shenk TE. Viruses and apoptosis. Curr Opin Genet Dev. 1995;5(1):105–111. doi:10.1016/S0959-437X(95)90061-6. PMID: 7749317
  • Beerli C, Yakimovich A, Kilcher S, et al. Vaccinia virus hijacks EGFR signalling to enhance virus spread through rapid and directed infected cell motility. Nat Microbiol. 2018;4:216–225. PMID: 30420785
  • González-Sánchez H, Monsiváis-Urenda A, Salazar-Aldrete C, et al. Effects of cytomegalovirus infection in human neural precursor cells depend on their differentiation state. J Neurovirol. 2015;21(4):346–357. doi:10.1007/s13365-015-0315-5. PMID: 25851778
  • Ronneberger O, Fischer P, Brox T. U-net: convolutional networks for biomedical image segmentation. Paper presented at: International Conference on Medical Image Computing and Computer-Assisted Intervention; 2015; Cham.
  • Ito E, Sato T, Sano D, Utagawa E, Kato T. Virus particle detection by convolutional neural network in transmission electron microscopy images. Food Environ Virol. 2018;10(2):201–208. doi:10.1007/s12560-018-9335-7. PMID: 29352405
  • Andriasyan V, Yakimovich A, Petkidis A, et al. Microscopy deep learning predicts virus infections and reveals mechanics of lytic-infected cells. iScience. 2021;24(6):102543. doi:10.1016/j.isci.2021.102543. PMID: 34151222
  • Wang W, Yan M, Wu C. Multi-granularity hierarchical attention fusion networks for reading comprehension and question answering. arXiv Preprint arXiv:1811.11934. 2018.
  • Zhu Y, Kiros R, Zemel R, et al. Aligning books and movies: towards story-like visual explanations by watching movies and reading books. Paper presented at: Proceedings of the IEEE International Conference on Computer Vision; 2015.
  • Gillioz A, Casas J, Mugellini E, Abou Khaled O. Overview of the Transformer-based Models for NLP Tasks. Paper presented at: 2020 15th Conference on Computer Science and Information Systems (FedCSIS); 2020.
  • Lee J, Yoon W, Kim S, et al. BioBERT: a pre-trained biomedical language representation model for biomedical text mining. Bioinformatics. 2020;36(4):1234–1240. PMID: 31501885
  • Wang LL, Lo K, Chandrasekhar Y, et al. CORD-19: the covid-19 open research dataset. ArXiv. 2020.
  • Köksal A, Dönmez H, Özçelik R, Ozkirimli E, Özgür A. Vapur: a search engine to find related protein–compound pairs in COVID-19 literature. arXiv Preprint arXiv:2009.02526. 2020.
  • Wang X, Song X, Li B, Guan Y, Han J. Comprehensive named entity recognition on cord-19 with distant or weak supervision. arXiv Preprint arXiv:2003.12218. 2020.
  • Guo X, Mirzaalian H, Sabir E, Jaiswal A, Abd-Almageed W. Cord19sts: covid-19 semantic textual similarity dataset. arXiv Preprint arXiv:2007.02461. 2020.
  • Tam LK, Wang X, Xu D. Transformer query-target knowledge discovery (TEND): drug discovery from CORD-19. arXiv Preprint arXiv:2012.04682. 2020.
  • Reddy RG, Iyer B, Sultan MA, et al. End-to-end QA on COVID-19: domain adaptation with synthetic training. arXiv Preprint arXiv:2012.01414. 2020.
  • Möller T, Reina A, Jayakumar R, Pietsch M. COVID-QA: a question answering dataset for COVID-19. Paper presented at: Proceedings of the 1st Workshop on NLP for COVID-19 at ACL 2020; 2020.
  • Tang R, Nogueira R, Zhang E, et al. Rapidly bootstrapping a question answering dataset for COVID-19. arXiv Preprint arXiv:2004.11339. 2020.
  • Lee J, Yi SS, Jeong M, et al. Answering questions on covid-19 in real-time. arXiv Preprint arXiv:2006.15830. 2020.
  • Sen R, Nayak L, De RK. A review on host–pathogen interactions: classification and prediction. Eur J Clin Microbiol Infect Dis. 2016;35(10):1581–1599. doi:10.1007/s10096-016-2716-7. PMID: 27470504
  • Ellenberg J, Swedlow JR, Barlow M, et al. A call for public archives for biological image data. Nat Methods. 2018;15(11):849–854. doi:10.1038/s41592-018-0195-8. PMID: 30377375
  • Yakimovich A, Huttunen M, Samolej J, et al. Mimicry embedding facilitates advanced neural network training for image-based pathogen detection. mSphere. 2020;5(5):e00836–20. doi:10.1128/mSphere.00836-20. PMID: 32907956