Abstract
Perturbation of the transcriptome in patients with viral infection is a recurrent theme that affects symptoms and mortality, yet a detailed understanding of the relevant transcriptomic changes and the identification of robust biomarkers remain incomplete. In this study, we manually collected 23 datasets covering 6,197 blood transcriptomes across 16 types of respiratory virus infections. We applied a comprehensive systems biology approach, starting from whole-blood transcriptomes and combining multilevel bioinformatics analyses, to characterize expression, functional pathways, and protein-protein interaction (PPI) networks, and to identify robust biomarkers and disease comorbidities. We identified robust gene markers of infection with different viruses that accurately classify normal and infected individuals in both training and validation cohorts. The biological processes (BP) associated with different viruses were highly similar and were enriched in infection and immune response pathways. Network-based analyses revealed that a variety of viral infections were associated with nervous system diseases, neoplasms and metabolic diseases, and were significantly correlated with brain tissues. In summary, our manually collected transcriptomes and comprehensive analyses reveal key molecular markers and disease comorbidities in the process of viral infection, which could provide a valuable theoretical basis for preventing future public health events caused by respiratory virus infections.
Authors’ contributions
Yongsheng Li, Xia Li and Yunpeng Zhang conceived and designed the study and developed the methodology. Yongsheng Li, Xia Li, Yunpeng Zhang and Jing Guo wrote the first draft of the manuscript. Jing Guo, Ya Zhang, Yueying Gao and Si Li contributed to material preparation, data collection, analysis and administration. Ya Zhang, Gang Xu, Zhanyu Tian and Qi Xu contributed to visualization, data curation, validation and research summary. Jing Guo and Ya Zhang contributed equally to this study and share first authorship. All authors read and approved the final manuscript.
Disclosure statement
No potential conflict of interest was reported by the author(s).
Data availability statement
Public gene expression profiles used in this work can be acquired from Gene Expression Omnibus (GEO, http://www.ncbi.nlm.nih.gov/geo/), ArrayExpress (https://www.ebi.ac.uk/biostudies/arrayexpress), European Genome-phenome Archive (EGA, https://ega-archive.org/) and BioSample (http://www.ncbi.nlm.nih.gov/biosample/). The datasets used and/or analyzed during the present study are available from the corresponding authors upon reasonable request.