Abstract
Inferring and characterizing gene co-expression networks has led to important insights into the molecular mechanisms of complex diseases. Most co-expression analyses to date have been performed on gene expression data collected from bulk tissues whose cell type compositions vary across samples. As a result, the co-expression estimates offer only an aggregated view of the underlying gene regulation and can be confounded by heterogeneity in cell type composition, failing to reveal gene coordination that may differ across cell types. In this article, we introduce a flexible framework for estimating cell-type-specific gene co-expression networks from bulk sample data, without making specific assumptions on the distributions of gene expression profiles in different cell types. We develop a novel sparse least squares estimator, referred to as CSNet, that is efficient to implement and has good theoretical properties. Using CSNet, we analyzed the bulk gene expression data from a cohort study on Alzheimer’s disease and identified previously unknown cell-type-specific co-expressions among Alzheimer’s disease risk genes, suggesting cell-type-specific disease mechanisms. Supplementary materials for this article are available online.
Acknowledgments
We thank the ROSMAP project for permission, requested at https://www.radc.rush.edu, to access the project's bulk and single-nucleus RNA-seq data. We are grateful to the Editor, the Associate Editor, and three anonymous referees for their insightful comments, which have substantially improved the quality, presentation, and reproducibility of the manuscript. We also thank Dr. Jiawei Wang at Yale University for helpful discussions on the real data analysis.
Disclosure Statement
The authors report there are no competing interests to declare.