Applications and Case Studies Discussion

Exponential-Family Embedding With Application to Cell Developmental Trajectories for Single-Cell RNA-Seq Data

Pages 457-470 | Received 03 Sep 2019, Accepted 29 Nov 2020, Published online: 22 Mar 2021
 

Abstract

Scientists often embed cells into a lower-dimensional space when studying single-cell RNA-seq data to improve downstream analyses such as developmental trajectory analyses, but the statistical properties of such nonlinear embedding methods are often not well understood. In this article, we develop the exponential-family SVD (eSVD), a nonlinear embedding method for both cells and genes jointly with respect to a random dot product model using exponential-family distributions. Our estimator uses alternating minimization, which makes the method computationally efficient and enables us to prove its identifiability conditions and consistency and to provide statistically principled procedures for tuning it. All these qualities help advance the single-cell embedding literature, and we provide extensive simulations to demonstrate that the eSVD is competitive with other embedding methods. We apply the eSVD via Gaussian distributions where the standard deviations are proportional to the means to analyze a single-cell dataset of oligodendrocytes in mouse brains. Using the eSVD estimated embedding, we then investigate the cell developmental trajectories of the oligodendrocytes. While previous results are not able to distinguish the trajectories among the mature oligodendrocyte cell types, our diagnostics and results demonstrate that there are two major developmental trajectories that diverge at mature oligodendrocytes. Supplementary materials for this article, including a standardized description of the materials available for reproducing the work, are available as online supplementary materials.
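To make the idea of alternating minimization over a dot product model concrete, the following is a minimal sketch and not the eSVD itself. It assumes a unit-variance Gaussian model (rather than the curved Gaussian used in the article), so that each alternating step reduces to an ordinary least-squares solve. The function name `alternating_min_gaussian` and its parameters are hypothetical and chosen only for illustration.

```python
import numpy as np

def alternating_min_gaussian(A, k, n_iter=50, seed=0):
    """Illustrative alternating minimization for A ~ X @ Y.T.

    Assumption: entries of A are Gaussian with unit variance and mean
    given by the dot product of a row embedding (cell) and a column
    embedding (gene). Under this simplified model the negative
    log-likelihood is 0.5 * ||A - X Y^T||_F^2 up to a constant.
    """
    rng = np.random.default_rng(seed)
    n, p = A.shape
    X = rng.standard_normal((n, k))  # row (cell) embeddings
    Y = rng.standard_normal((p, k))  # column (gene) embeddings
    losses = []
    for _ in range(n_iter):
        # Fix Y and minimize over X: solve Y @ X.T = A.T in least squares.
        X = np.linalg.lstsq(Y, A.T, rcond=None)[0].T
        # Fix X and minimize over Y: solve X @ Y.T = A in least squares.
        Y = np.linalg.lstsq(X, A, rcond=None)[0].T
        losses.append(0.5 * np.sum((A - X @ Y.T) ** 2))
    return X, Y, losses

if __name__ == "__main__":
    # Synthetic example: a noisy rank-2 matrix of 30 "cells" by 20 "genes".
    rng = np.random.default_rng(1)
    truth = rng.standard_normal((30, 2)) @ rng.standard_normal((20, 2)).T
    A = truth + 0.1 * rng.standard_normal(truth.shape)
    X, Y, losses = alternating_min_gaussian(A, k=2, n_iter=20)
    print(X.shape, Y.shape, losses[-1])
```

Because each step exactly minimizes the objective over one factor while the other is held fixed, the recorded loss cannot increase from one iteration to the next (up to floating-point error). For other exponential-family distributions the per-step subproblems generally have no closed form and would need an iterative solver; that part is not shown here.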

This article is part of the following collections:
Applications and Case Studies Discussion Articles

Supplementary Materials

The following describes the sections in the supplementary materials. In Appendix A, we discuss the publicly available code and data for reproducibility. In Appendix B, we discuss additional details of the eSVD, including its initialization, tuning procedure, usage for various exponential-family distributions, and high-level comparisons to other methods. In Appendix C, we formally describe the analysis pipelines we used when analyzing the oligodendrocytes in Sections 2 and 7. In Appendices D and E, we formalize the statistical theory for estimating the matrix of natural parameters and specialize our theory for the curved Gaussian distribution, respectively, as alluded to in Section 5. In Appendix F, we describe additional simulation setups and results, extending those described in Section 6. In Appendix G, we describe in detail our modifications to Slingshot and our method for constructing the uncertainty tube. In Appendix H, we describe our additional analysis results of the oligodendrocytes from Section 7, including results of highly informative genes and empirical results when using other methods to analyze this dataset, as well as a clustering analysis of a second single-cell dataset. Appendices I and J contain the proofs for all our theoretical results.

Additional information

Funding

This work was supported by National Science Foundation (NSF) grants DMS-2015492 and DMS-1553884, National Institute of Mental Health (NIMH) grants R37MH057881 and R01MH123184, and Simons Foundation Autism Research Initiative (SFARI) grants SF402281 and SF367561.
