Applications and Case Studies

Modeling Between-Study Heterogeneity for Improved Replicability in Gene Signature Selection and Clinical Prediction

Pages 1125-1138 | Received 29 Aug 2017, Accepted 17 Sep 2019, Published online: 29 Oct 2019
 

ABSTRACT

In the genomic era, identifying gene signatures associated with disease is of significant interest. Such signatures are often used to predict clinical outcomes in new patients and to aid clinical decision-making. However, recent studies have shown that gene signatures are often not replicable, which limits their generalizability and clinical applicability. To improve replicability, we introduce a novel approach that uses multiple datasets to select gene signatures whose effects are consistently nonzero while accounting for between-study heterogeneity. We build our model on rank-based quantities, which facilitates integration across different genomic datasets. A high-dimensional penalized generalized linear mixed model is used to select gene signatures and address data heterogeneity. We compare our method with commonly used strategies that select gene signatures while ignoring between-study heterogeneity. We provide asymptotic results justifying the performance of our method and demonstrate its advantage in the presence of heterogeneity through thorough simulation studies. Lastly, we motivate our method through a case study subtyping pancreatic cancer patients from four gene expression studies. Supplementary materials for this article, including a standardized description of the materials available for reproducing the work, are available as an online supplement.
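As a rough illustration of why rank-based quantities ease integration across datasets, the sketch below builds within-sample pairwise comparison indicators (1 if gene a is expressed above gene b in a sample, else 0). It assumes the rank-based quantities resemble top scoring pair (TSP) style indicators; the function name `tsp_indicators`, the toy data, and the second-platform transform are hypothetical and are not taken from the article or its supplementary code.

```python
import numpy as np


def tsp_indicators(expr, pairs):
    """Within-sample pairwise comparison features (hypothetical helper).

    expr  : (n_samples, n_genes) array of expression values
    pairs : list of (a, b) gene column indices
    Returns an (n_samples, n_pairs) 0/1 array, 1 where gene a > gene b.
    """
    expr = np.asarray(expr, dtype=float)
    a = [p[0] for p in pairs]
    b = [p[1] for p in pairs]
    return (expr[:, a] > expr[:, b]).astype(int)


# Toy data (assumption): 5 samples, 4 genes, positive expression values.
rng = np.random.default_rng(0)
study1 = rng.lognormal(size=(5, 4))

# The same samples as a hypothetical second platform might report them:
# a strictly increasing transform changes the scale but not the ordering
# of genes within each sample.
study2 = np.log2(study1 * 50.0 + 1.0)

pairs = [(0, 1), (2, 3), (0, 3)]
f1 = tsp_indicators(study1, pairs)
f2 = tsp_indicators(study2, pairs)

# The indicators agree even though the raw values are on different scales.
same = bool(np.array_equal(f1, f2))
print(same)
```

Because each feature depends only on the ordering of two genes within a sample, it is unaffected by monotone, sample-level differences in scale. Genuine between-study heterogeneity in effects is a separate issue, which the abstract addresses through the mixed model.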

Supplementary Materials

The online supplementary materials contain details on the alternative TSP screening procedure described in Section 7, supplementary figures pertaining to Section 7, and additional proofs pertaining to Section 5. R and Rcpp code are also provided to perform the simulations and real data analysis.

Additional information

Funding

This work was supported by the National Cancer Institute (under grant numbers R01-CA199064, P01-CA142538, and R01-CA193650) and the National Institute of General Medical Sciences (under grant numbers R01-GM070335 and R01-GM105785).
