Promoting Similarity of Sparsity Structures in Integrative Analysis With Penalization

Yuan Huang, Department of Biostatistics, Yale University, New Haven, CT

Qingzhao Zhang, Department of Mathematics, University of Chinese Academy of Sciences, Beijing, China

Sanguo Zhang, Department of Mathematics, University of Chinese Academy of Sciences, Beijing, China

Jian Huang, Department of Statistics and Actuarial Science, Iowa University, Iowa City, IA

Shuangge Ma, Department of Biostatistics, Yale University, New Haven, CT

ABSTRACT

For data with high-dimensional covariates but small sample sizes, the analysis of a single dataset often generates unsatisfactory results. The integrative analysis of multiple independent datasets provides an effective way of pooling information and outperforms single-dataset and several alternative multi-dataset methods. Under many scenarios, multiple datasets are expected to share common important covariates; that is, the corresponding models have similar sparsity structures. However, existing methods have no mechanism to promote the similarity of sparsity structures in integrative analysis. In this study, we consider penalized variable selection and estimation in integrative analysis. We develop an L₀-penalty-based method that explicitly promotes the similarity of sparsity structures. Computationally, it is realized using a coordinate descent algorithm. Theoretically, it has the selection and estimation consistency properties. Under a wide spectrum of simulation scenarios, its identification and estimation performance is comparable to or better than that of the alternatives. In the analysis of three lung cancer datasets with gene expression measurements, it identifies genes with sound biological implications and achieves satisfactory prediction performance. Supplementary materials for this article are available online.


Supplementary Materials

The supplementary file contains (S1) proofs for the theoretical results described in Section 3.2, (S2) additional numerical results, and (S3) details on estimation under the accelerated failure time model for right-censored data.

Funding

This work was supported by CA142774 and CA016359 from the NIH, 13CTJ001 and 13&ZD148 from the National Social Science Foundation of China, and the VA Cooperative Studies Program of the Department of Veterans Affairs, Office of Research and Development. Yuan Huang and Qingzhao Zhang contributed equally.

