Dimensional Data

Sequential Co-Sparse Factor Regression

Pages 814-825 | Received 01 Jun 2016, Published online: 16 Oct 2017
 

ABSTRACT

In multivariate regression models, a sparse singular value decomposition of the regression component matrix is appealing for reducing dimensionality and facilitating interpretation. However, the recovery of such a decomposition remains very challenging, largely due to the simultaneous presence of orthogonality constraints and co-sparsity regularization. By delving into the underlying statistical data-generation mechanism, we reformulate the problem as a supervised co-sparse factor analysis, and develop an efficient computational procedure, named sequential factor extraction via co-sparse unit-rank estimation (SeCURE), that completely bypasses the orthogonality requirements. At each step, the problem reduces to a sparse multivariate regression with a unit-rank constraint. Nicely, each sequentially extracted sparse and unit-rank coefficient matrix automatically leads to co-sparsity in its pair of singular vectors. Each latent factor is thus a sparse linear combination of the predictors and may influence only a subset of responses. The proposed algorithm is guaranteed to converge, and it ensures efficient computation even with incomplete data and/or when enforcing exact orthogonality is desired. Our estimators enjoy the oracle properties asymptotically; a non-asymptotic error bound further reveals some interesting finite-sample behaviors of the estimators. The efficacy of SeCURE is demonstrated by simulation studies and two applications in genetics. Supplementary materials for this article are available online.
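The sequential idea described in the abstract (fit one sparse unit-rank layer, remove its contribution from the responses, repeat) can be illustrated with a toy sketch. This is not the SeCURE algorithm and not the API of the `secure` package. It is a simplified alternating soft-thresholding heuristic written under several assumptions: the function names (`soft`, `lasso_cd`, `unit_rank_fit`, `sequential_fit`), the separate L1 penalties on the two vectors, the uniform starting value for the response-side vector, and the toy data are all hypothetical choices made for illustration.

```python
import math

def soft(x, lam):
    """Soft-thresholding: shrink x toward zero by lam."""
    if x > lam:
        return x - lam
    if x < -lam:
        return x + lam
    return 0.0

def lasso_cd(X, z, lam, sweeps=20):
    """Coordinate descent for min_a 0.5*||z - X a||^2 + lam*||a||_1."""
    n, p = len(X), len(X[0])
    a = [0.0] * p
    r = list(z)  # residual z - X a (a starts at zero)
    col_sq = [sum(X[i][j] ** 2 for i in range(n)) for j in range(p)]
    for _ in range(sweeps):
        for j in range(p):
            if col_sq[j] == 0.0:
                continue
            rho = sum(X[i][j] * r[i] for i in range(n)) + col_sq[j] * a[j]
            new = soft(rho, lam) / col_sq[j]
            if new != a[j]:
                for i in range(n):
                    r[i] -= X[i][j] * (new - a[j])
                a[j] = new
    return a

def unit_rank_fit(X, Y, lam, n_iter=30):
    """Heuristic sparse fit of Y ~ X a b^T with ||b||_2 = 1 (illustrative only)."""
    n, p, q = len(Y), len(X[0]), len(Y[0])

    def fit_a(b):
        # With ||b|| = 1, the a-step is a lasso of the projected response Y b on X.
        z = [sum(Y[i][j] * b[j] for j in range(q)) for i in range(n)]
        return lasso_cd(X, z, lam)

    b = [1.0 / math.sqrt(q)] * q  # naive start; an assumption of this sketch
    a = fit_a(b)
    for _ in range(n_iter):
        t = [sum(x * aj for x, aj in zip(row, a)) for row in X]  # factor X a
        raw = [soft(sum(t[i] * Y[i][j] for i in range(n)), lam) for j in range(q)]
        norm = math.sqrt(sum(v * v for v in raw))
        if norm == 0.0:
            return [0.0] * p, [0.0] * q  # nothing left to extract
        b = [v / norm for v in raw]
        a = fit_a(b)
    return a, b

def sequential_fit(X, Y, rank, lam):
    """Extract up to `rank` sparse unit-rank layers, deflating Y after each."""
    Yk = [row[:] for row in Y]
    layers = []
    for _ in range(rank):
        a, b = unit_rank_fit(X, Yk, lam)
        if not any(a):
            break
        layers.append((a, b))
        t = [sum(x * aj for x, aj in zip(row, a)) for row in X]
        Yk = [[Yk[i][j] - t[i] * b[j] for j in range(len(b))]
              for i in range(len(Yk))]
    return layers

# Toy data: 8 x 4 design with orthogonal +/-1 columns and a sparse rank-one truth.
X = [[(-1.0) ** bin(i & m).count("1") for m in (1, 2, 4, 7)] for i in range(8)]
a_true, b_true = [2.0, 0.0, -1.0, 0.0], [0.6, 0.0, 0.8]
Y = [[sum(x * aj for x, aj in zip(row, a_true)) * bj for bj in b_true] for row in X]

layers = sequential_fit(X, Y, rank=2, lam=0.1)
# Sum of the extracted layers a b^T, as a 4 x 3 coefficient matrix.
C_hat = [[sum(a[j] * b[k] for a, b in layers) for k in range(3)] for j in range(4)]
```

In this sketch each layer `(a, b)` gives a coefficient matrix `a b^T` whose zero rows correspond to predictors that do not enter the factor and whose zero columns correspond to responses the factor does not affect, which is the co-sparsity pattern the abstract refers to.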

Acknowledgments

Chen's research was partially supported by the National Science Foundation grant DMS-1613295 and the National Institutes of Health (NIH) grant U01-HL114494. The authors are grateful to the Editor, the Associate Editor, and the two referees for their valuable comments and suggestions, which have led to significant improvement of the article.

Supplementary Materials

The online supplementary materials include additional simulation results, a biclustering example using gene expression data, more results in the yeast cycle data analysis, details on handling incomplete data and exact orthogonality, and all the technical proofs. Implementations of the proposed methods are available in the R package secure (R Development Core Team 2017), which can be accessed at https://CRAN.R-project.org/package=secure.
