ABSTRACT
In multivariate regression models, a sparse singular value decomposition of the regression component matrix is appealing for reducing dimensionality and facilitating interpretation. However, the recovery of such a decomposition remains very challenging, largely due to the simultaneous presence of orthogonality constraints and co-sparsity regularization. By delving into the underlying statistical data-generation mechanism, we reformulate the problem as a supervised co-sparse factor analysis, and develop an efficient computational procedure, named sequential factor extraction via co-sparse unit-rank estimation (SeCURE), that completely bypasses the orthogonality requirements. At each step, the problem reduces to a sparse multivariate regression with a unit-rank constraint. Nicely, each sequentially extracted sparse and unit-rank coefficient matrix automatically leads to co-sparsity in its pair of singular vectors. Each latent factor is thus a sparse linear combination of the predictors and may influence only a subset of responses. The proposed algorithm is guaranteed to converge, and it ensures efficient computation even with incomplete data and/or when enforcing exact orthogonality is desired. Our estimators enjoy the oracle properties asymptotically; a non-asymptotic error bound further reveals some interesting finite-sample behaviors of the estimators. The efficacy of SeCURE is demonstrated by simulation studies and two applications in genetics. Supplementary materials for this article are available online.
Acknowledgments
Chen's research was partially supported by the National Science Foundation grant DMS-1613295 and the National Institutes of Health (NIH) grant U01-HL114494. The authors are grateful to the Editor, the Associate Editor, and the two referees for their valuable comments and suggestions, which have led to significant improvement of the article.
Supplementary Materials
The online supplementary materials include additional simulation results, a biclustering example using gene expression data, more results in the yeast cycle data analysis, details on handling incomplete data and exact orthogonality, and all the technical proofs. Implementations of the proposed methods are available in the R package secure (R Development Core Team Citation2017), which can be accessed at https://CRAN.R-project.org/package=secure.