
Variable screening in multivariate linear regression with high-dimensional covariates

Pages 241-253 | Received 01 Jan 2021, Accepted 09 Sep 2021, Published online: 06 Oct 2021

References

  • Anderson, T. W. (2003). An introduction to multivariate statistical analysis (3rd ed.). Wiley.
  • Bickel, P. J., & Levina, E. (2008). Regularized estimation of large covariance matrices. The Annals of Statistics, 36(1), 199–227. https://doi.org/10.1214/009053607000000758
  • Breiman, L. (1995). Better subset regression using the nonnegative garrote. Technometrics, 37(4), 373–384. https://doi.org/10.1080/00401706.1995.10484371
  • Breiman, L., & Friedman, J. H. (1997). Predicting multivariate responses in multiple linear regression. Journal of the Royal Statistical Society: Series B, 59(1), 3–54. https://doi.org/10.1111/rssb.1997.59.issue-1
  • Cai, T., Li, H., Liu, W., & Xie, J. (2013). Covariate–adjusted precision matrix estimation with an application in genetical genomics. Biometrika, 100(1), 139–156. https://doi.org/10.1093/biomet/ass058
  • Cai, T., Liu, W., & Luo, X. (2011). A constrained l1 minimization approach to sparse precision matrix estimation. Journal of the American Statistical Association, 106(494), 594–607. https://doi.org/10.1198/jasa.2011.tm10155
  • Cai, T., & Lv, J. (2007). Discussion: The Dantzig selector: Statistical estimation when p is much larger than n. Annals of Statistics, 35(6), 2365–2369. https://doi.org/10.1214/009053607000000442
  • Candes, E., & Tao, T. (2007). The Dantzig selector: Statistical estimation when p is much larger than n. Annals of Statistics, 35(6), 2313–2351. https://doi.org/10.1214/009053606000001523
  • Chen, L., & Huang, J. Z. (2012). Sparse reduced-rank regression for simultaneous dimension reduction and variable selection. Journal of the American Statistical Association, 107(500), 1533–1545. https://doi.org/10.1080/01621459.2012.734178
  • Cooper, C. (1997). The crippling consequences of fractures and their impact on quality of life. American Journal of Medicine, 103(2), 12–19. https://doi.org/10.1016/S0002-9343(97)90022-X
  • Deshpande, S., Rockova, V., & George, E. (2019). Simultaneous variable and covariance selection with the multivariate Spike- and Slab lasso. Journal of Computational and Graphical Statistics, 28(4), 921–931. https://doi.org/10.1080/10618600.2019.1593179
  • Efron, B., Hastie, T., Johnstone, I., & Tibshirani, R. (2004). Least angle regression. Annals of Statistics, 32(2), 407–499. https://doi.org/10.1214/009053604000000067
  • Fan, J., & Li, R. (2001). Variable selection via nonconcave penalized likelihood and its oracle properties. Journal of the American Statistical Association, 96(456), 1348–1360. https://doi.org/10.1198/016214501753382273
  • Fan, J., & Lv, J. (2008). Sure independence screening for ultrahigh dimensional feature space (with discussion). Journal of the Royal Statistical Society: Series B, 70(5), 849–911. https://doi.org/10.1111/rssb.2008.70.issue-5
  • Fan, J., & Peng, H. (2004). Non-concave penalized likelihood with a diverging number of parameters. Annals of Statistics, 32(3), 928–961. https://doi.org/10.1214/009053604000000256
  • Fang, K.-T., Kotz, S., & Ng, K. W. (2018). Symmetric multivariate and related distributions. Chapman and Hall/CRC.
  • Ferté, C., Trister, A. D., Huang, E., & Bot, B. M. (2013). Impact of bioinformatic procedures in the development and translation of high-throughput molecular classifiers in oncology. Clinical Cancer Research, 19(16), 4315–4325. https://doi.org/10.1158/1078-0432.CCR-12-3937
  • Frank, I. E., & Friedman, J. H. (1993). A statistical view of some chemometrics regression tools. Technometrics, 35(2), 109–135. https://doi.org/10.1080/00401706.1993.10485033
  • Fu, W. J. (1998). Penalized regressions: The bridge versus the lasso. Journal of Computational and Graphical Statistics, 7(3), 397–416. https://doi.org/10.1080/10618600.1998.10474784
  • He, K., Lian, H., Ma, S., & Huang, J. Z. (2018). Dimensionality reduction and variable selection in multivariate varying-coefficient models with a large number of covariates. Journal of the American Statistical Association, 113(522), 746–754. https://doi.org/10.1080/01621459.2017.1285774
  • Jia, B., Xu, S., Xiao, G., & Lamba, V. (2017). Learning gene regulatory networks from next generation sequencing data. Biometrics, 73(4), 1221–1230. https://doi.org/10.1111/biom.v73.4
  • Kim, S., Sohn, K.-A., & Xing, E. P. (2009). A multivariate regression approach to association analysis of a quantitative trait network. Bioinformatics (Oxford, England), 25(12), i204–i212. https://doi.org/10.1093/bioinformatics/btp218
  • Kong, X., Liu, Z., Yao, Y., & Zhou, W. (2017). Sure screening by ranking the canonical correlations. Test, 26(1), 46–70. https://doi.org/10.1007/s11749-016-0497-z
  • Kong, Y., Zheng, Z., & Lv, J. (2016). The constrained Dantzig selector with enhanced consistency. Journal of Machine Learning Research, 17(123), 1–22.
  • Lee, W., & Liu, Y. (2012). Simultaneous multiple response regression and inverse covariance matrix estimation via penalized Gaussian maximum likelihood. Journal of Multivariate Analysis, 111, 241–255. https://doi.org/10.1016/j.jmva.2012.03.013
  • Li, B., Chun, H., & Zhao, H. (2012). Sparse estimation of conditional graphical models with application to gene networks. Journal of the American Statistical Association, 107(497), 152–167. https://doi.org/10.1080/01621459.2011.644498
  • Li, C., & Li, H. (2008). Network-constrained regularization and variable selection for analysis of genomic data. Bioinformatics (Oxford, England), 24(9), 1175–1182. https://doi.org/10.1093/bioinformatics/btn081
  • Li, G., Peng, H., Zhang, J., & Zhu, L. (2012). Robust rank correlation based screening. Annals of Statistics, 40(3), 1846–1877. https://doi.org/10.1214/12-AOS1024
  • Li, Y., Li, G., Lian, H., & Tong, T. (2017). Profile forward regression screening for ultra-high dimensional semiparametric varying coefficient partially linear models. Journal of Multivariate Analysis, 155, 133–150. https://doi.org/10.1016/j.jmva.2016.12.006
  • Li, Y., Nan, B., & Zhu, J. (2015). Multivariate sparse group lasso for the multivariate multiple linear regression with an arbitrary group structure. Biometrics, 71(2), 354–363. https://doi.org/10.1111/biom.v71.2
  • Liang, H., Wang, H., & Tsai, C.-L. (2012). Profiled forward regression for ultrahigh dimensional variable screening in semiparametric partially linear model. Statistica Sinica, 22(2), 531–554. https://doi.org/10.5705/ss.2010.134
  • Obozinski, G., Wainwright, M. J., & Jordan, M. I. (2011). Support union recovery in high-dimensional multivariate regression. Annals of Statistics, 39(1), 1–47. https://doi.org/10.1214/09-AOS776
  • Pecanka, J., van der Vaart, A. W., & Jonker, M. A. (2019). Modeling association between multivariate correlated and high-dimensional sparse covariates: The adaptive SVS method. Journal of Applied Statistics, 46(5), 893–913. https://doi.org/10.1080/02664763.2018.1523377
  • Peng, J., Zhu, J., Bergamaschi, A., Han, W., Noh, D.-Y., Pollack, J. R., & Wang, P. (2010). Regularized multivariate regression for identifying master predictors with application to integrative genomics study of breast cancer. Annals of Applied Statistics, 4(1), 53–77. https://doi.org/10.1214/09-AOAS271
  • Ravikumar, P., Wainwright, M., & Lafferty, J. (2010). High-dimensional Ising model selection using l1 regularized logistic regression. Annals of Statistics, 38(3), 1287–1319. https://doi.org/10.1214/09-AOS691
  • Ren, J., Du, Y., Li, S., Ma, S., Jiang, Y., & Wu, C. (2019). Robust network-based regularization and variable selection for high-dimensional genomic data in cancer prognosis. Genetic Epidemiology, 43(3), 276–291. https://doi.org/10.1002/gepi.2018.43.issue-3
  • Ren, J., He, T., Li, Y., Liu, S., Du, Y., Jiang, Y., & Wu, C. (2017). Network-based regularization for high dimensional SNP data in the case-control study of type 2 diabetes. BMC Genetics, 18(1), 44. https://doi.org/10.1186/s12863-017-0495-5
  • Reppe, S., Refvem, H., Gautvik, V. T., Olstad, O. K., Høvring, P. I., Reinholt, F. P., Holden, M., Frigessi, A., Jemtland, R., & Gautvik, K. M. (2010). Eight genes are highly associated with BMD variation in postmenopausal Caucasian women. Bone, 46(3), 604–612. https://doi.org/10.1016/j.bone.2009.11.007
  • Rothman, A. J., Levina, E., & Zhu, J. (2010). Sparse multivariate regression with covariance estimation. Journal of Computational and Graphical Statistics, 19(4), 947–962. https://doi.org/10.1198/jcgs.2010.09188
  • Saulis, L., & Statulevicius, V. (1991). Limit theorems for large deviations (Vol. 73). Springer Science & Business Media.
  • Setodji, C. M., & Cook, R. D. (2004). K-means inverse regression. Technometrics, 46(4), 421–429. https://doi.org/10.1198/004017004000000437
  • Smith, M., & Fahrmeir, L. (2007). Spatial Bayesian variable selection with application to functional magnetic resonance imaging. Journal of the American Statistical Association, 102(478), 417–431. https://doi.org/10.1198/016214506000001031
  • Sofer, T., Dicker, L., & Lin, X. (2014). Variable selection for high dimensional multivariate outcomes. Statistica Sinica, 24(4), 1633–1654. https://doi.org/10.5705/ss.2013.019
  • Song, Y., Schreier, P. J., Ramirez, D., & Hasija, T. (2016). Canonical correlation analysis of high-dimensional data with very small sample support. Signal Processing, 128, 449–458. https://doi.org/10.1016/j.sigpro.2016.05.020
  • Tibshirani, R. (1996). Regression shrinkage and selection via the lasso. Journal of the Royal Statistical Society: Series B, 58(1), 267–288. https://doi.org/10.1111/j.2517-6161.1996.tb02080.x
  • Turlach, B., Venables, W., & Wright, S. (2005). Simultaneous variable selection. Technometrics, 47(3), 349–363. https://doi.org/10.1198/004017005000000139
  • Wang, H. (2009). Forward regression for ultra-high dimensional variable screening. Journal of the American Statistical Association, 104(488), 1512–1524. https://doi.org/10.1198/jasa.2008.tm08516
  • Wang, J., Zhang, Z., & Ye, J. (2019). Two-layer feature reduction for sparse-group lasso via decomposition of convex sets. Journal of Machine Learning Research, 20(163), 1–42.
  • Yang, Y., & Zou, H. (2015). A fast unified algorithm for solving group-lasso penalize learning problems. Statistics and Computing, 25(6), 1129–1141. https://doi.org/10.1007/s11222-014-9498-5
  • Yi, N. (2010). Statistical analysis of genetic interactions. Genetics Research, 92(5–6), 443–459. https://doi.org/10.1017/S0016672310000595
  • Yin, J., & Li, H. (2011). A sparse conditional Gaussian graphical model for analysis of genetical genomics data. Annals of Applied Statistics, 5(4), 2630–2650. https://doi.org/10.1214/11-AOAS494
  • Zhang, C., & Huang, J. (2008). The sparsity and bias of the lasso selection in high-dimensional linear regression. Annals of Statistics, 36(4), 1567–1594. https://doi.org/10.1214/07-AOS520
  • Zhang, H. H., & Lu, W. (2007). Adaptive lasso for Cox's proportional hazards model. Biometrika, 94(3), 691–703. https://doi.org/10.1093/biomet/asm037
  • Zhang, N., Yu, Z., & Wu, Q. (2019). Overlapping sliced inverse regression for dimension reduction. Analysis and Applications, 17(5), 715–736. https://doi.org/10.1142/S0219530519400013
  • Zhao, W., Lian, H., & Ma, S. (2017). Robust reduced–rank modeling via rank regression. Journal of Statistical Planning and Inference, 180, 1–12. https://doi.org/10.1016/j.jspi.2016.08.009
  • Zhu, L., Li, L., Li, R., & Zhu, L. (2011). Model-free feature screening for ultrahigh-dimensional data. Journal of the American Statistical Association, 106(496), 1464–1475. https://doi.org/10.1198/jasa.2011.tm10563
  • Zou, H. (2006). The adaptive lasso and its oracle properties. Journal of the American Statistical Association, 101(476), 1418–1429. https://doi.org/10.1198/016214506000000735
  • Zou, H., & Hastie, T. (2005). Regularization and variable selection via the elastic net. Journal of the Royal Statistical Society: Series B, 67(2), 301–320. https://doi.org/10.1111/rssb.2005.67.issue-2