Journal of the American Statistical Association Volume 119, 2024 - Issue 546

Theory and Methods

Are Latent Factor Regression and Sparse Regression Adequate?

Jianqing Fan (a, b; corresponding author)
Zhipeng Lou (b)
Mengxin Yu (b), https://orcid.org/0000-0002-6818-4083

a School of Data Science, Fudan University, Shanghai, China
b Department of Operations Research and Financial Engineering, Princeton University, Princeton, NJ

Pages 1076-1088 | Received 13 Jan 2022, Accepted 13 Jan 2023, Published online: 14 Feb 2023

https://doi.org/10.1080/01621459.2023.2169700

