Theory and Methods

Estimating Number of Factors by Adjusted Eigenvalues Thresholding

Pages 852-861 | Received 14 Aug 2019, Accepted 11 Sep 2020, Published online: 19 Nov 2020
 

Abstract

Determining the number of common factors is an important and practical topic in high-dimensional factor models. The existing literature is mainly based on the eigenvalues of the covariance matrix. Because the heterogeneous scales of the observed variables make the eigenvalues of the covariance matrix incomparable, it is not easy to find an accurate relationship between these eigenvalues and the number of common factors. To overcome this limitation, we appeal to the correlation matrix and demonstrate, surprisingly, that the number of eigenvalues of the population correlation matrix that are greater than 1 is the same as the number of common factors under certain mild conditions. To use this relationship, we study random matrix theory based on the sample correlation matrix in order to correct biases in estimating the top eigenvalues and to account for estimation errors in eigenvalue estimation. Thus, we propose a tuning-free, scale-invariant adjusted correlation thresholding (ACT) method for determining the number of common factors in high-dimensional factor models, taking into account the sampling variabilities and biases of the top sample eigenvalues. We also establish the optimality of the proposed ACT method in terms of minimal signal strength and the optimal threshold. Simulation studies lend further support to our proposed method and show that our estimator outperforms competing methods in most test cases. Supplementary materials for this article are available online.
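The population-level relationship stated in the abstract can be illustrated with a small numerical sketch. The code below is not the authors' ACT procedure and does not reproduce their bias correction for sample eigenvalues. It only builds a hypothetical population covariance from a strict factor model (loadings, unique variances, and variable scales are all assumed values chosen for illustration), converts it to a correlation matrix, and counts the eigenvalues above 1. The function names `population_correlation` and `count_eigs_above_one` are illustrative, not taken from the article or its R code.

```python
import numpy as np


def population_correlation(loadings, unique_var, scales):
    """Correlation matrix implied by a strict factor model (illustrative).

    Covariance is loadings @ loadings.T + diag(unique_var), after which each
    variable is rescaled by an arbitrary positive scale to mimic
    heterogeneous units.
    """
    cov = loadings @ loadings.T + np.diag(unique_var)
    cov = cov * np.outer(scales, scales)  # rescale variable i by scales[i]
    sd = np.sqrt(np.diag(cov))
    return cov / np.outer(sd, sd)  # the scales cancel here


def count_eigs_above_one(corr):
    """Number of eigenvalues of a symmetric matrix strictly greater than 1."""
    eigvals = np.linalg.eigvalsh(corr)
    return int(np.sum(eigvals > 1.0))


# Assumed toy setting: p variables, K common factors.
rng = np.random.default_rng(0)
p, K = 50, 3
loadings = rng.normal(size=(p, K))           # hypothetical factor loadings
unique_var = rng.uniform(0.5, 1.5, size=p)   # hypothetical idiosyncratic variances
scales = 10.0 ** rng.uniform(-2, 2, size=p)  # wildly different variable units

corr = population_correlation(loadings, unique_var, scales)
n_factors = count_eigs_above_one(corr)
print(n_factors, K)  # compare the eigenvalue count with the true K
```

Because the correlation matrix is unchanged by rescaling the variables, the count does not depend on `scales`, which is the scale invariance the abstract emphasizes. With sample data the top eigenvalues are biased, which is why the article adjusts them before thresholding rather than counting raw sample eigenvalues.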

Supplementary Materials

Title: Supplementary materials for “Estimating Number of Factors by Adjusted Eigenvalues Thresholding.” The supplementary materials include nine lemmas and their proofs, the proofs of Theorems 1–3 and simulation results for uniform population. (SuppleFileFactor.pdf)

R code for ACT: R code used for the simulation studies in Section 5 and the empirical studies in Section 6. (simuexam zipped file)

Datasets: Datasets used in the empirical studies in Section 6. (data zipped file)

Acknowledgments

The authors thank the editor, the associate editor, and the referees for their constructive comments and suggestions, which led to a significant improvement of this article.

Additional information

Funding

The authors gratefully acknowledge National Natural Science Foundation of China (NNSFC) grants 12071066, 11631003, 11690012, 71991470, and 71991471.
