ABSTRACT
We develop methods to estimate the number of factors when error terms have potentially strong correlations in the cross-sectional dimension. The information criteria proposed by Bai and Ng (2002) require the cross-sectional correlations between the error terms to be weak. Violation of this weak correlation assumption may lead to inconsistent estimates of the number of factors. We establish two data-dependent estimators that are consistent whether the error terms are weakly or strongly correlated in the cross-sectional dimension. To handle potentially strong cross-sectional correlations between the error terms, we use a block structure in which the within-block correlation may either be weak or strong, but the between-block correlation is limited. Our estimators allow imperfect knowledge and a moderate misspecification of the block structure. Monte-Carlo simulation results show that our estimators perform similarly to existing methods for cases in which the conventional weak correlation assumption is satisfied. When the error terms have a strong cross-sectional correlation, our estimators outperform the existing methods.
Acknowledgments
The authors would like to thank Patrik Guggenberger, Serena Ng, and Jack Silverstein for their helpful conversations, and the participants at the 2011 Meetings of the Midwest Econometrics Group for their helpful comments.
Notes
1. In the framework of dynamic factor models, which are outside the scope of this paper, Bai and Ng (2007), Hallin and Liska (2007), and Watson and Amengual (2007) proposed estimation procedures to determine the number of dynamic factors. These methods are also based on the weak correlation assumption for the error terms.
2. This is implied by the assumption that the largest eigenvalues of and for g = 1,…,G are uniformly bounded. See Onatski (2010).
3. As and span the same space, .
4. The functional form of β(N,T) is of course not unique. For example, ln(min(N,T)) also satisfies the slowly diverging condition. However, simulation results show that setting β(N,T) = ln(min(N,T)) always underestimates the number of factors in small samples, so we do not use this functional form for the penalty term in MPC.
5. Stock and Watson (1998) showed that Hk converges to a constant matrix with rank k.
6. The computation of μ takes into account the fact that and .
7. As J = 3 in this subcase, the ith cross section is not correlated with the (i+j)th cross section for |j|>6. To form the completely wrong block structure when N = 100 and B = Mb = 10, we use i = b, 10+b, 20+b,…,90+b to form the bth block for b = 1,2,…,10. This ensures that the within-block cross-sectional correlation between ei and ej is zero for all i≠j.
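The index construction in this note can be illustrated with a minimal Python sketch. The function name `interleaved_blocks` and its parameters are hypothetical, introduced only for illustration, and are not taken from the paper. The sketch builds the ten interleaved blocks for N = 100 and checks that any two members of the same block are at least 10 positions apart, which exceeds the correlation range of 6 stated above.

```python
def interleaved_blocks(n=100, num_blocks=10):
    """Return num_blocks lists of 1-based cross-section indices.

    Block b holds b, num_blocks + b, 2 * num_blocks + b, ... up to n,
    mirroring the "completely wrong" block structure described in the note.
    (Hypothetical helper, not code from the paper.)
    """
    return [list(range(b, n + 1, num_blocks)) for b in range(1, num_blocks + 1)]


blocks = interleaved_blocks()

# Smallest index distance between neighbouring members of the same block.
min_gap = min(q - p for blk in blocks for p, q in zip(blk, blk[1:]))

print(blocks[0])   # members of the first block
print(min_gap)     # should exceed 6 so that within-block errors are uncorrelated
```

Because every within-block gap is a multiple of 10, no two cross sections in the same block fall inside the |j| ≤ 6 correlation window.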
8. We also ran simulations for τ = 0.5. The results are available in the supplementary appendix (Han and Caner, 2016). When τ = 0.5, the cross-sectional correlation becomes so strong that ICp1 always overestimates the number of factors. The ED estimator performs well for r≤3, but tends to underestimate the number of factors for r≥5. MPC and LUB perform well for r = 1 and 3. For r≥5, both MPC and LUB tend to underestimate, but their downward biases are smaller than that of ED. When τ = 0.5 and r = 7, for example, the mean (mode) of ED is 1.08 (1), whereas the means (modes) of MPC and LUB are 3.15 (3) and 3.65 (4), respectively.