Estimation of Panel Data Models with Random Interactive Effects and Multiple Structural Breaks when T is Fixed

Abstract

In this article, we propose a new estimator of panel data models with random interactive effects and multiple structural breaks that is suitable when the number of time periods, T, is fixed and only the number of cross-sectional units, N, is large. This is done by viewing the determination of the breaks as a shrinkage problem, and by estimating the regression coefficients, the number of breaks, and the break locations jointly using a version of the Lasso approach. We show that with probability approaching one the approach correctly determines the number of breaks and the dates of these breaks, and that the estimator of the regime-specific regression coefficients is consistent and asymptotically normal. We also provide Monte Carlo results suggesting that the approach performs very well in small samples, and empirical results suggesting that while the coefficients of the controls are breaking, the coefficients of the main deterrence regressors in a model of crime are not.

Supplementary Materials

The supplement provides (i) proofs of the asymptotic results provided in Section 3.2 of the main article, (ii) details of the extensions mentioned in Sections 2 and 3.1 of the same article, and (iii) some Monte Carlo results pertaining to these extensions.

Acknowledgments

The authors would like to thank Ivan Canay (Coeditor), an Associate Editor and two anonymous referees for many valuable comments and suggestions.

Notes

1 An incomplete list of studies dealing with a single structural break in panel data includes Antoch et al. (2019), Baltagi, Feng, and Kao (2016), Baltagi, Kao, and Liu (2017), Hidalgo and Schafgans (2017), Karavias, Narayan, and Westerlund (2022), and Zhu, Sarafidis, and Silvapulle (2020).

2 Qian and Su (2016) recognize the importance of allowing T to be finite and discuss likely implications for theory, but they do not provide any formal results for the fixed-T case. Similarly, while in Baltagi, Feng, and Kao (2016) there is a discussion of how to proceed in the presence of multiple breaks, their theory supposes that there is just one break.

3 This condition can be restrictive but it is needed for the proofs; see Section 3.1 for a discussion.

4 As we explain later in Section 3, the type of factors that can be permitted under our assumptions is very broad. This suggests that there is no need to discriminate between known and unknown factors, but that one can just as well treat them all as unknown. This is the main rationale for writing (2.2) in terms of (the unknown) $f_t$ only.

5 One can also use the regular Lasso estimator of $A_{m^0}^0$, as given by $[\hat{\beta}_{\hat{T}_0}, \ldots, \hat{\beta}_{\hat{T}_{\hat{m}}}]$. However, as is well known in the literature, post-Lasso typically outperforms regular Lasso, and our (unreported) Monte Carlo results confirm this. In this article, we therefore focus on post-Lasso LS.
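The difference between regular Lasso and post-Lasso LS mentioned in this note can be illustrated with a minimal sketch. It is not the article's panel estimator: it uses a generic sparse linear regression, a plain coordinate-descent Lasso, and simulated data, all of which are assumptions made for illustration. The function names (`lasso_cd`, `post_lasso`), the penalty value, and the data-generating values are hypothetical.

```python
import numpy as np

def soft_threshold(z, lam):
    # Soft-thresholding operator used in the coordinate-wise Lasso update.
    return np.sign(z) * np.maximum(np.abs(z) - lam, 0.0)

def lasso_cd(X, y, lam, n_iter=200):
    # Coordinate descent for (1/(2n)) * ||y - X b||^2 + lam * ||b||_1.
    n, p = X.shape
    b = np.zeros(p)
    col_ss = (X ** 2).sum(axis=0) / n
    r = y.copy()  # current residual y - X b
    for _ in range(n_iter):
        for j in range(p):
            r += X[:, j] * b[j]      # partial residual without column j
            rho = X[:, j] @ r / n
            b[j] = soft_threshold(rho, lam) / col_ss[j]
            r -= X[:, j] * b[j]      # restore the full residual
    return b

def post_lasso(X, y, lam):
    # Step 1: Lasso is used only to select the nonzero coefficients.
    b_lasso = lasso_cd(X, y, lam)
    support = np.flatnonzero(b_lasso != 0)
    # Step 2: unpenalized LS refit on the selected columns.
    b_post = np.zeros_like(b_lasso)
    if support.size:
        coef = np.linalg.lstsq(X[:, support], y, rcond=None)[0]
        b_post[support] = coef
    return b_lasso, b_post, support

# Simulated example (assumed values, for illustration only).
rng = np.random.default_rng(0)
n, p = 200, 10
X = rng.standard_normal((n, p))
beta_true = np.zeros(p)
beta_true[:2] = [3.0, -2.0]
y = X @ beta_true + 0.1 * rng.standard_normal(n)

b_lasso, b_post, support = post_lasso(X, y, lam=0.5)
print("selected columns:", support)
print("Lasso estimates on support:", b_lasso[support])
print("post-Lasso LS estimates on support:", b_post[support])
```

In this toy setting the Lasso estimates of the nonzero coefficients are shrunk toward zero by the penalty, while the LS refit on the selected columns removes that shrinkage, which is the usual motivation for preferring post-Lasso.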

6 The need for this condition is partly expected given the discussion in Section 1 on the difficulty of separating the breaks from the interactive effects. Boldea, Drepper, and Gan (2020) do not do anything to control for the interactive effects but apply LS as if there were no effects present at all. This means that they have to put enough structure on the effects so as to ensure that they do not interfere with their break estimation procedure. One of the terms in the resulting omitted interactive effects bias of the LS estimator is given by $N^{-1}\sum_{i=1}^{N} x_{i,t}\lambda_i' f_t$. If this is not constant within break regimes, the interactive effects will be mistaken for structural breaks.

7 Certain low-rank regressors can be permitted but they then require special treatment (see Bai 2009).

8 The condition that the regressors are identically distributed can be relaxed (see Boldea, Drepper, and Gan 2020). However, it is still necessary that the sample second moment matrix of the regressors is asymptotically time-invariant (within break regimes).

9 We experimented with differently spaced breaks. The results, available upon request, suggest that the conclusions are unaffected by the spacing of the breaks and that the PDL2S approach works well even if there are regimes that contain only one time period, which corroborates our asymptotic results.

10 The cross-sectional sum in $\nu_{i,t}$ is truncated at the beginning and end when not enough cross-sections are available. For example, when generating $\nu_{1,t}$, the sum only includes $e_{2,t}, \ldots, e_{11,t}$.
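The truncation described in this note can be sketched as follows. The note only pins down the case of the first unit, whose sum runs over units 2 to 11. The symmetric window of 10 neighbours on each side, the exclusion of the unit's own term, and the unit weights are assumptions inferred from that single example, and the function name `truncated_neighbor_sum` is hypothetical. The exact data-generating process is the one given in the main article.

```python
import numpy as np

def truncated_neighbor_sum(e, K=10):
    # e has shape (N, T): row i holds e_{i+1,t} for t = 1, ..., T (0-based rows).
    # Assumed rule: sum e over the K neighbours on each side of unit i,
    # excluding unit i itself, and drop neighbours that fall outside 1..N.
    N = e.shape[0]
    nu = np.zeros(e.shape, dtype=float)
    for i in range(N):
        lo = max(0, i - K)          # truncate at the beginning
        hi = min(N, i + K + 1)      # truncate at the end
        nu[i] = e[lo:hi].sum(axis=0) - e[i]
    return nu

# Small illustration with simulated draws (assumed sizes).
rng = np.random.default_rng(1)
e = rng.standard_normal((30, 3))
nu = truncated_neighbor_sum(e, K=10)

# For the first unit only e_{2,t}, ..., e_{11,t} enter the sum.
print(np.allclose(nu[0], e[1:11].sum(axis=0)))
```

Interior units use the full window of 2K neighbours under this rule, so only units within K positions of either end are affected by the truncation.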

11 The data can be downloaded online from the Journal of Applied Econometrics data archive, available at http://qed.econ.queensu.ca/jae/.

12 We refer to Cornwell and Trumbull (1994) for a more detailed description of the data.

13 Most important, there was (i) the election of Ronald Reagan in 1980 and the political-economic reorganization that followed, (ii) a displacement of nonwhite inner-city males from the regular labor force to the criminogenic informal drug economy, and (iii) a steep increase in juvenile violent crime.

14 Juodis and Reese (2022) argue that the CD test can be misleading when applied to cross-sectionally demeaned data in that it will tend to reject too often. We are unable to reject, suggesting that this tendency to over-reject is not an issue.

15 Unreported results confirm that the regressors are not uncorrelated across counties. This means that misspecification of the regression function, such as when breaks are omitted or misplaced, should manifest itself as cross-correlated residuals, which is not what we find.

16 Hence, there are two instruments, one for each of the two endogenous regressors. This means that the model is just identified. We experimented with using the one-year lagged values of PRBARR and POLPC as additional instruments. Because the resulting model is overidentified, we can apply the overidentifying restrictions J-statistic to assess the validity of the instruments. The instruments passed the test. The problem is that the lags do not appear to be very relevant, in that PRBARR and POLPC are basically serially uncorrelated, which casts doubt on the results based on the larger instrument set. For this reason, we follow the previous literature and focus on the just-identified model specification. All other regressors are treated as exogenous and are therefore included in the set of instruments.

Additional information

Funding

Westerlund would like to thank the Knut and Alice Wallenberg Foundation for financial support through a Wallenberg Academy Fellowship.