Panel Data Cointegration Testing with Structural Instabilities

Abstract

This article analyzes spurious regression in panel data when the time series are cross-section dependent. The set-up includes multiple structural breaks, possibly at unknown dates, that can affect both the deterministic and the common factor components. We show that consistent estimation of the long-run average parameter is possible once cross-section dependence is controlled using cross-section averages in the spirit of Pesaran’s common correlated effects approach. This result is used to design individual and panel cointegration test statistics that accommodate structural breaks inducing parameter instabilities in the deterministic component, the cointegration vector and the common factor loadings.

1 Introduction

The literature on nonstationary panel data has experienced important developments in recent times, analyzing the properties of parameter estimation under both spurious regression and cointegration. One interesting feature of working in a nonstationary panel data framework is that it is possible to obtain consistent estimates of statistical relationships regardless of whether there is cointegration or spurious regression; see Phillips and Moon (1999). The idea behind the consistent estimation of (the long-run average) parameters in a spurious panel is that the panel units add information, which yields a stronger overall signal than in the pure time series case; see Phillips and Moon (2000). This article adds to this literature by defining a model specification under spurious regression that allows for multiple structural breaks affecting the level, trend and/or slope parameters of the model, as well as the factor structure generating the cross-section dependence. Panel data units are assumed to be cross-section dependent, a dependence that is captured by an approximate common factor model.

There are a number of contributions in the literature that are relevant for our proposal. Gengenbach, Urbain, and Westerlund (2016) and Banerjee and Carrion-i-Silvestre (2017) design panel data cointegration tests with cross-section dependence, but without structural breaks. The two papers differ in the degree of heterogeneity that is allowed for in the potential cointegration vector: the former considers a heterogeneous cointegration vector across the units of the panel, whereas the latter assumes homogeneity. Both, however, follow the spirit of the common correlated effects (CCE) estimator advocated in Pesaran (2006) and show that the common factors that drive the cross-section dependence can be proxied through the use of cross-section averages of the variables that appear in the model. The CCE methodology offers the advantage of avoiding the need to estimate the number of common factors ($m^0$), although $m^0$ is required to be bounded by the number of observable stochastic processes $(1+k)$ that appear in the model specification, the so-called rank condition $m^0\le 1+k$.

Westerlund and Edgerton (2008) and Banerjee and Carrion-i-Silvestre (2015) define panel data cointegration statistics accounting for multiple structural breaks (which can affect the deterministic component and/or the slope parameters of the model) and for cross-section dependence (modeled with a common factor model). The framework of these papers allows for complete heterogeneity in the sense that all parameters are panel unit specific, that is, the break dates, the deterministic component and the slope parameters are allowed to be heterogeneous across units. The common factors are estimated using principal components as in Bai and Ng (2004) so that it is possible to estimate the number of common factors using information criteria. However, it is worth noting that Westerlund and Edgerton (2008) assume that all common factors are I(0), whereas Banerjee and Carrion-i-Silvestre (2015) allow for a combination of I(0) and I(1) common factors.

The contribution of the present article can be summarized as follows. First, the article shows that it is possible to obtain a consistent pooled estimate of the long-run average coefficient, which captures a statistical relationship among non-individually cointegrated variables; see Phillips and Moon (1999). The model specification allows for the presence of multiple structural breaks that can affect the long-run average coefficient. This result is of special relevance from an empirical point of view, since it implies that practitioners can obtain consistent estimates of the slope parameters using a panel data pooled estimator regardless of whether the variables define a cointegration relationship. This is in direct contrast to the findings in the papers mentioned in the previous paragraph that assume heterogeneous potential cointegration vectors and estimate the parameters unit by unit, which delivers inconsistent estimators of the parameters under spurious regression. Second, basing the panel cointegration statistic on the pooled estimation of the potential cointegration vector removes the stochastic regressors from the limiting distribution of the cointegration statistic, that is, we move from a panel data cointegration analysis framework to a panel data unit root one. This produces a spin-off contribution, since the statistic that is proposed here extends the Pesaran (2007) panel data unit root test by allowing for structural breaks in the same way as Perron (1990) and Kim and Perron (2009) extended the Dickey-Fuller framework in time series analysis. Third, our setup also deals with the case where the factor loadings can be affected by structural breaks, a situation that has not been previously investigated in panel data cointegration testing analysis. This adds more flexibility and increases the interest of the model from an empirical point of view.
Fourth, the common factors are estimated using the CCE approach, which avoids the need to estimate the number of common factors as long as the rank condition is satisfied. While the CCE approach does not pose any problem for the computation of the pooled estimator of the slope coefficients, we show there are potentially important issues at the testing stage. This leads us to suggest a conservative testing strategy that mitigates the need to know both the number and order of integration of the common factors. Finally, the article proves that in some cases it is possible to obtain a consistent estimate of the break fraction parameters under spurious regression when the break dates are unknown. This is an interesting result per se, adding to the results obtained in Phillips and Moon (1999) concerning the consistent estimation of the long-run average coefficient under spurious regression.

Taken together, our article provides a panel cointegration testing framework that explicitly models the effects of structural breaks. Westerlund (2018) argued that an approximate common factor model, estimated using the CCE approach, can serve as a simple device that captures unattended structural breaks. In principle this is an interesting feature that could act as a model-safeguarding device, avoiding the need to define a particular deterministic specification, at the risk of violating the rank condition. We argue, however, that if the presence of breaks is a reasonable feature of the data generation process, it is better to model these breaks explicitly and free the common factors to capture other potential misspecification errors and dependences that might affect the model.

The article is organized as follows. Section 2 describes the model, Section 3 defines the pooled CCE estimator of the slope parameters and deals with the estimation of the structural break dates. The consistency of the pooled CCE estimator provides the ground upon which individual and panel data cointegration statistics are proposed in Section 4. Section 5 conducts an extensive Monte Carlo simulation experiment to analyse the finite sample performance of the statistics that have been proposed in the article. An empirical illustration that focuses on house prices and per capita disposable income for the U.S. states is conducted in Section 6. Finally, Section 7 concludes. Supplementary material is collected in the appendices with the proofs, tables of critical values, and tables and figures that summarize the finite sample performance of the statistics. A GAUSS code program is available to implement the proposal that is designed in the article.

2 The model

Let $Y_{i,t}=(y_{i,t},x_{i,t}')'$ be a $(1+k)$-vector of I(1) stochastic processes with data generating process (DGP):

(1) $Y_{i,t}=\Theta_i D_t+\pi_{i,t}F_t+U_{i,t}$
(2) $(I-L)F_t=v_t$
(3) $(I-L)U_{i,t}=e_{i,t}$,

where $\Theta_i$ is a $((1+k)\times(J^0+1))$ matrix of coefficients with rows defined as $\theta_{i,l}=(\theta_{i,l,1,1},\ldots,\theta_{i,l,J^0+1,1},\theta_{i,l,1,2},\ldots,\theta_{i,l,J^0+1,2})$, $l=1,\ldots,(1+k)$, $D_t=(1,DU_t',t,DT_t')'$, $DU_t=(DU_{1,t},\ldots,DU_{J^0,t})'$, $DT_t=(DT_{1,t},\ldots,DT_{J^0,t})'$ with $DU_{j,t}=1(t>T_j^0)$, $DT_{j,t}=(t-T_j^0)1(t>T_j^0)$, $1(\cdot)$ the indicator function, $T_j^0=\lfloor\lambda_j^0T\rfloor$ the $j$th break date, $\lambda_j^0=T_j^0/T$ the break fraction and $\lfloor\cdot\rfloor$ the integer part, $j=1,\ldots,J^0$, with the convention that $T_0^0=0$ and $T_{J^0+1}^0=T$, that is, $\lambda_0^0=0$ and $\lambda_{J^0+1}^0=1$. Throughout the article, the superscript “0” indicates the true value of the corresponding quantity. The deterministic component covers two variants: (a) a constant with level shifts (henceforth, constant case) and (b) a time trend with both level and slope shifts (in what follows, time trend case). The set of admissible break fraction vectors is defined as $\Lambda=\{\lambda=(\lambda_1,\ldots,\lambda_{J^0})'\mid\epsilon<\lambda_1<\cdots<\lambda_{J^0}<1-\epsilon \text{ and } |\lambda_j-\lambda_{j+1}|>\epsilon\}$, with $\epsilon$ denoting the amount of trimming that is specified; popular choices are $\epsilon\in\{0.1,0.15,0.2,0.25\}$. At this stage, the model set-up assumes that the number and position of the common structural breaks are known, but the procedure that can be implemented to estimate them is discussed below.
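The deterministic regressors in $D_t$ are simple to construct once the break fractions are fixed. The following is a minimal Python sketch, assuming NumPy; the function names `break_dummies` and `deterministic_matrix` are hypothetical and are not taken from the authors' GAUSS program.

```python
import numpy as np

def break_dummies(T, break_fractions):
    """Return (DU, DT), each T x J, for break fractions lambda_1 < ... < lambda_J.

    DU[t, j] = 1(t > T_j) and DT[t, j] = (t - T_j) 1(t > T_j), with
    T_j = floor(lambda_j * T) and t = 1, ..., T.
    """
    t = np.arange(1, T + 1)
    dates = [int(np.floor(lam * T)) for lam in break_fractions]
    DU = np.column_stack([(t > Tj) for Tj in dates]).astype(float)
    DT = np.column_stack([(t - Tj) * (t > Tj) for Tj in dates]).astype(float)
    return DU, DT

def deterministic_matrix(T, break_fractions, trend=True):
    """Stack D_t = (1, DU_t', t, DT_t')' row by row (constant case if trend=False)."""
    DU, DT = break_dummies(T, break_fractions)
    cols = [np.ones(T), DU]
    if trend:
        cols += [np.arange(1, T + 1, dtype=float), DT]
    return np.column_stack(cols)

# Example: one break at lambda = 0.5 in a sample of T = 10 observations.
D = deterministic_matrix(10, [0.5])
```

With `trend=False` the function returns only the constant and the level-shift dummies, which corresponds to the constant case described above.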

To begin with, cross-section dependence in the panel is defined by $F_t$, which is an $m^0$-vector of observable common factors, and $\pi_{i,t}=\pi_{i,j}$ for $T_{j-1}^0<t\le T_j^0$, the $((1+k)\times m^0)$ matrix of factor loadings. Denote by $\mathcal{K}=\sigma(F_0,\ldots,F_t,\ldots)$ the sigma field generated by the sequence $\{F_t\}_{t=0}^{\infty}$ so that, conditionally on $\mathcal{K}$, $U_{i,t}=(U_{y_{i,t}},U_{x_{i,t}}')'$ are independent across $i$; see Urbain and Westerlund (2011). Let $M<\infty$ be a generic positive number, not depending on $T$ and $N$, and define the Euclidean norm of a generic matrix $A$ as $\|A\|=\mathrm{trace}(A'A)^{1/2}$. The vector of stochastic processes $V_{i,t}=(v_t',e_{i,t}')'$, $i=1,\ldots,N$, $t=1,\ldots,T$, is assumed to satisfy the following assumptions; see Bai and Ng (2004) and Banerjee and Carrion-i-Silvestre (2015).

Assumption 1.

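(i) $v_t=C_j^F(L)w_t$, $w_t\sim iid(0,\Sigma_{w_j})$, $E\|w_t\|^4\le M$, $T_{j-1}<t\le T_j$; (ii) $\mathrm{var}(\Delta F_t)=\sum_{l=0}^{\infty}C_{j,l}^F\Sigma_{w_j}C_{j,l}^{F\prime}>0$, $T_{j-1}<t\le T_j$; (iii) $\sum_{l=0}^{\infty}l\|C_{j,l}^F\|<M$; and (iv) $C_j^F(1)$ has rank $m_1^0$ for all $j$, $0\le m_1^0\le m^0$, $j=1,\ldots,J^0+1$.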

Assumption 2.

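(i) For each $i$, $e_{i,t}=C_{i,j}^U(L)\varepsilon_{i,t}$, $\varepsilon_{i,t}\sim iid(0,\sigma_{\varepsilon_{i,j}}^2)$, $E\|\varepsilon_{i,t}\|^8\le M$, $\sum_{l=0}^{\infty}l\|C_{i,j,l}^U\|<M$, $\omega_{i,j}^2=C_{i,j}^U(1)^2\sigma_{\varepsilon_{i,j}}^2>0$; (ii) $E(\varepsilon_{i,t}\varepsilon_{l,t})=\tau_{i,l}$ with $\sum_{i=1}^{N}\|\tau_{i,l}\|\le M$ for all $l$, $T_{j-1}<t\le T_j$; (iii) $E\|N^{-1/2}\sum_{i=1}^{N}[\varepsilon_{i,s}\varepsilon_{i,t}-E(\varepsilon_{i,s}\varepsilon_{i,t})]\|^4\le M$, for every $(t,s)$; and (iv) $C_{i,j}^U(1)$ is positive definite almost surely for all $i$ and $j$, $j=1,\ldots,J^0+1$.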

Assumption 3.

The errors $\varepsilon_{i,t}$, $w_t$ and loadings $\pi_i$ are three mutually independent groups across the $i$, $t$ and $(1+k)$ dimensions.

Assumption 4.

$E\|F_0\|\le M$, and for every $i=1,\ldots,N$, $E\|U_{i,0}\|\le M$.

Assumption 5.

As $T_j^0\to\infty$ and $T\to\infty$, $T_j^0/T\to\lambda_j^0$, $j=1,\ldots,J^0$, with $0<\lambda_1^0<\cdots<\lambda_{J^0}^0<1$.

Assumptions 1–5 ensure that the central limit theorem (CLT) holds for $V_{i,t}$, so that, for the most general specification that is considered in this article, we have:

$T^{-1/2}\sum_{t=1}^{[rT]}V_{i,t}\Rightarrow\sum_{k=1}^{j-1}[C_{i,k}(1)-C_{i,k+1}(1)]W_i(\lambda_k^0)+C_{i,j}(1)W_i(r)$; $r\in[\lambda_{j-1}^0,\lambda_j^0]$,

as $T\to\infty$ for all $i$, where $W_i(r)$ is an $(m^0+1+k)$-vector of standard Brownian motions on $r\in[\lambda_{j-1}^0,\lambda_j^0]$, and

$\Omega_{i,j}=C_{i,j}(1)C_{i,j}(1)'=\begin{bmatrix}C_j^F(1)C_j^F(1)' & C_j^F(1)C_{i,j}^U(1)'\\ C_{i,j}^U(1)C_j^F(1)' & C_{i,j}^U(1)C_{i,j}^U(1)'\end{bmatrix}$

is the long-run conditional covariance matrix with expected value $E(\Omega_{i,j})=\Omega_j$, $j=1,\ldots,J^0+1$, the so-called long-run conditional average covariance matrix; see Lemma 1 in the appendix. The matrix $\Omega_{i,j}$ can be partitioned to define $\Omega_{U_{i,j}U_{i,j}}=C_{i,j}^U(1)C_{i,j}^U(1)'$, $\Omega_{U_{x_{i,j}}U_{x_{i,j}}}=C_{i,j}^{U_x}(1)C_{i,j}^{U_x}(1)'$ and $\Omega_{U_{x_{i,j}}U_{y_{i,j}}}=C_{i,j}^{U_x}(1)C_{i,j}^{U_y}(1)'$, $j=1,\ldots,J^0+1$.
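The shape of this limit can be checked by adding up the regime-wise increments. The short derivation below is a sketch under the assumption that, within regime $k$, the scaled partial sums are driven by the long-run multiplier $C_{i,k}(1)$, and it uses $W_i(\lambda_0^0)=W_i(0)=0$:

```latex
\begin{aligned}
T^{-1/2}\sum_{t=1}^{[rT]} V_{i,t}
 &\Rightarrow \sum_{k=1}^{j-1} C_{i,k}(1)\,\bigl[W_i(\lambda_k^0)-W_i(\lambda_{k-1}^0)\bigr]
   + C_{i,j}(1)\,\bigl[W_i(r)-W_i(\lambda_{j-1}^0)\bigr] \\
 &= \sum_{k=1}^{j-1} \bigl[C_{i,k}(1)-C_{i,k+1}(1)\bigr]\,W_i(\lambda_k^0)
   + C_{i,j}(1)\,W_i(r),
 \qquad r\in[\lambda_{j-1}^0,\lambda_j^0].
\end{aligned}
```

The second line collects the coefficient on each $W_i(\lambda_k^0)$: it enters with $C_{i,k}(1)$ from regime $k$ and with $-C_{i,k+1}(1)$ from regime $k+1$.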

The model considers the case where $x_{i,t}$ are assumed to be either cross-section independent (imposing all rows of $\pi_{i,t}$ but the first to be equal to zero) or cross-section dependent with dependence driven by $F_t$. Furthermore, it is possible to assume that the set of observable common factors affecting the endogenous variable $y_{i,t}$ is different from those affecting $x_{i,t}$, a situation that is covered if we define $\pi_{i,t}$ to be a block-diagonal matrix.

Despite the presence of the operator $(I-L)$ in (2), $F_t$ can be I(0), I(1), or a combination of both, depending on the rank of $C_j^F(1)$, $j=1,\ldots,J^0+1$. Let $m_0^0$ and $m_1^0$ be the number of I(0) and I(1) common factors, respectively, so that $m^0=m_0^0+m_1^0$. If $C_j^F(1)=0$, then $F_t$ is I(0) and $m_0^0=m^0$. If $C_j^F(1)$ is of full rank, then each component of $F_t$ is I(1) and $m_1^0=m^0$. If $C_j^F(1)\neq0$, but not full rank, then some components of $F_t$ are I(1) and some are I(0). Following Bai, Kao and Ng (2009) and Banerjee and Carrion-i-Silvestre (2015), the inclusion of common factors in the potential long-run relationship implies a change in the standard concept of cointegration. The usual definition of cointegration among $Y_{i,t}$ requires $F_t$ to be I(0), so that $Y_{i,t}$ captures all the common stochastic trends. However, allowing $F_t$ to be I(1) is also relevant from an empirical point of view since $F_t$ might be accounting for effects that are not captured by $Y_{i,t}$ alone. Another interesting feature is that (1) specifies time dependent factor loadings so that the structural breaks can also affect the way in which the common factors affect panel units. Finally, the DGP also allows for the possibility that the structural breaks do not affect all elements in $Y_{i,t}$, and also covers intermediate situations where, for instance, the deterministic component and/or the loadings do not change across regimes for some of the elements in $Y_{i,t}$.

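The most general model specification that is admitted in our framework is

(4) $y_{i,t}=\mu_{i,j,1}^y+\mu_{i,j,2}^y(t-T_{j-1}^0)+x_{i,t}'\beta_j+\eta_{i,j}^yF_t+\xi_{i,t}^y$
(5) $x_{i,t}=\mu_{i,j,1}^x+\mu_{i,j,2}^x(t-T_{j-1}^0)+\eta_{i,j}^xF_t+\xi_{i,t}^x$,

with $\mu_{i,j,k}=(\mu_{i,j,k}^y,\mu_{i,j,k}^{x\prime})'$, $k\in\{1,2\}$, $\eta_{i,j}=(\eta_{i,j}^{y\prime},\eta_{i,j}^{x\prime})'$, $T_{j-1}^0<t\le T_j^0$, $j=1,\ldots,J^0+1$. Combining (4) and (5) we obtain:

$\begin{pmatrix}y_{i,t}\\x_{i,t}\end{pmatrix}=\begin{pmatrix}\beta_j'\mu_{i,j,1}^x+\mu_{i,j,1}^y & \beta_j'\mu_{i,j,2}^x+\mu_{i,j,2}^y\\ \mu_{i,j,1}^x & \mu_{i,j,2}^x\end{pmatrix}d_t+\begin{pmatrix}\beta_j'\eta_{i,j}^x+\eta_{i,j}^y\\ \eta_{i,j}^x\end{pmatrix}F_t+\begin{pmatrix}\beta_j'\xi_{i,t}^x+\xi_{i,t}^y\\ \xi_{i,t}^x\end{pmatrix}$,

that is,

(6) $\begin{pmatrix}y_{i,t}\\x_{i,t}\end{pmatrix}=\begin{pmatrix}\tilde\mu_{i,j,1}^y & \tilde\mu_{i,j,2}^y\\ \mu_{i,j,1}^x & \mu_{i,j,2}^x\end{pmatrix}d_t+\begin{pmatrix}\pi_{i,t}^y\\ \pi_{i,t}^x\end{pmatrix}F_t+\begin{pmatrix}\tilde\xi_{i,t}^y\\ \xi_{i,t}^x\end{pmatrix}$,

with $d_t=(1,(t-T_{j-1}^0))'$, which corresponds to (1) given that $\theta_{i,1,j,k}=\theta_{i,j,k}^y=\sum_{l=1}^{j}\tilde\mu_{i,l,k}^y$, $\tilde\mu_{i,l,k}^y=\beta_l'\mu_{i,l,k}^x+\mu_{i,l,k}^y$, $\theta_{i,j,k}^x=\sum_{l=1}^{j}\mu_{i,l,k}^x$, $k\in\{1,2\}$, $\pi_{i,t}=(\pi_{i,t}^{y\prime},\pi_{i,t}^{x\prime})'$, $\pi_{i,t}^y=\beta_j'\eta_{i,j}^x+\eta_{i,j}^y$, $\pi_{i,t}^x=\eta_{i,j}^x$, and $U_{i,t}=(\tilde\xi_{i,t}^y,\xi_{i,t}^{x\prime})'$, $\tilde\xi_{i,t}^y=\beta_j'\xi_{i,t}^x+\xi_{i,t}^y$, $T_{j-1}^0<t\le T_j^0$, $j=1,\ldots,J^0+1$. Although the model has been written as a pure structural change model in which all parameters change, we can impose parameter constraints in (4) to obtain partial structural change specifications. Our article distinguishes six different model specifications depending on whether the structural breaks affect the deterministic component, the slope parameters and/or the factor loadings. Table 1 summarizes all the cases covered in the article. Model A assumes that the structural breaks affect only the deterministic component, Model B considers that the structural breaks affect both the deterministic component and the slope parameters and, finally, Model C permits the structural breaks to affect the deterministic component, the slope parameters and the factor loadings. For each model specification we consider the two variants of the deterministic component, designated by a 1 for the constant case and by a 2 for the time trend case. This defines Models A1, A2, B1, B2, C1, and C2.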

Table 1 Model specifications.

The block-orthogonal model specification given by (4) for each panel unit is:

(7) $y_i=D_\iota(\lambda^0)\mu_{i,1}^y+D_\tau(\lambda^0)\mu_{i,2}^y+x_i^l(\lambda^0)\beta+F^l(\lambda^0)\eta_i^y+\xi_i^y$,

with $l\in\{A,B,C\}$ denoting the model and $y_i$ and $\xi_i^y$ being $T$-vectors of the dependent variable and disturbance term, respectively, for the $i$th panel unit. $D_\iota(\lambda^0)=[D_{\iota,1}\ \cdots\ D_{\iota,J^0+1}]_{T\times(J^0+1)}=\mathrm{diag}(\iota_1,\ldots,\iota_{J^0+1})$ and $D_\tau(\lambda^0)=[D_{\tau,1}\ \cdots\ D_{\tau,J^0+1}]_{T\times(J^0+1)}=\mathrm{diag}(\tau_1,\ldots,\tau_{J^0+1})$ are diagonal matrices defined with the elements between parentheses, $\iota_j$ is a $(T_j^0-T_{j-1}^0)$-vector of ones, $\tau_j=(1,2,\ldots,T_j^0-T_{j-1}^0)'$ is the time trend of the $j$th regime, and $\mu_{i,k}^y=(\mu_{i,1,k}^y,\ldots,\mu_{i,J^0+1,k}^y)'$, $k\in\{1,2\}$. For Model A, $x_i^A(\lambda^0)=x_i=[x_{i,1}\ \cdots\ x_{i,k}]_{(T\times k)}$ and $\beta=\beta_0$ is a $k$-vector of parameters, where the subscript “0” denotes that the parameters are not affected by structural breaks. For Models B and C:

(8) $x_i^l(\lambda^0)=[x_i^{(1)}\ x_i^{(2)}\ \cdots\ x_i^{(J^0+1)}]_{(T\times k(J^0+1))}$; $l\in\{B,C\}$,

with $x_i^{(j)}=[(D_{\iota,j}\circ x_{i,1})\ \cdots\ (D_{\iota,j}\circ x_{i,k})]$, $j=1,\ldots,J^0+1$, where $\circ$ is the element-wise (Hadamard) product and $\beta=(\beta_1',\ldots,\beta_{J^0+1}')'$. Similarly, $F^l(\lambda^0)=F=[F_1\ \cdots\ F_{m^0}]_{(T\times m^0)}$, $l\in\{A,B\}$, and $\eta_i^y=\eta_{i,0}^y$ is an $m^0$-row vector of loadings, whereas for Model C:

(9) $F^C(\lambda^0)=[F^{(1)}\ F^{(2)}\ \cdots\ F^{(J^0+1)}]_{(T\times m^0(J^0+1))}$,

with $F^{(j)}=[(D_{\iota,j}\circ F_1)\ \cdots\ (D_{\iota,j}\circ F_{m^0})]$, $j=1,\ldots,J^0+1$, and $\eta_i^y=(\eta_{i,1}^y,\ldots,\eta_{i,J^0+1}^y)$ is the $m^0(J^0+1)$-row vector of factor loadings. The estimation of $\beta$ in (7) can be performed using the panel data pooled estimator, although it would require both the common factors and the structural break dates to be known. Certainly, there are some cases where the common factors can be thought to be observable as discussed in Pesaran (2015), but this situation is rarely found in practice. The same applies to the structural break dates, so the next section designs an estimation procedure that allows the empirical implementation of our proposal when both the common factors and the structural breaks are unknown; the known common factors and/or structural breaks situation can be seen as particular cases.

3 Unknown Common Factors and Structural Breaks

The discussion that is conducted in this section focuses on Model C2, since the other specifications are obtained as particular cases. To ease the derivations, at this stage it is assumed that the number of structural breaks $J^0$ (but not their position) is known, an assumption that will be relaxed below. We follow Pesaran (2006, 2007) and use cross-section averages to capture the unobserved common factors, which can be written in matrix notation from the block-orthogonal model specification as $\bar Y=D(\lambda)\bar\mu+F(\lambda)\bar\pi+\bar U$, where $D(\lambda)=\mathrm{diag}([\iota_1,\tau_1],\ldots,[\iota_{J^0+1},\tau_{J^0+1}])$, $F(\lambda)=F^C(\lambda)$, $\bar\pi=(\bar\pi_1',\ldots,\bar\pi_{J^0+1}')'$, $\bar Y_t=N^{-1}\sum_{i=1}^{N}Y_{i,t}$, $\bar\mu=N^{-1}\sum_{i=1}^{N}\mu_i$, $\bar\pi_j=N^{-1}\sum_{i=1}^{N}\pi_{i,j}$ and $\bar U_t=N^{-1}\sum_{i=1}^{N}U_{i,t}$.

Assumption 6.

$\mathrm{rank}(\bar\pi_j)=m^0\le(1+k)$ for all $N$ and as $N\to\infty$, for all $j$, $j=1,\ldots,J^0+1$.

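If the rank condition established in Assumption 6 is met, we have:

(10) $F=(\bar Y-D(\lambda)\bar\mu-\bar U)\bar\pi'(\bar\pi\bar\pi')^{-1}$,

with $\bar U_t=O_p(T^{1/2}N^{-1/2})$ so that $\bar U_t\to_p0$ for (i) fixed $T$ and $N\to\infty$, and (ii) $T\to\infty$ and $N\to\infty$ with $T/N\to0$. Then, the observable cross-section averages $\bar h_t=(D_t',D_{\iota,1,t}\bar Y_t',\ldots,D_{\iota,J^0+1,t}\bar Y_t')'$ can be used to proxy the unobserved factors. Adding and subtracting (10) in (7) we have:

$y_i=D_\iota(\lambda)\mu_{i,1}^y+D_\tau(\lambda)\mu_{i,2}^y+x_i(\lambda)\beta+F(\lambda)\pi_i^y\pm(\bar Y-D(\lambda)\bar\mu-\bar U)\bar\pi'(\bar\pi\bar\pi')^{-1}\pi_i^y+\xi_i^y$
(11) $\phantom{y_i}=D_\iota(\lambda)\mu_{i,1}^y+D_\tau(\lambda)\mu_{i,2}^y+x_i(\lambda)\beta+\bar Y\delta_i^y+\upsilon_i$,

with $\delta_i^y=\bar\pi'(\bar\pi\bar\pi')^{-1}\pi_i^y$ and $\upsilon_i=\xi_i^y+(F(\lambda)-\bar Y\bar\pi'(\bar\pi\bar\pi')^{-1})\pi_i^y$. This defines an extension of the so-called cross-section augmented regression model in Holly, Pesaran, and Yamagata (2010) and Banerjee and Carrion-i-Silvestre (2017):

(12) $y_{i,t}=\mu_{i,j,1}^y+\mu_{i,j,2}^y(t-T_{j-1})+x_{i,t}'\beta_j+\bar Y_t'\delta_{i,j}^y+\upsilon_{i,t}$,

$T_{j-1}<t\le T_j$, $j=1,\ldots,J^0+1$, where $T_B=(T_1,\ldots,T_{J^0})'$ is a vector of generic break dates. Note that it is implicitly assumed that $m^0=k+1$ since all $k+1$ cross-section averages are included in (12). Let us define the projection matrix $\bar M(\lambda)=I-\bar H(\lambda)(\bar H(\lambda)'\bar H(\lambda))^{-1}\bar H(\lambda)'$, with $\bar H(\lambda)=[D(\lambda)\ \bar z(\lambda)]$, $\bar z(\lambda)=[\bar z^{(1)}\ \cdots\ \bar z^{(J^0+1)}]_{(T\times(k+1)(J^0+1))}$, and $\bar z^{(j)}=[(D_{\iota,j}\circ\bar x_1)\ \cdots\ (D_{\iota,j}\circ\bar x_k)\ (D_{\iota,j}\circ\bar y)]$, $j=1,\ldots,J^0+1$. Then, the estimation of $\beta$ in (11) can be done using the pooled CCE estimator (PCCE):

(13) $\hat\beta_{PCCE}(\lambda)=\left(\sum_{i=1}^{N}x_i(\lambda)'\bar M(\lambda)x_i(\lambda)\right)^{-1}\left(\sum_{i=1}^{N}x_i(\lambda)'\bar M(\lambda)y_i\right)$.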
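To make the mechanics of (13) concrete, the following Python sketch computes the pooled CCE estimator for the constant case with regime-specific slopes and loadings (a Model C1-type set-up). It is an illustration under stated assumptions rather than the authors' implementation: the function names, the array layout (`y` of shape `(N, T)`, `x` of shape `(N, T, k)`) and the use of a pseudo-inverse to form the projection are all choices made here.

```python
import numpy as np

def regime_indicators(T, break_dates):
    """T x (J+1) matrix whose column j equals 1 when T_{j-1} < t <= T_j."""
    edges = [0] + list(break_dates) + [T]
    t = np.arange(1, T + 1)
    cols = [((t > lo) & (t <= hi)).astype(float)
            for lo, hi in zip(edges[:-1], edges[1:])]
    return np.column_stack(cols)

def interact(R, Z):
    """Stack the regime-wise Hadamard products [R_1 * Z, ..., R_{J+1} * Z]."""
    return np.column_stack([R[:, [j]] * Z for j in range(R.shape[1])])

def pcce(y, x, break_dates):
    """Pooled CCE estimate of (beta_1', ..., beta_{J+1}')' for given break dates."""
    N, T, k = x.shape
    R = regime_indicators(T, break_dates)
    # Cross-section averages of the regressors and the dependent variable.
    zbar = np.column_stack([x.mean(axis=0), y.mean(axis=0)])   # T x (k+1)
    H = np.column_stack([R, interact(R, zbar)])                # constant case
    M = np.eye(T) - H @ np.linalg.pinv(H)                      # projection off H
    p = k * R.shape[1]
    A, b = np.zeros((p, p)), np.zeros(p)
    for i in range(N):
        xi = interact(R, x[i])        # x_i(lambda): regressors split by regime
        A += xi.T @ M @ xi
        b += xi.T @ M @ y[i]
    return np.linalg.solve(A, b)

# Illustrative data: one I(1) factor, slope shifting from 1 to 2 at t = 50.
rng = np.random.default_rng(0)
N, T, Tb = 20, 100, 50
f = np.cumsum(rng.standard_normal(T))
gamma = 1 + 0.5 * rng.standard_normal((N, 1, 1))
x = np.cumsum(rng.standard_normal((N, T, 1)), axis=1) + gamma * f[None, :, None]
beta_t = np.where(np.arange(1, T + 1) <= Tb, 1.0, 2.0)
eta = 1 + 0.5 * rng.standard_normal((N, 1))
y = 0.5 + beta_t * x[:, :, 0] + eta * f + 0.01 * rng.standard_normal((N, T))
beta_hat = pcce(y, x, [Tb])
```

The time trend case would add regime-specific trends to `H`; that extension is omitted to keep the sketch short.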

Following Bai (2010), Kim (2011, 2014), and Baltagi, Feng, and Kao (2016, 2019), the estimation of the break dates is carried out minimizing the global sum of squared residuals (SSR), so that:

(14) $\hat T_B=\arg\min_{\lambda\in\Lambda}SSR(\lambda)$,

where $\hat T_B=(\hat T_1,\ldots,\hat T_{J^0})'$, $\hat\lambda=\hat T_B/T$, $SSR(\lambda)=\sum_{i=1}^{N}SSR_i(\lambda)$, $SSR_i(\lambda)$ being the SSR for the $i$th panel unit associated with the estimation of (12). Thus, the estimation of the break dates can be obtained carrying out a grid search over all potential combinations of $J^0$ structural breaks. The following theorem analyzes the consistency of $\hat\beta_{PCCE}(\hat\lambda)$ and $\hat\lambda$. For completeness, we also include the known breaks case.

Theorem 1.

Let $Y_{i,t}$ be a vector of $(1+k)$ stochastic processes with DGP given by (1)–(3) and satisfying Assumptions 1–6. Then, as $(T,N)\to\infty$:

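  1. Known break dates case. For Models A, B, and C, $\hat\beta_{PCCE}(\lambda^0)\to_p\beta=\Omega_{U_xU_x}^{-1}\Omega_{U_xU_y}$.

  2. Unknown break dates case:

    1. Model A1: $(\hat\lambda-\lambda^0)$ does not converge to zero in probability.

    2. Models A2, B1, B2, C1, and C2: (i) $(\hat\lambda-\lambda^0)\to_p0$, with $(\hat\lambda-\lambda^0)=O_p(T^{-1/2})$ for Models A2, B2 and C2, and $(\hat\lambda-\lambda^0)=o_p(N^{-1/2})$ for Models B1 and C1, and (ii) $\hat\beta_{PCCE}(\hat\lambda)\to_p\beta=\Omega_{U_xU_x}^{-1}\Omega_{U_xU_y}$.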

See the companion appendix for the proof. Theorem 1 establishes that $\hat\lambda$ is not consistent for Model A1 since the effects of the structural changes are dominated, in the limit, by the stochastic trend components, whereas it is consistent for the other models. In addition, it is shown that the PCCE estimator converges toward the long-run average coefficient $\beta$ in the known breaks case for all models, and in the unknown breaks case for the models for which a consistent break fraction vector estimate can be obtained. This establishes the basis for the derivation of our panel data cointegration test. Finally, it is worth noting that Theorem 1 establishes the consistency of the break fraction estimates for Models A2, B1, B2, C1, and C2. As for the break date estimates, the results in Theorem 1 show that it is not possible to obtain consistent estimates of the break dates for Models A2, B2, and C2; consistency only applies to the break fraction estimates. For Models B1 and C1, consistent break dates can be obtained if $TN^{-1/2}\to0$ is assumed. Although there are some contributions in the panel literature that derive consistent break date estimation procedures (see Bai (2010), Kim (2011, 2014) and Baltagi, Feng and Kao (2016, 2019), among others), these approaches assume that the disturbance term of the model is I(0), whereas here we deal with the novel panel data spurious regression with structural breaks case. Consequently, the framework under which those results are obtained is not comparable to our panel data spurious regression set-up.
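The grid search in (14) can be sketched for a single break as follows. The Python code below is a simplified illustration, assuming the constant case with regime-specific slopes and loadings, one break ($J^0=1$) and a hypothetical default trimming of 0.15; the function names are illustrative and the routine is not the authors' GAUSS program.

```python
import numpy as np

def _regimes(T, Tb):
    """Two regime indicators: t <= Tb and t > Tb, for t = 1, ..., T."""
    t = np.arange(1, T + 1)
    return np.column_stack([t <= Tb, t > Tb]).astype(float)

def _interact(R, Z):
    return np.column_stack([R[:, [j]] * Z for j in range(R.shape[1])])

def global_ssr(y, x, Tb):
    """Sum over units of the SSR of (12) with pooled slopes, for a candidate Tb."""
    N, T, k = x.shape
    R = _regimes(T, Tb)
    zbar = np.column_stack([x.mean(axis=0), y.mean(axis=0)])
    H = np.column_stack([R, _interact(R, zbar)])
    M = np.eye(T) - H @ np.linalg.pinv(H)
    xs = [_interact(R, x[i]) for i in range(N)]
    A = sum(xi.T @ M @ xi for xi in xs)
    b = sum(xi.T @ M @ y[i] for i, xi in enumerate(xs))
    beta = np.linalg.solve(A, b)                  # PCCE estimate given Tb
    return sum(float((M @ (y[i] - xs[i] @ beta)) @ (M @ (y[i] - xs[i] @ beta)))
               for i in range(N))

def estimate_break_date(y, x, trim=0.15):
    """Grid search over candidate dates inside the trimmed sample."""
    T = y.shape[1]
    lo, hi = int(np.floor(trim * T)), int(np.floor((1 - trim) * T))
    candidates = list(range(lo, hi + 1))
    ssr = [global_ssr(y, x, Tb) for Tb in candidates]
    return candidates[int(np.argmin(ssr))]

# Illustrative data with a slope shift from 1 to 3 at t = 40.
rng = np.random.default_rng(1)
N, T, Tb0 = 10, 80, 40
x = np.cumsum(rng.standard_normal((N, T, 1)), axis=1)
beta_t = np.where(np.arange(1, T + 1) <= Tb0, 1.0, 3.0)
y = 1.0 + beta_t * x[:, :, 0] + 0.05 * rng.standard_normal((N, T))
Tb_hat = estimate_break_date(y, x)
```

With several breaks the same idea applies to every admissible combination of dates, at a computational cost that grows quickly with $J^0$.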

It is possible to define an alternative estimation of the break dates that relies on the minimization of the SSR of the model:

(15) $y_{i,t}=\mu_{i,j,1}^y+\mu_{i,j,2}^y(t-T_{j-1})+x_{i,t}'\hat\beta_j+\upsilon_{i,t}$,

instead of using (12), where $\hat\beta_j$ refers to the $j$th regime parameters in $\hat\beta_{PCCE}(\lambda)$ for a generic $T_B$. The alternative estimation of the break dates is then defined as:

(16) $\tilde T_B=\arg\min_{\lambda\in\Lambda}SSR(\lambda)$,

where $\tilde T_B=(\tilde T_1,\ldots,\tilde T_{J^0})'$, $\tilde\lambda=\tilde T_B/T$, $SSR(\lambda)=\sum_{i=1}^{N}SSR_i(\lambda)$, $SSR_i(\lambda)$ being the SSR for the $i$th panel unit associated with the estimation of (15). It is worth noting that this procedure ignores the common factors in (15), although they are considered in the estimation of $\hat\beta_{PCCE}(\lambda)$. Therefore, this estimator can be seen as a hybrid method in which the common factors are used in the estimation of $\hat\beta_{PCCE}(\lambda)$, but not in the minimization of the SSR of (15). The following theorem summarizes the main features of this alternative estimator.

Theorem 2.

Let $Y_{i,t}$ be a vector of $(1+k)$ stochastic processes with DGP given by (1)–(3) and satisfying Assumptions 1–6. Then, as $(T,N)\to\infty$, $(\tilde\lambda-\lambda^0)=o_p(N^{-1/4}T^{-1/2})$ for Models B1 and C1, and $(\tilde\lambda-\lambda^0)=o_p(N^{-1/3}T^{-1})$ for Models A2, B2, and C2.

The proof is given in the companion appendix. As can be seen, the rate of convergence of $\tilde\lambda$ is faster than that of $\hat\lambda$ for a given model, which suggests that better finite-sample performance of the former is to be expected. This result is due to the fact that the alternative estimator avoids projecting the observable variables in the model against the common factors, which leads to a stronger signal around the combination of break dates that minimizes the SSR of (15); see the appendix. Finally, consistent break dates can be obtained for Models A2, B2, and C2, although at a slow rate of convergence, and for Models B1 and C1 if it is assumed that $TN^{-1/2}\to0$. As mentioned above, the existing panel data structural break estimation techniques that have been recently proposed in the literature assume that the disturbance term of the model is I(0), which leads to consistent estimates of the break dates with higher rates of convergence.
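Because $T_j=\lambda_jT$ up to integer rounding, a rate for the break fraction translates into a rate for the break date after multiplying by $T$. The arithmetic below uses only the rates stated in Theorems 1 and 2; the side-by-side arrangement is added here for illustration.

```latex
\begin{aligned}
\text{Models B1, C1:}\quad
 & \hat T_j-T_j^0 = T(\hat\lambda_j-\lambda_j^0) = o_p\bigl(TN^{-1/2}\bigr), \\
 & \tilde T_j-T_j^0 = T\cdot o_p\bigl(N^{-1/4}T^{-1/2}\bigr)
   = o_p\bigl((TN^{-1/2})^{1/2}\bigr), \\
\text{Models A2, B2, C2:}\quad
 & \hat T_j-T_j^0 = T\cdot O_p\bigl(T^{-1/2}\bigr) = O_p\bigl(T^{1/2}\bigr), \\
 & \tilde T_j-T_j^0 = T\cdot o_p\bigl(N^{-1/3}T^{-1}\bigr) = o_p\bigl(N^{-1/3}\bigr).
\end{aligned}
```

Both bounds for Models B1 and C1 vanish when $TN^{-1/2}\to0$, the $O_p(T^{1/2})$ bound does not vanish, and the $o_p(N^{-1/3})$ bound vanishes as $N\to\infty$, which matches the statements about break date consistency made in the text.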

The estimation of the number of structural breaks can be carried out using panel information criteria similar to the ones proposed in Bai and Ng (2002, 2004). To be specific, let us define an information criterion of the form:

$IC(\hat\lambda,J)=\ln\hat\sigma^2(\hat\lambda,J)+((1+J)k)\,g(N,T)$,

with $\hat\sigma^2(\hat\lambda,J)=N^{-1}T^{-1}\sum_{i=1}^{N}\sum_{t=1}^{T}\Delta\hat\upsilon_{i,t}^2$, $J=0,1,\ldots,J_{max}$, where $\hat\upsilon_{i,t}$ denotes the estimated residuals from (12) and $g(N,T)$ is the penalty function; one possibility suggested in Bai and Ng (2002) is the panel BIC, which sets $g(N,T)=\ln(NT/(N+T))(N+T)/NT$. Then, $\hat J=\arg\min_{J=0,\ldots,J_{max}}IC(\hat\lambda,J)$. Similarly, we can define $\tilde J$ if $\tilde\lambda$ is used instead of $\hat\lambda$ in the estimation of (12). The following theorem shows that the suggested information criterion provides consistent estimation of the number of structural breaks.

Theorem 3.

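Let $Y_{i,t}$ be a vector of $(1+k)$ stochastic processes with DGP given by (1)–(3) and satisfying Assumptions 1–6. Then, $\lim_{N,T\to\infty}\Pr(J^{\circ}=J^0)=1$, $J^{\circ}\in\{\hat J,\tilde J\}$, if (i) $g(N,T)\to0$ and (ii) $C_{NT}^2\,g(N,T)\to\infty$ as $(T,N)\to\infty$, where $C_{NT}=\min\{\sqrt N,\sqrt T\}$.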

The proof is outlined in the appendix. This result defines a framework general enough to be able to treat all the elements of the model in either an endogenous or exogenous way.
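The selection rule based on the information criterion can be sketched in a few lines of Python. The code assumes that residuals from (12) have already been computed for each candidate number of breaks and are passed in as a dictionary; the function names and that interface are hypothetical.

```python
import numpy as np

def panel_bic_penalty(N, T):
    """Panel BIC penalty g(N, T) = ln(NT / (N + T)) (N + T) / (NT)."""
    return np.log(N * T / (N + T)) * (N + T) / (N * T)

def information_criterion(resid, J, k):
    """IC = ln(sigma2) + (1 + J) k g(N, T), with resid an (N, T) residual array."""
    N, T = resid.shape
    d = np.diff(resid, axis=1)               # first differences of the residuals
    sigma2 = np.sum(d ** 2) / (N * T)        # scaled by N * T as in the text
    return np.log(sigma2) + (1 + J) * k * panel_bic_penalty(N, T)

def select_num_breaks(resid_by_J, k):
    """Return the J that minimises the criterion over the supplied candidates."""
    ic = {J: information_criterion(r, J, k) for J, r in resid_by_J.items()}
    return min(ic, key=ic.get)

# Toy example: residuals shrink sharply from J = 0 to J = 1 and not afterwards.
rng = np.random.default_rng(3)
r1 = rng.standard_normal((10, 50))
J_hat = select_num_breaks({0: 5.0 * r1, 1: r1, 2: r1.copy()}, k=1)
```

In the toy example the fit does not improve beyond one break, so the penalty term decides between $J=1$ and $J=2$.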

4 PCCE-based Cointegration Test Statistic

Following Pesaran (2007), Holly, Pesaran and Yamagata (2010), and Banerjee and Carrion-i-Silvestre (2017), we design a cointegration statistic that approximates the unobserved common factors using cross-section averages of the observable variables. The simplicity of this approach comes at the cost of restricting the framework defined above since Pesaran (2007) and Pesaran, Smith and Yamagata (2013) assume that both the common factor and the idiosyncratic components are I(1) under the null hypothesis of unit root, that is, $m^0=m_1^0$. In what follows, this section focuses on the most general specification given by Model C2 using $\hat\lambda$, although the procedure is valid for the other model specifications and the $\tilde\lambda$ estimator.

Since $\hat\beta_{PCCE}(\hat\lambda)$ is a consistent estimate of $\beta$ for all models except A1, we define $\hat y_{i,t}=y_{i,t}-x_{i,t}'\hat\beta_{PCCE}(\hat\lambda)$ and base the testing of the null hypothesis of no cointegration for each unit on the cross-section augmented ADF (CADF) type regression equation:

(17) $\Delta\hat y_{i,t}=\sum_{j=0}^{\hat J}\theta_{i,j}DU_{j,t}+\sum_{j=1}^{\hat J}\gamma_{i,j}D(\hat T_j)_t+\sum_{j=0}^{\hat J}\vartheta_{i,j}DT_{j,t}+\alpha_{i,0}\hat y_{i,t-1}+\sum_{l=1}^{p_i}\alpha_{i,l}\Delta\hat y_{i,t-l}+\sum_{j=0}^{\hat J}\varphi_{i,j}'(DU_j\bar A)_{t-1}+\sum_{j=0}^{\hat J}\sum_{l=0}^{p_i}\kappa_{i,j,l}'(DU_j\Delta\bar A)_{t-l}+\nu_{i,t}$,

with $D(\hat T_j)_t=1$ for $t=\hat T_j+1$, 0 otherwise, $j=1,\ldots,\hat J$. Note that the order of augmentation $p_i$ in (17) is heterogeneous and can be selected using the modified information criteria in Ng and Perron (2001); a homogeneous $p_i$ as in Pesaran (2007) can be imposed, if desired. The pseudo t-ratio statistic of $\hat\alpha_{i,0}$ in (17), $t_{\hat\alpha_{i,0}}(\hat\lambda)$, is used to test the null hypothesis of no cointegration. The number of common factors that is assumed defines $\bar A_t$, so that $\bar A_t=\bar{\hat y}_t$ for $m^0=1$, whereas $\bar A_t=(\bar{\hat y}_t,\bar x_{1,t},\ldots,\bar x_{k,t})'$ for $m^0=1+k$. For the intermediate cases where $m^0<1+k$, $\bar A_t$ will be defined with $\bar{\hat y}_t$ and $m^0-1$ elements of $\bar x_t$. The following theorem provides the limiting distribution of $t_{\hat\alpha_{i,0}}(\hat\lambda)$ for both the known and unknown breaks cases.

Theorem 4.

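Let $Y_{i,t}$ be a vector of $(1+k)$ stochastic processes with DGP given by (1)–(3) and satisfying Assumptions 1–6. The $t_{\hat\alpha_{i,0}}(\lambda^{\circ})$ statistic, $\lambda^{\circ}\in\{\hat\lambda,\tilde\lambda\}$, converges sequentially (if $N\to\infty$ first and then $T\to\infty$, denoted $(N,T)_{seq}\to\infty$) and jointly (if $(N,T)\to\infty$ with $T/N\to0$, denoted $(N,T)_j\to\infty$) to:

$t_{\hat\alpha_{i,0}}(\lambda^{\circ})\Rightarrow\dfrac{\int_0^1W_i(r)\,dW_i(r)-\omega_{iF}^l(\lambda^0)'G_F^l(\lambda^0)^{-1}\pi_{iF}^l(\lambda^0)}{\left(\int_0^1W_i^2(r)\,dr-\pi_{iF}^l(\lambda^0)'G_F^l(\lambda^0)^{-1}\pi_{iF}^l(\lambda^0)\right)^{1/2}}\equiv R(\lambda^0,l)$,

where $W_i(r)$ denotes a scalar standard Brownian motion, $\omega_{iF}^l(\lambda^0)$, $G_F^l(\lambda^0)$ and $\pi_{iF}^l(\lambda^0)$ are functions of an $m^0$-vector of standard Brownian motions associated with the common factors that are defined in the appendix, with $l\in\{A1,A2,B1,B2,C1,C2\}$ for the known breaks case ($\hat\lambda=\tilde\lambda=\lambda^0$) and, for the unknown breaks case, $l\in\{B1,C1\}$ when $\hat\lambda$ is used and $l\in\{A2,B1,B2,C1,C2\}$ when $\tilde\lambda$ is applied.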

The proof is given in the appendix and some remarks are in order. First, the limiting distribution of $t_{\hat\alpha_{i,0}}(\hat\lambda)$ and $t_{\hat\alpha_{i,0}}(\tilde\lambda)$ depends on the number and position of structural breaks ($\lambda^0$) and on the number of I(1) nonstationary common factors; note that it is assumed that $m^0=m_1^0$. If analysts have knowledge about the number of common factors, for example based on economic theory, such information can be used when $\bar A_t$ is defined. In this regard, we could follow the strategy in Pesaran et al. (2013) and compute the statistic using all possible combinations of $m^0$ cross-section averages available in the system as a way of obtaining robust conclusions. When $m^0$ is unknown, we can follow a conservative strategy and assume that the rank condition is satisfied with equality. The price that we would pay if $m^0<1+k$, but we impose $m^0=1+k$, is a test statistic with empirical size smaller than the nominal size, accompanied by a loss of power. The advantage is that we remain agnostic about the number of integrated stochastic trends driving the data. Second, the appendix shows that the limiting distribution of $t_{\hat\alpha_{i,0}}(\lambda^0)$ for Models A and B is the same, for each variant of the deterministic component, whereas it is different for Model C.

Third, for the unknown breaks case and Models B1 and C1, $t_{\hat\alpha_{i,0}}(\hat\lambda)$ converges to the same limiting distribution as for the known structural breaks case. Unfortunately, this does not occur for Models A2, B2, and C2. In these cases, we suggest the implementation of the modified procedure in Kim and Perron (2009) that relies on the use of trimmed data. In brief, the idea is to establish a window of observations around a consistent estimate of $\lambda$, that is, $(\hat\lambda-\lambda^0)=O_p(T^{-a})$ for some $0<a<1$, of length $2\omega(T)$ with $\omega(T)=\kappa T^{\delta}$, $\kappa>0$ and $-1<-a<\delta<0$. The window defines the set of observations comprised between $\hat T_l+1\equiv\lfloor T(\hat\lambda-\omega(T))\rfloor+1$ and $\hat T_h\equiv\lfloor T(\hat\lambda+\omega(T))\rfloor$ and (i) increases slowly enough to be asymptotically negligible relative to $T$ and (ii) increases fast enough to include the true structural break dates, since

$\hat T_l-T_B^0=T(\hat\lambda-\omega(T))-T\lambda^0=\left(T^{-a-\delta}\,T^{a}(\hat\lambda-\lambda^0)-\kappa\right)T^{\delta+1}\to_p-\infty$

and

$T_B^0-\hat T_h=T\lambda^0-T(\hat\lambda+\omega(T))=\left(T^{-a-\delta}\,T^{a}(\lambda^0-\hat\lambda)-\kappa\right)T^{\delta+1}\to_p-\infty$.

Once $\hat T_l$ and $\hat T_h$ have been specified, Kim and Perron (2009) define a new dataset as:

$\hat y_{i,t}^n=\hat y_{i,\,t+\hat T_{j-1,h}-\hat T_{j-1,l}}-S(\hat y_{i,t},j-1)$ for $\hat T_{j-1}<t\le\hat T_j$,

with $S(\hat y_{i,t},j-1)=\hat y_{i,\hat T_{j-1,h}}-\hat y_{i,\hat T_{j-1,l}}$, $\hat T_{0,h}=\hat T_{0,l}=0$ and $S(\hat y_{i,t},0)=0$; the same transformation applies to obtain $\bar A_t^n$ with $S(\bar A_t,j-1)=\bar A_{\hat T_{j-1,h}}-\bar A_{\hat T_{j-1,l}}$. The new dataset removes the data points between $\hat T_l+1$ and $\hat T_h$ and reconnects the remaining data, shifting down the data after the window with the $S(\cdot,j-1)$ function. Then, the pseudo t-ratio statistic of $\hat\alpha_{i,0}$ in (17) is computed using the trimmed data $\hat y_{i,t}^n$ and $\bar A_t^n$ with the break dates given by $\hat T_l$. The resulting statistic is denoted as $t_{\hat\alpha_{i,0}}(\hat\lambda_{tr})$, whose limiting distribution is given in the following corollary.

Corollary 1.

Let $Y_{i,t}$ be a vector of $(1+k)$ stochastic processes with DGP given by (1)–(3) and satisfying Assumptions 1–6. For the unknown structural breaks case and as either $(N,T)_{seq}\to\infty$ or $(N,T)_j\to\infty$, $t_{\hat\alpha_{i,0}}(\hat\lambda_{tr})\Rightarrow R(\lambda^0,l)$, $l\in\{A2,B2,C2\}$.

The proof derives from the developments in Theorem 4. As can be seen, the limiting distribution of $t_{\hat\alpha_{i,0}}(\hat\lambda_{tr})$ equals the limiting distribution that is obtained for the known structural breaks case. Finally, it should be stressed that the limiting distributions in Theorem 4 and Corollary 1 also correspond to the distributions obtained for a panel unit root test with multiple structural breaks that affect the level and/or the slope of the time trend. Therefore, as side contributions, this article (i) extends the panel data unit root tests in Pesaran (2007) and Pesaran, Smith and Yamagata (2013) to the multiple structural breaks case, affecting the deterministic component and/or the factor loadings, and (ii) extends the unit root tests of Perron (1989, 1990) and Kim and Perron (2009) to the panel data framework.
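The trimming step borrowed from Kim and Perron (2009) amounts to cutting a window out of the series and splicing the two ends together. The Python sketch below covers the single-break case; the function names and the default values of `kappa` and `delta` are hypothetical choices for illustration (any `delta` must satisfy the restriction $-a<\delta<0$ stated in the text).

```python
import numpy as np

def window_bounds(T, lam_hat, kappa=0.2, delta=-0.25):
    """Return (T_l, T_h) with omega(T) = kappa * T**delta around lam_hat."""
    w = kappa * T ** delta
    T_l = int(np.floor(T * (lam_hat - w)))
    T_h = int(np.floor(T * (lam_hat + w)))
    return T_l, T_h

def trim_series(y, T_l, T_h):
    """Drop observations T_l + 1, ..., T_h (1-based) and reconnect the rest.

    Observations after the window are shifted by S = y_{T_h} - y_{T_l}, so the
    spliced series continues from the level reached at T_l.
    """
    y = np.asarray(y, dtype=float)
    shift = y[T_h - 1] - y[T_l - 1]
    return np.concatenate([y[:T_l], y[T_h:] - shift])

# Example: a linear series 1, ..., 10 with the window (3, 6] removed.
trimmed = trim_series(np.arange(1, 11), 3, 6)
bounds = window_bounds(200, 0.5)
```

The same function can be applied column by column to the cross-section averages so that both series are trimmed over an identical window.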

The limiting distributions in Theorem 4 are approximated by Monte Carlo simulation. A limited set of critical values for $t_{\hat{\alpha}_{i,0}}(\lambda^{0})$ is reported in Tables B.1 to B.6 for Models A1, B1, and C1, and in Tables B.7 to B.12 for Models A2, B2, and C2, with $J=1$ and $m^{0}=k+1\in\{2,3,4\}$ common factors. Pesaran (2007) and Pesaran, Smith and Yamagata (2013) also propose a truncated version of the $t_{\hat{\alpha}_{i,0}}(\lambda^{0})$ statistic to ensure that the statistic has finite moments, although these papers provide evidence that for $T>15$ the empirical distributions of the truncated and untruncated statistics are equivalent. Therefore, in this article we do not consider the truncated version of $t_{\hat{\alpha}_{i,0}}(\lambda^{0})$.

The combination of the individual test statistics defines the cross-section augmented ADF panel cointegration statistic $CIPS(\lambda^{\circ})=N^{-1}\sum_{i=1}^{N}t_{\hat{\alpha}_{i,0}}(\lambda^{\circ})$, $\lambda^{\circ}\in\{\hat{\lambda},\tilde{\lambda},\hat{\lambda}^{tr}\}$, as proposed in Pesaran (2007). This statistic allows us to test the null hypothesis of no panel data cointegration against the alternative that there is a fraction of panel units for which cointegration holds. The critical values for the $CIPS(\lambda^{0})$ statistic are presented in Tables B.13 to B.18 for Models A1, B1, and C1, and in Tables B.19 to B.24 for Models A2, B2, and C2, with $J=1$ and $m^{0}=k+1\in\{2,3,4\}$. Interestingly, the critical values seem to be symmetrical around $\lambda^{0}=0.5$ for large $T$, a characteristic that is more evident for the panel data cointegration statistic critical values. This feature has also been found for the ADF unit root test with one structural break proposed in Perron (1989, 1990) (see Note 7).
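The averaging step behind the CIPS statistic can be sketched in a few lines of Python. The function names, the made-up individual t-ratios and the critical value of -2.5 are placeholders chosen for this illustration; in practice the critical value would be taken from the tables in Appendix B for the relevant model, break fraction, $N$ and $T$. The decision rule assumes the usual left-tailed rejection region of ADF-type tests.

```python
import numpy as np

def cips(t_stats):
    """Cross-section average of the individual CADF-type t-ratios."""
    t_stats = np.asarray(t_stats, dtype=float)
    if t_stats.ndim != 1 or t_stats.size == 0:
        raise ValueError("t_stats must be a non-empty one-dimensional array")
    return float(t_stats.mean())

def rejects_null(cips_value, critical_value):
    """Left-tailed decision: reject when CIPS falls below the critical value."""
    return cips_value < critical_value

# Made-up individual statistics for N = 4 units (illustration only).
t_i = [-3.1, -2.4, -1.9, -2.8]
stat = cips(t_i)
print(stat, rejects_null(stat, critical_value=-2.5))  # -2.5 is a placeholder
```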

Westerlund, Hosseinkouchack and Solberger (2016) derive the local-to-unity asymptotic power functions of Pesaran (2007) type test statistics and show that these statistics have non-negligible power in a neighborhood of $T^{-1}$. This is a distinctive feature of these statistics when compared to other proposals in the literature, which have been shown to have non-negligible power in the neighborhood given by $N^{-\kappa}T^{-1}$, $\kappa>0$; see, for instance, Moon and Perron (2008). This implies that, as $N$ increases, Pesaran (2007) type test statistics will tend to be dominated by other panel data tests having non-negligible local power for $\kappa>0$.

5 Finite Sample Performance

Let us consider the DGP defined by:

$$
\begin{aligned}
y_{i,t} &= \theta_{i,1}DU_t+\theta_{i,2}DT_t+x_{i,t}\varsigma_0+DU_t x_{i,t}\varsigma_1+F_t\varphi^{y}_{i,0}+DU_t F_t\varphi^{y}_{i,1}+u_{i,t}\\
\Delta x_{i,t} &= \Delta F_t\varphi^{x}_{i,0}+\Delta(DU_t F_t)\varphi^{x}_{i,1}+\upsilon_{i,t};\\
F_{j,t} &= \rho F_{j,t-1}+w_{j,t};\qquad u_{i,t}=\phi_i u_{i,t-1}+\varepsilon_{i,t},
\end{aligned}
\tag{18}
$$

where $\theta_{i,1}\sim N(10,1)$, $\theta_{i,2}\sim N(0.3,1)$, $\varsigma_0=1$, $\varsigma_1\in\{1,5\}$, $\varphi^{y}_{i,0}\sim U[0,1]$, $\varphi^{y}_{i,1}\sim U[0,2]$, $\varphi^{x}_{i,0}\sim U[0,1]$, $\varphi^{x}_{i,1}\sim U[0,3]$, $\upsilon_{i,t}\sim N(0,1)$, $w_{j,t}\sim N(0,1)$, $j\in\{1,2\}$, and $\varepsilon_{i,t}\sim N(0,1)$ are mutually independent groups. Under the null hypothesis of no cointegration $\phi_i=1\ \forall i$, whereas under the alternative hypothesis of cointegration we set $\phi_i=0.9\ \forall i$. This section investigates the performance of the proposed statistics considering one unknown structural break; additional simulation results for known structural breaks are available upon request. We distinguish three different cases of interest. Case 1 specifies a DGP in which both the deterministic component and the cointegration vector change, but the loadings remain constant across regimes, that is, $\varsigma_1\in\{1,5\}$ and $\varphi^{y}_{i,1}=\varphi^{x}_{i,1}=0\ \forall i$ in (18). Case 2 uses the Case 1 DGP, but the computations are carried out allowing for a nonexistent change in factor loadings. This is done to proxy situations where the investigator may be uncertain about whether there are changes in the factor loadings and wishes to protect herself (in terms of size of the test) against this possibility. Finally, Case 3 deals with the pure structural break case where all parameters (deterministic component, cointegration vector and factor loadings) change across regimes. Note that for each of these cases we consider the two variants of the deterministic component, that is, the constant and the time trend cases. The break date is specified at $\lambda^{0}=0.5$ and we consider up to two common factors, $m^{0}\in\{1,2\}$, with $\rho\in\{1,0.99,0.95,0.9\}$. The time dimension is set at $T\in\{50,100,200\}$ and the cross-section dimension is $N\in\{20,50\}$. The nominal size is set at 5% and the critical values tabulated in the previous section are used.
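One way to draw a single panel from a DGP of the form (18) is sketched below in Python. Several details are assumptions made for this illustration rather than taken from the article: the dummies are defined as $DU_t=1(t>T_B)$ and $DT_t=(t-T_B)1(t>T_B)$ with $T_B=\lfloor\lambda^{0}T\rfloor$, pre-sample values are set to zero, the $\Delta x_{i,t}$ equation is integrated directly, and the function and argument names are hypothetical.

```python
import numpy as np

def simulate_dgp(N=20, T=100, m0=1, rho=1.0, phi=1.0, varsigma1=1.0,
                 lam0=0.5, trend=False, loading_break=False, seed=0):
    """Draw one (y, x) panel, each of shape (T, N), from a DGP like (18)."""
    rng = np.random.default_rng(seed)
    t = np.arange(1, T + 1)
    TB = int(lam0 * T)                         # assumed break date
    DU = (t > TB).astype(float)                # assumed level-shift dummy
    DT = np.where(t > TB, t - TB, 0.0)         # assumed slope-shift dummy

    def ar1(coef, shocks):
        # z_t = coef * z_{t-1} + shock_t with zero pre-sample value
        out = np.zeros_like(shocks)
        out[0] = shocks[0]
        for s in range(1, shocks.shape[0]):
            out[s] = coef * out[s - 1] + shocks[s]
        return out

    F = ar1(rho, rng.standard_normal((T, m0)))   # common factors
    u = ar1(phi, rng.standard_normal((T, N)))    # idiosyncratic errors

    theta1 = rng.normal(10.0, 1.0, N)
    theta2 = rng.normal(0.3, 1.0, N) if trend else np.zeros(N)
    phi_y0 = rng.uniform(0.0, 1.0, (m0, N))
    phi_x0 = rng.uniform(0.0, 1.0, (m0, N))
    if loading_break:                            # Case 3: loadings change
        phi_y1 = rng.uniform(0.0, 2.0, (m0, N))
        phi_x1 = rng.uniform(0.0, 3.0, (m0, N))
    else:                                        # Case 1: constant loadings
        phi_y1 = np.zeros((m0, N))
        phi_x1 = np.zeros((m0, N))

    DUF = DU[:, None] * F
    # Integrating the Delta-x equation with zero initial conditions
    x = F @ phi_x0 + DUF @ phi_x1 + np.cumsum(rng.standard_normal((T, N)), axis=0)
    varsigma0 = 1.0
    y = (DU[:, None] * theta1 + DT[:, None] * theta2
         + x * varsigma0 + DU[:, None] * x * varsigma1
         + F @ phi_y0 + DUF @ phi_y1 + u)
    return y, x

# Usage: Case 1, constant case, under the null (phi = rho = 1).
y, x = simulate_dgp(N=20, T=50, m0=1, rho=1.0, phi=1.0, varsigma1=1.0)
print(y.shape, x.shape)
```

Setting `phi=0.9` gives a draw under the cointegration alternative, and `trend=True` switches to the time trend case.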

5.1 Constant Case

5.1.1 Performance of the Break Fraction and Pooled Estimators

This section investigates the performance of $\hat{\lambda}$ and $\tilde{\lambda}$ for Models B and C with the deterministic component given by the constant case, that is, $\theta_{i,2}=0\ \forall i$ in (18). Let us first focus on the $\hat{\lambda}$ estimator. Figures C.1–C.4 present histograms of $\hat{\lambda}$ for Model B1 (Case 1) when $\varsigma_1=1$ for all combinations in which the true number of common factors is $m^{0}\in\{1,2\}$ and the imposed number of common factors is $m\in\{1,2\}$. It is worth stressing that consistency of the $\hat{\lambda}$ and $\tilde{\lambda}$ estimators has been established under the null hypothesis of spurious regression, but for completeness we also include the histograms for these estimators under cointegration.

In general, and regardless of the order of integration of the common factors, the histograms are concentrated around $\lambda^{0}$ with higher probability as either $T$ or $N$ increases. It is interesting to note that allowing for $m>m^{0}$ does not affect the performance of $\hat{\lambda}$, whereas the probability mass of $\hat{\lambda}$ around $\lambda^{0}$ increases with $m^{0}$. As expected, as the magnitude of the structural change that affects the slope parameters increases, the performance of $\hat{\lambda}$ improves substantially, with almost all probability mass located on $\lambda^{0}$, regardless of $m^{0}$, $m$ and the order of integration of the idiosyncratic and the common factor components; see Figures C.5–C.8 with the histograms of $\hat{\lambda}$ for Model B1 when $\varsigma_1=5$. These features are also found when the results for Cases 2 and 3 are analyzed (see Figures C.9–C.24). The use of the $\tilde{\lambda}$ estimator produces better results since now, even for $\varsigma_1=1$, the probability mass of $\tilde{\lambda}$ around $\lambda^{0}$ is higher than that of $\hat{\lambda}$; see Figures C.25–C.48, which depict the histograms of $\tilde{\lambda}$ for Cases 1 to 3, $m\in\{1,2\}$, $m^{0}\in\{1,2\}$ and $\varsigma_1\in\{1,5\}$. This simulation evidence illustrates the statements made in Theorems 1 and 2.

Finally, we have also analyzed the performance of $\hat{\beta}_{PCCE}(\hat{\lambda})$ and $\hat{\beta}_{PCCE}(\tilde{\lambda})$ and found evidence that their application leads to a consistent estimation of $\beta$. To be specific, the mean squared error (MSE) decreases as $N$ and/or $T$ increase, and as $\rho$ moves away from one; detailed results are available upon request.

5.1.2 Empirical Size and Power of the Panel Cointegration Statistic

Table C.1 summarizes the performance of the $CIPS(\hat{\lambda})$ statistic. When the magnitude of the structural change affecting the slope is $\varsigma_1=1$, we can observe size distortion problems for the panel cointegration statistic for Case 1 when the order of integration of both the idiosyncratic and the common factor components is the same, regardless of the (true and assumed) number of common factors. These distortions decrease as the model specification becomes more flexible (Cases 2 and 3) and even disappear in some situations; see the results for $m^{0}=m=1$. As the magnitude of the structural break increases to $\varsigma_1=5$, the empirical size tends to the nominal one in all cases. It is worth highlighting that the misspecification error of either under-specifying ($0<m<m^{0}$) or over-specifying ($m^{0}<m$) the number of common factors does not cause high size distortion problems. The panel cointegration $CIPS(\hat{\lambda})$ statistic becomes conservative under the null hypothesis of spurious regression if the common factor is I(0). This is a consequence of the violation of the key assumption underlying the design of the statistic (see Section 4). The empirical power of $CIPS(\hat{\lambda})$ increases with $N$ and $T$, and as $\rho$ moves away from one. As expected, the more flexible model specifications, either because the type of model is more general (Case 2) or because the number of common factors is over-specified ($m^{0}<m$), show lower empirical power values.

The performance of the statistical inference improves when $\tilde{\lambda}$ is used. Table C.2 shows a clear improvement of the empirical size figures when $\varsigma_1=1$; the exception is found for Case 3 with $m^{0}=m=2$ and $T=200$, although the size distortion almost disappears for $\varsigma_1=5$. However, and contrary to what has been found for $CIPS(\hat{\lambda})$, the under-specification of the number of common factors causes mild over-size distortions for Model C (see the results for Case 3). This result is to be expected, since we are missing the presence of common factors in the model specification. As above, $CIPS(\tilde{\lambda})$ becomes conservative if the assumption of common order of integration of the idiosyncratic and common components under the null hypothesis of spurious regression is not satisfied. The $CIPS(\tilde{\lambda})$ statistic encompasses $CIPS(\hat{\lambda})$ in terms of empirical power, since the former statistic shows similar, if not higher, empirical power values; note that the higher power shown by $CIPS(\hat{\lambda})$ when $\varsigma_1=1$ is due to the mild size distortions that have been documented.

Table 2 Parameter estimates and CIPS panel cointegration test statistic.

To the best of our knowledge, there are no other proposals in the literature that can be used to establish a direct comparison with the test statistics that are proposed in this article. The closest proposal might be the PANIC-based panel cointegration test statistic developed by Banerjee and Carrion-i-Silvestre (2015). That paper designed a panel data statistic to test the joint null hypothesis of spurious regression without structural breaks against the alternative hypothesis of panel cointegration with structural breaks for large $T$ compared to $N$. As mentioned in the introduction, the cross-section dependence was introduced in the specification using an approximate common factor model. Tables C.3 and C.4 present the empirical size and power for the panel ADF statistic that is computed for the estimated idiosyncratic component ($Z_c$) and the $MQ_c^c$ statistic to estimate the number of I(0) and I(1) common stochastic trends. The simulation results are reported for Cases 1 and 3, considering $m^{0}\in\{1,2\}$ and allowing for up to $m_{max}=6$ common factors; the number of common factors is estimated using the panel BIC statistic in Bai and Ng (2004). The $Z_c$ statistic shows important size distortions, which grow with the value of $\varsigma_1$. These distortions can in fact be interpreted in terms of empirical power, since they are a mere consequence of the violation of the DGP that is assumed under the null hypothesis in that proposal. Thus, note that the DGP defined in (18) allows for the presence of structural breaks under the null hypothesis, so that the joint hypothesis of spurious regression without structural breaks that is assumed by Banerjee and Carrion-i-Silvestre (2015) is not fulfilled. The behavior of the $MQ_c^c$ statistic is also affected by this feature, although it is fair to note that the $MQ_c^c$ statistic tries to capture that misspecification error by detecting more common factors than really exist; in general, it can be shown that the estimated number of common factors is double the true number.

Finally, and following the suggestion in Westerlund (2018), we have also studied the performance of the CIPS panel cointegration test statistic when the structural breaks are ignored. Table C.5 summarizes the empirical size and power of the CIPS proposed in Banerjee and Carrion-i-Silvestre (2017) for Cases 1 and 3, and $\varsigma_1\in\{1,5\}$. As can be seen, the CIPS presents size distortions that grow with the value of $\varsigma_1$. For $\varsigma_1=1$ and $m^{0}=1$, the size distortions can be reduced if we over-specify the number of common factors, although this desirable feature disappears as $\varsigma_1$ increases. The violation of the assumption of common order of integration of the idiosyncratic and common components under the null hypothesis of spurious regression leads to a conservative test statistic, but only when $\varsigma_1=1$ and $m>m^{0}$. For the other cases, the statistic shows over-rejection distortions. These results are in line with the literature that shows that the null hypothesis of panel unit root is rejected when cross-section dependence is ignored; see Banerjee, Marcellino and Osbat (2005). Here the cross-section dependence is considered, but to some extent the omission of common structural breaks might leave uncaptured one source of cross-section dependence that links the panel units. This shows that modelling the presence of structural breaks is required if misleading statistical inference is to be avoided.

5.1.3 Estimation of the Number of Structural Breaks

Table C.10 provides the frequency of the estimated number of structural breaks that derives from the use of the panel BIC statistic, considering up to two structural breaks. The ability of the panel BIC to select the correct number of structural breaks improves as $N$, $T$ and $\varsigma_1$ increase, and the statistic based on $\tilde{\lambda}$ produces the better results.

Let us first focus on the results for $\varsigma_1=1$. Under the spurious regression case with $\rho=1$, $J(\hat{\lambda})$ tends to under-estimate $J$ for small $N$ and $T$, with frequencies of correct detection of the number of structural breaks that range between 0.26 and 0.48 for Model B1, and between 0.48 and 0.6 for Model C1. As can be seen, $J(\tilde{\lambda})$ outperforms $J(\hat{\lambda})$, with correct frequencies in the range 0.52–0.86 for Model B1, and 0.57–0.7 for Model C1. Further, note that part of the under-estimation frequency of $J(\hat{\lambda})$ moves to over-estimation as $N$ increases for a given $T$; this feature is hardly observed for $J(\tilde{\lambda})$. When $\rho=0.9$, $J(\hat{\lambda})$ still shows a tendency to under-estimate $J$ for Model B1, although this behavior is attenuated for Model C1. As for the $J(\tilde{\lambda})$ estimator, the probability of correct detection remains unchanged as $N$ increases, regardless of $T$, with a tendency to reallocate part of the under-estimation frequency to the over-estimation one as $T$ increases. Under the cointegration scenario, the behavior of both the $J(\tilde{\lambda})$ and $J(\hat{\lambda})$ statistics improves, with correct $J$ estimation frequencies that tend to one as $T$ increases, and with $J(\tilde{\lambda})$ outperforming $J(\hat{\lambda})$ in all cases. Finally, when the magnitude of the structural break increases to $\varsigma_1=5$, both estimators provide good results.

Table C.12 summarizes the performance of the panel BIC when there are no structural breaks ($J^{0}=0$) affecting the parameters of the model. The frequency of correct selection of the number of breaks is similar for the $J(\tilde{\lambda})$ and $J(\hat{\lambda})$ statistics, and higher than the one obtained for the one structural break case. In general, higher correct selection frequencies are obtained for Model C1. This result indicates that the proposed panel BIC is useful to detect the presence of structural breaks in our model set-up.

5.2 Time Trend Case

5.2.1 Performance of the Break Fraction and Pooled Estimators

This section investigates the performance of $\hat{\lambda}$ and $\tilde{\lambda}$ for Models B and C with the deterministic component given by the time trend case, that is, $\theta_{i,1}\neq\theta_{i,2}\neq 0\ \forall i$ in (18). Figures C.49–C.52 depict the histograms of $\hat{\lambda}$ for Model B2 (Case 1) when $\varsigma_1=1$ for all combinations in which the true number of common factors is $m^{0}\in\{1,2\}$ and the imposed number of common factors is $m\in\{1,2\}$. In general, we can observe that the probability mass concentrates around $\lambda^{0}$ as $T$ and/or $N$ increase for a given set of $\phi_i$ and $\rho$ parameters. The probability around $\lambda^{0}$ is slightly reduced for Cases 2 and 3, especially for small $T$, although it tends toward one as $T$ and $N$ get large (see Figures C.53–C.60). Note that when this estimation bias appears, the estimates are located around $\lambda^{0}$. As for the constant case, the use of $\tilde{\lambda}$ produces better results, since the probability mass of $\tilde{\lambda}$ around $\lambda^{0}$ is, in general, similar to or larger than that of $\hat{\lambda}$ (see Figures C.61–C.72).

The simulation experiment has also studied the performance of $\hat{\beta}_{PCCE}(\hat{\lambda})$ and has found that the MSE decreases as $N$ and/or $T$ increase, and as $\rho$ moves away from one; detailed results are available upon request. All these results reinforce the theoretical analysis that has been detailed in Theorem 1.

5.2.2 Empirical Size and Power of the Panel Cointegration Statistic

Table C.6 collects the empirical size and power of $CIPS(\hat{\lambda}^{tr})$. The statistic shows good performance, with an empirical size that is close to 0.05 when $\phi_i=\rho=1$. Under-specification of the number of common factors does not seem to cause major size distortions for Case 1, although it does for Case 3 and, to a lesser extent, for Case 2, for large $N$ and $T$. As for the constant case, the statistic becomes conservative if the assumption of common order of integration of the idiosyncratic and common factor components is not met. The empirical power increases as $T$ and $N$ increase, and decreases as the model specification gets more complicated; this is to be expected given the higher number of parameters to be estimated. For completeness, we have also included the results that are based on the badly-behaved $CIPS(\hat{\lambda})$ statistic, which behaves similarly to $CIPS(\hat{\lambda}^{tr})$. This result might be surprising, but we need to take into account that the term that invalidates the use of $CIPS(\hat{\lambda})$ in the limit also depends on the magnitude of the change in the slope of the time trend, so that small values of $\|\theta_{i,2}\|$ might have little effect on the limiting distribution.

We have also analyzed the performance of the statistics under local-to-zero breaks through the definition of shrinking break magnitudes for all parameters related to the structural instability. To be specific, $\theta_{i,1}$, $\theta_{i,2}$, $\varsigma_1$, $\varphi^{y}_{i,1}$, and $\varphi^{x}_{i,1}$ defined above have been rescaled by $T^{-1/2-\delta}$, $\delta>0$ (we have used an arbitrarily small positive value of $\delta=0.01$), so that in the limit the effects of the structural break tend to zero (see Note 8). Table C.8 indicates that the performance of the $CIPS(\hat{\lambda}^{tr})$ statistic under shrinking breaks is similar to the fixed breaks one: the empirical size is close to the nominal one, whereas in some cases a mild drop in the empirical power is observed, although the empirical power equals one as $T$ gets large.

The finite sample performance of $CIPS(\tilde{\lambda})$ is investigated in Table C.7, which reveals a test statistic with an empirical size that is close to 0.05 in most cases, with the exception of a mild over-rejection distortion found for Case 3 when $m^{0}=m=2$ with large $T$. The values for the empirical power of $CIPS(\tilde{\lambda})$ are higher than the $CIPS(\hat{\lambda}^{tr})$ ones, and similar to the $CIPS(\hat{\lambda})$ ones, for a given set of $\rho$, $T$, and $N$ values. As for the shrinking breaks configuration, we observe that the empirical size of $CIPS(\tilde{\lambda})$ is close to 0.05, although the empirical power is reduced when compared to the results that are based on fixed break magnitudes. To some extent, this is to be expected since in this case, although $\tilde{\lambda}$ is consistent, the rate at which $\tilde{\lambda}$ tends to $\lambda^{0}$ is reduced; it has been shown that $(\tilde{\lambda}-\lambda^{0})=o_p(N^{-1/3}T^{-1})$ for the fixed breaks case. From this point of view, it seems that $CIPS(\hat{\lambda}^{tr})$ outperforms $CIPS(\tilde{\lambda})$, although the empirical power equals one as $T$ gets large.

Contrary to what has been done for the constant case, here we cannot compare the performance of the statistic with other existing proposals in the literature, since the case of panel cointegration testing with unknown structural changes that affect the slope of the time trend has not been previously addressed; Banerjee and Carrion-i-Silvestre (2015) only deal with the unknown break dates situation for the constant case. However, we have investigated the properties of the statistical inference when the structural breaks are ignored by computing the panel cointegration statistic in Banerjee and Carrion-i-Silvestre (2017), whose results are summarized in Table C.9. Unfortunately, in this case the empirical size is not controlled since, for instance, for Case 1 with $m^{0}=m=1$, the empirical size tends toward one as $T$ and $N$ increase. The size distortion is less noticeable for Case 3 with $m^{0}=m=1$, although it is still important in some cases; see the results for $m^{0}=m=2$ and $T=200$. In addition, the violation of the assumption of common order of integration of the idiosyncratic and common components under the null hypothesis of spurious regression does not seem to lead to a conservative test statistic. Finally, even in those cases where the size distortions are not very large (that is, Case 1 with $m^{0}=m=1$, $N=20$ and $T=50$), the empirical power of the statistic is only slightly above the empirical size, that is, 0.07 versus 0.09. Again, this suggests that empirical analyses should include parameter instabilities in the model specification if there is evidence that structural changes might have affected the model.

5.2.3 Estimation of the Number of Structural Breaks

Table C.11 provides the frequency of the estimated number of structural breaks that derives from the use of the panel BIC statistic, considering up to two structural breaks. The ability of the panel BIC to select the correct number of structural breaks is very good, with a correct detection frequency that ranges between 0.92 and 1 regardless of $N$, $T$, $\phi_i$, and $\rho$, for both the $\hat{\lambda}$ and $\tilde{\lambda}$ based estimators and for all model specifications. Table C.12 presents the frequency of the estimated number of structural breaks for model specifications that do not include structural breaks ($J^{0}=0$). For Model B2 the frequency of correct classification lies in the ranges [0.73, 0.92] for $J(\hat{\lambda})$ and [0.69, 0.92] for $J(\tilde{\lambda})$ under the null hypothesis of spurious regression. For Model C2 the classification ranges narrow to [0.93, 0.96] for $J(\hat{\lambda})$ and [0.90, 0.97] for $J(\tilde{\lambda})$ under the null hypothesis of spurious regression. Under the alternative hypothesis of cointegration, the frequency of correct estimation of the number of structural breaks equals one for both the $J(\hat{\lambda})$ and $J(\tilde{\lambda})$ statistics. This leads us to suggest the use of the panel BIC statistic in empirical applications.

6 Empirical Illustration

A well-functioning housing market has been shown to be very relevant for the proper evolution of credit and financial markets, which in turn has effects on macroeconomic variables such as output, fiscal deficit and unemployment. The empirical evidence on the potential relationship between housing prices and real disposable income per capita is mixed, and depends both on the scope (national or regional) and on the period of analysis. Following Holly, Pesaran and Yamagata (2010), we focus on the US economy considering the 48 contiguous U.S. States and the District of Columbia ($N=49$) using annual data from 1975 to 2019 ($T=45$), with the model:

$$hp_{i,t}=\alpha_i+\beta y_{i,t}+u_{i,t},\tag{19}$$

where $hp_{i,t}$ denotes the logarithm of the real housing prices index and $y_{i,t}$ is the logarithm of the real disposable income per capita; see Holly, Pesaran and Yamagata (2010) for the sources of the statistical information. Holly, Pesaran and Yamagata (2010) argue that the use of panel data cointegration analysis can provide better statistical inference given the short time period of the available information. The preliminary analysis conducted in their paper reveals that cross-section dependence is present in the dataset.

As discussed in Holly, Pesaran and Yamagata (2010), the house price boom started in early 2000 in the United States and accelerated during 2003–2006, something that has been interpreted in the literature as a housing price bubble. Figures C.73 and C.74 depict the variables, which show hump-shaped behavior in some of the house price time series during the mid-eighties and in the middle of the first decade of 2000. If this is the case, it is safe to assume that the potential long-run relationship between housing prices and disposable income per capita might have been affected. This in turn implies that parameter instabilities should be accounted for in the model. The empirical specification that is used here generalizes the one in (19) by considering the effects of structural breaks as follows:

$$hp_{i,t}=\mu_{i,j,1}+\beta_j y_{i,t}+\bar{Y}_t\delta^{y}_{i,j}+\upsilon_{i,t},\qquad T_{j-1}<t\le T_j,\tag{20}$$

with $\bar{Y}_t=(\overline{hp}_t,\bar{y}_t)$, $\overline{hp}_t=N^{-1}\sum_{i=1}^{N}hp_{i,t}$, $\bar{y}_t=N^{-1}\sum_{i=1}^{N}y_{i,t}$ and $j=1,\ldots,J$.
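The two ingredients that specification (20) adds to (19), the cross-section averages $\bar{Y}_t$ and the regime indicator implied by the break dates, can be sketched in Python as follows. The function names and the toy data are hypothetical and serve only to illustrate the construction; the break dates are treated as given, so this is not the estimation procedure of Section 3.

```python
import numpy as np

def cross_section_averages(hp, y):
    """Return the T x 2 matrix (hp_bar_t, y_bar_t) from two T x N panels."""
    hp = np.asarray(hp, dtype=float)
    y = np.asarray(y, dtype=float)
    if hp.shape != y.shape or hp.ndim != 2:
        raise ValueError("hp and y must be T x N arrays of the same shape")
    return np.column_stack([hp.mean(axis=1), y.mean(axis=1)])

def regime_index(T, break_dates):
    """Regime j (1-based) of each date t = 1, ..., T, with T_{j-1} < t <= T_j."""
    t = np.arange(1, T + 1)
    return np.searchsorted(np.sort(break_dates), t, side="left") + 1

# Toy panel with T = 5 dates and N = 2 units (made-up numbers).
hp = np.array([[1.0, 3.0], [2.0, 4.0], [3.0, 5.0], [4.0, 6.0], [5.0, 7.0]])
y = hp / 2.0
Y_bar = cross_section_averages(hp, y)
regimes = regime_index(T=5, break_dates=[3])
print(Y_bar[:, 0], regimes)
```

With one break at date 3, dates 1 to 3 fall in regime 1 and dates 4 and 5 in regime 2, which matches the convention $T_{j-1}<t\le T_j$.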

Table 2 summarizes the results that are based on both break date estimation procedures discussed in Section 3, although it has to be borne in mind that $\tilde{T}_B$ clearly outperforms $\hat{T}_B$, so that more weight should be placed on the conclusions drawn from the former estimator. The order of augmentation $p_i$ in (17) is selected using the modified Akaike information criterion (MAIC) in Ng and Perron (2001) and Perron and Qu (2007) for each panel unit with up to five lags, and the trimming is set at $\epsilon=0.2$. Let us first focus on the specification that considers one structural break. As can be seen, there is strong evidence of panel cointegration for Model B1, since the null hypothesis of spurious regression is rejected at the 5% significance level regardless of $m$. Evidence of cointegration is also found for Model C1, although only when $m=2$. This situation illustrates the potential effect on the empirical power of CIPS when $m<m^{0}$, which can be reduced if a nonstationary common factor is not accounted for. Although the estimated break date ($\hat{T}_B=1984$ and $\tilde{T}_B=1985$ for Model B1, and $\hat{T}_B=1991$ and $\tilde{T}_B=1986$ for Model C1) is not located around the 2003–2006 housing price bubble period mentioned above, it does reflect important changes experienced by the U.S. real estate market, that is, the deregulation of the mortgage market, the development of a secondary mortgage market and the increasing role of the government-sponsored enterprises that affected the U.S. economy during the first half of the eighties; see Gerardi, Harvey and Willen (2010) and Ahamada and Diaz-Sanchez (2013). It is interesting to note that all estimated elasticities are below one, with the major change observed for Model C1 with $\tilde{T}_B$.

The conclusions that are obtained for the two structural breaks case depend on the break estimation procedure. Evidence of panel cointegration is quite weak when using $\hat{T}_B$ (the null hypothesis of spurious regression is only rejected for Model C1 with $m=1$) and is stronger when using $\tilde{T}_B$ (panel cointegration is found except for Model C1 with $m=2$). It is worth noting that for the latter, the first estimated break date detects the policy changes experienced by the U.S. housing market during the eighties, whereas the second break date is close to the global financial crisis (Model B1) or to the housing prices bubble discussed above (Model C1). As for the parameter estimates, income elasticity shows a mild (large) increase from the first to the second regime for Model B1 (Model C1), followed by a decrease from the second to the third regime for both models.

Finally, we have also reported the value of the panel BIC statistic that has been proposed in the article as a way to estimate the number of structural breaks. The model specification that minimizes the panel BIC is the one given by Model C1 with $J=1$ and $\tilde{T}_B$. This result reinforces the overall discussion about the existence of a long-run relationship between the logarithm of the real housing prices index and the logarithm of the real disposable income per capita for the U.S. States, a conclusion that is robust to the accommodation of one unknown structural break.

7 Conclusions

The article has shown that a consistent estimate of the long-run average coefficient can be obtained when cross-section dependence is present among the panel data units. The type of cross-section dependence that is considered in the article is strong, and is accounted for using an approximate common factor model. The model specification is quite flexible and allows for multiple structural breaks that can affect the deterministic component, the cointegrating vector and/or the loadings of the common factors. The estimation procedure that is applied is based on the CCE approach in Pesaran (2006), which approximates the unobserved common factors using cross-section averages of the observable variables of the model. Our result contributes to the literature of nonstationary panel data analysis, where consistent estimation of the parameters of the model is feasible in a spurious regression framework. The article conducts an extensive simulation exercise to study the finite sample performance of the estimators and test statistics that have been proposed.

The application of the procedures that are designed in the article is illustrated with a model that defines a potential relationship between housing prices and real disposable income per capita. The analysis builds upon the use of U.S. regional data that covers a long time period. The main conclusion is that robust evidence on the existence of a long-run relationship between housing prices and real disposable income per capita can be obtained once structural breaks that capture relevant events for the U.S. housing market are allowed for in the model specification.

Supplementary Materials

Appendix A. Mathematical appendix: Collects the proofs of the theorems.

Appendix B. Tables of critical values: Presents the tables of critical values for the individual and panel data cointegration test statistics, considering one structural break.

Appendix C. Monte Carlo experiment and empirical illustration: Pooled CCE estimator, break fraction histograms and empirical size and power of the panel cointegration statistics. Plots of the variables used in the empirical application.

Disclosure Statement

No potential conflict of interest was reported by the author(s).

Additional information

Funding

Carrion-i-Silvestre acknowledges the financial support from the grant PID2020-114646RB-C41 funded by MCIN/AEI/10.13039/501100011033.

Notes

1 A GAUSS computer code is available upon request to estimate the break dates in panel data with up to two structural breaks, although the Bai and Perron (1998) efficient estimation algorithm, which obtains global minimizers of the SSR with a number of operations of order $O(T^{2})$ for any $J^{0}\ge 2$, can also be adapted to our panel data framework. In this regard, Ditzen, Karavias and Westerlund (2023) extend the Bai and Perron (1998) methodology to panel data with the development of a toolbox for Stata software using the Bai and Perron (1998) efficient estimation algorithm adapted to panel data.

2 This situation is equivalent to the one found in time series analysis, see Perron (1990).

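3 For Models A and B the factor loadings are assumed to be constant across regimes, so that the sixth and seventh elements of the right hand side of (17) are not affected by the structural breaks; that is, $\sum_{j=0}^{\hat{J}}\varphi_{i,j}(DU_j\odot\bar{A})_{t-1}$ has to be replaced by $\varphi_i\bar{A}_{t-1}$ and $\sum_{j=0}^{\hat{J}}\sum_{l=0}^{p}\kappa_{i,j,l}(DU_j\odot\Delta\bar{A})_{t-l}$ by $\sum_{l=0}^{p}\kappa_{i,l}\Delta\bar{A}_{t-l}$.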

4 We abuse notation when using the Hadamard product in the expressions $DU_j\odot\bar{A}$ and $DU_j\odot\Delta\bar{A}$ that appear in (17) since, in general, the matrices involved might have different column dimensions. In this case, it should be understood that for any given matrices $A_{m\times n}$ and $B_{m\times o}$, $A\odot B=[(A_1\odot B_1)\cdots(A_1\odot B_o)\ \cdots\ (A_n\odot B_1)\cdots(A_n\odot B_o)]_{m\times no}$. This convention applies throughout the article.

5 An alternative strategy in the case of unknown number of factors, as followed by Pesaran et al. (2013), is to undertake the testing for all permissible values of $m^{0}$ (using all combinations of $m^{0}$ cross-section averages for each choice of $m^{0}$). The size properties of such a procedure are not clear, nor are the likely conclusions if one accepts the null hypothesis for some values of $m^{0}$ and rejects for others. This is a topic for future research.

6 For the empirical implementation of this procedure we can follow Kim and Perron (2009) and use a trimming window of 6 observations.

7 A GAUSS program, available upon request, allows the computation of critical values for the multiple structural breaks case for both the individual and panel data cointegration statistics.

8 The specification of the shrinking factor $T^{-1/2-\delta}$, $\delta>0$, is motivated by the fact that $(\hat{\lambda}-\lambda^{0})=O_p(T^{-1/2})$ for Models A2, B2, and C2.
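The column-wise product convention described in Note 4 can be illustrated with a short Python sketch. It assumes that $A_j$ and $B_k$ denote the $j$-th and $k$-th columns of $A$ and $B$; the function name `columnwise_hadamard` is hypothetical.

```python
import numpy as np

def columnwise_hadamard(A, B):
    """Return [(A_1*B_1) ... (A_1*B_o) ... (A_n*B_1) ... (A_n*B_o)], an
    m x (n*o) matrix of element-wise products of columns of A and B."""
    A = np.asarray(A, dtype=float)
    B = np.asarray(B, dtype=float)
    if A.ndim != 2 or B.ndim != 2 or A.shape[0] != B.shape[0]:
        raise ValueError("A and B must be matrices with the same number of rows")
    m, n = A.shape
    o = B.shape[1]
    # entry (i, j, k) holds A[i, j] * B[i, k]; flattening (j, k) in row-major
    # order puts the column for (A_j, B_k) at position j * o + k
    return (A[:, :, None] * B[:, None, :]).reshape(m, n * o)

# Usage: a step dummy (m x 1) times a matrix of cross-section averages (m x 2).
DU = np.array([[0.0], [0.0], [1.0], [1.0]])
A_bar = np.array([[1.0, 5.0], [2.0, 6.0], [3.0, 7.0], [4.0, 8.0]])
print(columnwise_hadamard(DU, A_bar))
```

When the first argument has a single column, as with a step dummy, the result simply switches the columns of the second matrix on and off row by row.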

References

  • Ahamada, I., and Diaz-Sanchez, J. L. (2013), “A Retrospective Analysis of the House Price Macro-Relationship in the United States,” International Journal of Central Banking, December, 153–174.
  • Bai, J. (2010), “Common Breaks in Means and Variances for Panel Data,” Journal of Econometrics, 157, 78–92. DOI: 10.1016/j.jeconom.2009.10.020.
  • Bai, J., and Perron, P. (1998), “Estimating and Testing Linear Models with Multiple Structural Changes,” Econometrica, 66, 47–78. DOI: 10.2307/2998540.
  • Bai, J., Kao, C., and Ng, S. (2009), “Panel Cointegration with Global Stochastic Trends,” Journal of Econometrics, 149, 82–99. DOI: 10.1016/j.jeconom.2008.10.012.
  • Bai, J., and Ng, S. (2002), “Determining the Number of Factors in Approximate Factor Models,” Econometrica, 70, 191–221. DOI: 10.1111/1468-0262.00273.
  • ——- (2004), “A PANIC Attack on Unit Roots and Cointegration,” Econometrica, 72, 1127–1177.
  • Baltagi, B. H., Feng, Q., and Kao, C. (2016), “Estimation of Heterogeneous Panels with Structural Breaks,” Journal of Econometrics, 191, 176–195. DOI: 10.1016/j.jeconom.2015.03.048.
  • ——- (2019), “Structural Changes in Heterogeneous Panels with Endogenous Regressors,” Journal of Applied Econometrics, 34, 883–892.
  • Banerjee, A., and Carrion-i-Silvestre, J. L. (2015), “Cointegration in Panel Data with Breaks and Cross-Section Dependence,” Journal of Applied Econometrics, 30, 1–23. DOI: 10.1002/jae.2348.
  • ——- (2017), “Testing for Panel Cointegration Using Common Correlated Effects Estimators,” Journal of Time Series Analysis, 38, 610–636.
  • Banerjee, A., Marcellino, M., and Osbat, C. (2005), “Testing for PPP: Should We Use Panel Methods?” Empirical Economics, 30, 77–91. DOI: 10.1007/s00181-004-0222-8.
  • Ditzen, J., Karavias, Y., and Westerlund, J. (2023), “Multiple Structural Breaks in Interactive Effects Panel Data and the Impact of Quantitative Easing on Bank Lending,” working paper arXiv:2211.06707.
  • Gengenbach, C., Urbain, J. P., and Westerlund, J. (2016), “Error Correction Testing in Panels with Common Stochastic Trends,” Journal of Applied Econometrics, 31, 982–1004. DOI: 10.1002/jae.2475.
  • Gerardi, K. S., Rosen, H. S., and Willen, P. S. (2010), “The Impact of Deregulation and Financial Innovation on Consumers: The Case of the Mortgage Market,” The Journal of Finance, 65, 333–360. DOI: 10.1111/j.1540-6261.2009.01531.x.
  • Holly, S., Pesaran, M. H., and Yamagata, T. (2010), “A Spatio-Temporal Model of House Prices in the USA,” Journal of Econometrics, 158, 160–173. DOI: 10.1016/j.jeconom.2010.03.040.
  • Kim, D. (2011), “Estimating a Common Deterministic Time Trend Break in Large Panels with Cross-Sectional Dependence,” Journal of Econometrics, 164, 310–330. DOI: 10.1016/j.jeconom.2011.06.018.
  • ——- (2014), “Common Breaks in Time Trends for Large Panel Data with a Factor Structure,” Econometrics Journal, 17, 301–337.
  • Kim, D., and Perron, P. (2009), “Unit Root Tests Allowing for a Break in the Trend Function under both the Null and Alternative Hypotheses,” Journal of Econometrics, 148, 1–13. DOI: 10.1016/j.jeconom.2008.08.019.
  • Moon, H. R., and Perron, B. (2008), “Asymptotic Local Power of Pooled t-ratio Tests for Unit Roots in Panels with Fixed Effects,” Econometrics Journal, 11, 80–104. DOI: 10.1111/j.1368-423X.2008.00236.x.
  • Ng, S., and Perron, P. (2001), “Lag Length Selection and the Construction of Unit Root Tests with Good Size and Power,” Econometrica, 69, 1519–1554. DOI: 10.1111/1468-0262.00256.
  • Perron, P. (1989), “The Great Crash, the Oil Price Shock, and the Unit Root Hypothesis,” Econometrica, 57, 1361–1401. DOI: 10.2307/1913712.
  • ——- (1990), “Testing for a Unit Root in a Time Series with a Changing Mean,” Journal of Business & Economic Statistics, 8, 153–162.
  • Perron, P., and Qu, Z. (2007), “A Simple Modification to Improve the Finite Sample Properties of Ng and Perron’s Unit Root Tests,” Economics Letters, 94, 12–19. DOI: 10.1016/j.econlet.2006.06.009.
  • Perron, P., and Zhu, X. (2005), “Structural Breaks with Deterministic and Stochastic Trends,” Journal of Econometrics, 129, 65–119. DOI: 10.1016/j.jeconom.2004.09.004.
  • Pesaran, M. H. (2006), “Estimation and Inference in Large Heterogeneous Panels with a Multifactor Error Structure,” Econometrica, 74, 967–1012. DOI: 10.1111/j.1468-0262.2006.00692.x.
  • ——- (2007), “A Simple Panel Unit Root Test in the Presence of Cross Section Dependence,” Journal of Applied Econometrics, 22, 265–312.
  • ——- (2015), Time Series and Panel Data Econometrics, Oxford: Oxford University Press.
  • Pesaran, M. H., Smith, L. V., and Yamagata, T. (2013), “Panel Unit Root Tests in the Presence of a Multifactor Error Structure,” Journal of Econometrics, 175, 94–115. DOI: 10.1016/j.jeconom.2013.02.001.
  • Phillips, P. C. B., and Moon, H. R. (1999), “Linear Regression Limit Theory for Nonstationary Panel Data,” Econometrica, 67, 1057–1111. DOI: 10.1111/1468-0262.00070.
  • ——- (2000), “Nonstationary Panel Data Analysis: An Overview of Some Recent Developments,” Econometric Reviews, 19, 263–286.
  • Urbain, J. P., and Westerlund, J. (2011), “Least Squares Asymptotics in Spurious and Cointegrated Panel Regressions with Common and Idiosyncratic Stochastic Trends,” Oxford Bulletin of Economics and Statistics, 73, 119–139. DOI: 10.1111/j.1468-0084.2010.00605.x.
  • Westerlund, J. (2018), “CCE in Panels with General Unknown Factors,” Econometrics Journal, 21, 264–276. DOI: 10.1111/ectj.12110.
  • Westerlund, J., and Edgerton, D. (2008), “A Simple Test for Cointegration in Dependent Panels with Structural Breaks,” Oxford Bulletin of Economics and Statistics, 70, 665–704. DOI: 10.1111/j.1468-0084.2008.00513.x.
  • Westerlund, J., Hosseinkouchack, M., and Solberger, M. (2016), “The Local Power of the CADF and CIPS Panel Unit Root Tests,” Econometric Reviews, 35, 845–870. DOI: 10.1080/07474938.2014.977077.