2,092
Views
2
CrossRef citations to date
0
Altmetric
Articles

Detection of Multiple Structural Breaks in Large Covariance Matrices

ORCID Icon, &

ABSTRACT

This article studies multiple structural breaks in large contemporaneous covariance matrices of high-dimensional time series satisfying an approximate factor model. The breaks in the second-order moment structure of the common components are due to sudden changes in either factor loadings or covariance of latent factors, requiring appropriate transformation of the factor models to facilitate estimation of the (transformed) common factors and factor loadings via the classical principal component analysis. With the estimated factors and idiosyncratic errors, an easy-to-implement CUSUM-based detection technique is introduced to consistently estimate the location and number of breaks and correctly identify whether they originate in the common or idiosyncratic error components. The algorithms of Wild Binary Segmentation for Covariance (WBS-Cov) and Wild Sparsified Binary Segmentation for Covariance (WSBS-Cov) are used to estimate breaks in the common and idiosyncratic error components, respectively. Under some technical conditions, the asymptotic properties of the proposed methodology are derived with near-optimal rates (up to a logarithmic factor) achieved for the estimated breaks. Monte Carlo simulation studies are conducted to examine the finite-sample performance of the developed method and its comparison with other existing approaches. We finally apply our method to study the contemporaneous covariance structure of daily returns of S&P 500 constituents and identify a few breaks including those occurring during the 2007–2008 financial crisis and the recent coronavirus (COVID-19) outbreak. An R package “BSCOV” is provided to implement the proposed algorithms.

1 Introduction

Estimation of covariance matrices is of fundamental importance in modern multivariate statistics, and has applications in various fields such as biology, economics, finance, and social networks. Suppose that (Xt:t=1,,n) is a collection of n observations from a d-dimensional random vector with E(Xt)=0 and E(XtXt)=Σ, where 0 is a null vector whose size may change from line to line and Σ is a d × d positive definite covariance matrix. When the dimension d is fixed or significantly smaller than the sample size n, the population covariance matrix Σ can be well estimated by the conventional sample covariance matrix (see Anderson Citation2003):(1.1) Σ¯n=1nt=1n(XtX¯n)(XtX¯n),X¯n=1nt=1nXt.(1.1)

However, when d exceeds n, the sample covariance matrix Σ¯n defined in (1.1) becomes singular. Estimation of such a large covariance matrix is generally challenging and has received increasing attention in recent years. Some techniques have been introduced to regularize covariance matrices and produce reliable estimation (see Wu and Pourahmadi Citation2003; Ledoit and Wolf Citation2004; Bickel and Levina Citation2008a, Citation2008b; El Karoui Citation2008a, Citation2008b; Lam and Fan 2009; Rothman, Levina, and Zhu Citation2009; Cai and Liu Citation2011; Fan, Liao, and Mincheva Citation2013; Huang and Fryzlewicz Citation2019). A comprehensive review on large covariance matrix estimation can be found in Pourahmadi (Citation2013), Cai, Ren, and Zhou (Citation2016) and Fan, Liao, and Liu (Citation2016). All of the aforementioned literature assumes that the large covariance matrix is a constant matrix throughout the entire course of data collection, which may be too restrictive in practical situations. Some recent papers including Chen, Xu, and Wu (Citation2013) and Chen and Leng (Citation2016) attempt to relax this restriction and consider a large dynamic covariance matrix estimation by allowing the covariance matrix to evolve smoothly over time (or with an index variable).

The main interest of this article is to detect and estimate multiple structural breaks in large covariance matrices, to which setting the methodology in Chen, Xu, and Wu (Citation2013) and Chen and Leng (Citation2016) is no longer applicable. Structural breaks are very common in many areas such as economics and finance, and may occur for various reasons. If such abrupt structural changes are ignored in covariance matrix estimation, subsequent statistical analysis would lead to invalid inference and misleading conclusions. The classical Binary Segmentation (BS) technique is commonly used to detect structural breaks, and has been extensively studied since its introduction by Vostrikova (Citation1981). For instance, in the setting of univariate mean regression models, theoretical properties, computational algorithms and relevant empirical applications of the traditional BS and its generalized version have been systematically studied by Venkatraman (Citation1992), Bai (Citation1997), Cho and Fryzlewicz (Citation2012), Killick, Fearnhead, and Eckley (Citation2012), Fryzlewicz (Citation2014, Citation2020) and the references therein. In recent years, there has been increasing interest in extending this technique to high-dimensional settings such as high-dimensional time series and large panel data (see Cho and Fryzlewicz Citation2015; Jirak Citation2015; Cho Citation2016; Aston and Kirch Citation2018; Wang and Samworth Citation2018; Wang et al. Citation2019; Safikhani and Shojaie Citation2022). However, most of the aforementioned literature mainly considers break detection and estimation in the first-order moment structure. Aue et al. (Citation2009) use the BS method with mean-corrected cumulative sum (CUSUM) to detect structural breaks in fixed dimensional covariance matrices, and establish the limit distributions for the test statistics and the convergence rate for the estimated break date. Korkas and Fryzlewicz (Citation2017) detect breaks in the second-order moment structure of time series by combining the wavelet-based transformation and the univariate Wild Binary Segmentation (WBS) algorithm proposed by Fryzlewicz (Citation2014). However, their technique is not directly applicable to our case when the size of the covariance matrix diverges to infinity as the sample size n increases.

In this article, we consider the second-order moment structure of the high-dimensional random vector Xt generated by an approximate factor model, that is,(1.2) Xt=ΛFt+ϵt,t=1,2,,n,(1.2) where Λ=(λ1,,λd) is a matrix of factor loadings with λi being an r-dimensional vector, Ft is an r-dimensional vector of common factors, and ϵt=(ϵt1,,ϵtd) is a d-dimensional vector of idiosyncratic components uncorrelated with Ft. The approximate factor model is very popular in economics and finance, and has become an effective tool in analyzing high-dimensional time series (see Chamberlain and Rothschild Citation1983; Fama and French Citation1992; Stock and Watson Citation2002; Bai and Ng Citation2002, Citation2006; Chen et al. Citation2018). Our main focus is the contemporaneous second-order moment structure, which plays a crucial role in various fields such as the classic mean-variance portfolio choice theory and risk management. Note that(1.3) Σ=cov(Xt)=Λcov(Ft)Λ+cov(ϵt)=Σ(Λ,F)+Σ(ϵ),(1.3) where Σ(ϵ) is the covariance matrix of the idiosyncratic components ϵt and Σ(Λ,F) is the covariance matrix of the common components ΛFt. In this article, we aim to study multiple structural breaks in the covariance structure of Xt generated by the approximate factor model, estimating the break points and number of breaks. Structural breaks may occur in either the covariance structure of the common components or that of the idiosyncratic components, indicating that the constant covariance structure in (1.3) is replaced by the following time-varying version:(1.4) Σt=Σt(Λ,F)+Σt(ϵ),t=1,,n.(1.4)

The main contributions of this article, the fundamental novelty of the proposed methodology and its connection to some recent literature are summarized as follows.

  • A transformation mechanism is introduced to convert the approximate factor model with multiple structural breaks to that with constant factor loadings, justifying applicability of the classical Principal Component Analysis (PCA) technique (see Bai and Ng Citation2002; Stock and Watson Citation2002) to a more general model setting. Note that breaks in the second-order moment structure of the common components are due to changes in factor loadings, covariance of latent factors or factor number. This makes it challenging to directly estimate the latent common factors and factor loadings via PCA. When there are multiple breaks in factor loadings, we transform the original factor model to one with time-invariant factor loadings and a time-varying covariance structure for transformed factors, and then use the PCA method to estimate the transformed factors and factor loadings, which are shown to be consistent with convergence rates comparable to those in the existing literature on PCA for approximate factor models without breaks (see Bai and Ng Citation2002; Fan, Liao, and Mincheva Citation2013). This transformation technique is also used by Han and Inoue (Citation2015) and Baltagi, Kao, and Wang (Citation2017, Citation2021) to estimate and test factor models with structural instability. However, they assume that the number of factors does not change. In contrast, we consider a more general setting, allowing changes of factor numbers and multiple breaks in the covariance matrices of idiosyncratic components.

  • With the estimated factors and idiosyncratic errors, we propose an easy-to-implement CUSUM-based detection technique to estimate the location and number of breaks and identify whether they originate in the common or idiosyncratic components. The developed CUSUM statistics are directly computed via the empirical second moments of the estimated factors and idiosyncratic errors. Based on Fryzlewicz’s (Citation2014) WBS algorithm, we propose a Wild Binary Segmentation for Covariance (WBS-Cov) to locate multiple breaks in the common components. For breaks in the idiosyncratic error components, we introduce a Wild Sparsified Binary Segmentation algorithm for Covariance (WSBS-Cov), combining WBS-Cov and the Sparsified Binary Segmentation (SBS) proposed by Cho and Fryzlewicz (Citation2015) in the high-dimensional mean estimation context. A recent paper by Barigozzi, Cho, and Fryzlewicz (Citation2018) combines the wavelet-based transformation and the double CUSUM technique (Cho Citation2016) to estimate the break time as well as break number within the second-order moment structure. However, a disadvantage of the wavelet-based transformation on the piecewise-constant “signals,” involving selection of the wavelet filter and scale parameter, is that it could destroy the piecewise constancy and subsequently affect the estimation accuracy of the break number and location. When we only focus on a specific feature of the data, data transformation, even one-to-one transformation, would distort the interpretation of breaks of the structure. Another recent method to deal with sparsity in high-dimensional break detection is the sparse projection introduced by Wang and Samworth (Citation2018) but their method only detects breaks in the first-order moment structure. Wang, Yu, and Rinaldo (Citation2021) extends the BS and WBS techniques to detect multiple breaks in large covariance matrices, but they limit attention to the independent sub-Gaussian random vector and assume that the dimension is divergent at a slow polynomial rate of n. In contrast, we allow the dimension to grow faster, for example, at an exponential rate of n, a typical setting in big data analysis.

  • Simulation studies are implemented to assess the finite-sample performance of the proposed method and its comparison with other competing methods. The simulation results show that our method has superior numerical performance in most settings, whereas the method by Barigozzi, Cho, and Fryzlewicz (Citation2018) tends to perform poorly when the number of covariance matrix entries with breaks is small relative to the matrix size. Our simulation also shows that the proposed methods are robust when the factor number is over-estimated.

  • The developed model and method are applied to the daily return series of S&P 500 constituent which contain 375 firms over the period from January 01, 2000 to February 28, 2021, and detect five breaks in the covariance structure of the common components and three breaks in the idiosyncratic error components. In particular, we identify the breaks occurring during the 2007–2008 global financial crisis and the recent coronavirus (COVID-19) outbreak. With an additional comparison with break detection result of low-dimensional observable risk factor series, we emphasize that the breaks in covariance structure of returns cannot be fully explained by the breaks in the covariance structure of observed risk factors nor the breaks in loadings of these factors.

The rest of the article is organized as follows. Section 2 introduces the setting of multiple structural breaks, the PCA estimation and CUSUM-based detection methods, and the WBS-Cov and WSBS-Cov algorithms. Section 3 presents the asymptotic theory for the developed WBS-Cov and WSBS-Cov methods. Section 4 discusses some practical issues in the detection procedure. Sections 5 and 6 give the simulation studies and the empirical application, respectively. Section 7 concludes the article. The technical assumptions are given in Appendix A and proofs of the main theoretical results are available in Appendices C–E of a supplementary materials. An R package “BSCOV” for implementing the proposed methods is available from https://github.com/markov10000/BSCOV. Throughout the article, we let ||·||2 and ||·||F denote the Euclidean norm of a column vector and the Frobenius norm of a matrix, respectively; let vech(·) denote the half vectorization of a symmetric matrix obtained by vectorizing only the lower triangular part of the matrix; let · denote the floor function; let ab and ab denote max{a,b} and min{a,b}, respectively; and let anbn denote that an=O(bn) and bn=O(an) hold jointly.

2 Methodology

In this section, we first introduce the setting of multiple structural breaks in the contemporaneous covariance structure of both the common and idiosyncratic components, and transform the approximate factor models to construct PCA estimation of the latent common factors and factor loadings (with appropriate rotation). Using the estimated factors and idiosyncratic errors, we then introduce the CUSUM-based statistics to estimate the location and number of structural breaks as well as the WBS-Cov and WSBS-Cov algorithms.

2.1 Model Structure for Multiple Structural Breaks

We consider the following multiple structural breaks in the covariance matrix of the common components, that is, Σt(Λ,F) in (1.4) is specified as(2.1) Σt(Λ,F)={​​​Σ10(Λ,F),1tη1c,Σ20(Λ,F),η1c+1tη2c,ΣK1+10(Λ,F),ηK1c+1tn,(2.1) where Σk0(Λ,F)Σk+10(Λ,F) for k=1,,K1, and η1c,,ηK1c are unobservable breaks with the superscript “c” denoting the “common component.” For convenience, we let η0c=0 and ηK1+1c=n. The above structural breaks may be caused by one or a combination of the following cases,

  • Case (i): sudden changes in the factor loadings;

  • Case (ii): sudden changes in the covariance matrix for common factors;

  • Case (iii): sudden changes in the number of factors.

Let Λk0 be a d×rk factor loading matrix and Ft,k be the corresponding factor vector with dimension rk when ηk1c+1tηkc, and write the approximate factor model as(2.2) Xt=Λk0Ft,k+ϵt,ηk1c+1tηkc.(2.2)

Letting Σk0(Λ,F)=Λk0cov(Ft,k)(Λk0), the structural break structure (2.1) can be equivalently written asΣt(Λ,F)={Λ10cov(Ft,1)(Λ10),1tη1c,Λ20cov(Ft,2)(Λ20),η1c+1tη2c,ΛK1+10cov(Ft,K1+1)(ΛK1+10),ηK1c+1tn.

Note that, although Σk0(Λ,F)Σk+10(Λ,F) for k=1,,K1,Λk0 may be the same as Λk+10 (when cases (i) and (iii) do not occur at the break time). It is worthwhile to point out that case (i) is similar to the approximate factor model with structural breaks; see, for example, Breitung and Eickmeier (Citation2011), Chen, Dolado, and Gonzalo (2014), Han and Inoue (Citation2015), Cheng, Liao, and Schorfheide (Citation2016), Baltagi, Kao, and Wang (Citation2017), Bai, Han, and Shi (Citation2020), and Duan, Bai, and Han (Citation2022). However, the aforementioned papers consider the case of a single structural break in the factor loadings and usually assume that the covariance of the idiosyncratic error components is time invariant. Ma and Su (Citation2018) consider detecting and estimating multiple structural breaks in the factor loadings with a three-step procedure using the adaptive fused group Lasso. Case (iii) is an important type of structure break in the factor model, which has received much attention in recent years (see Baltagi, Kao, and Wang Citation2017; Barigozzi, Cho, and Fryzlewicz Citation2018; Li et al. Citation2019). Section 2.2 will give a unified factor model framework covering cases (i)–(iii) which facilitates the construction of PCA estimation, and then proceed to introduce the relevant CUSUM statistics in Section 2.3 to detect the multiple structural breaks in the common component.

The structural breaks in the covariance matrix of the error components are specified as followsΣt(ϵ)={Σ10(ϵ),1tη1e,Σ20(ϵ),η1e+1tη2e,ΣK2+10(ϵ),ηK2e+1tn,and Σk0(ϵ)Σk+10(ϵ) for k=1,,K2, where η1e,,ηK2e are unobservable breaks with the superscript “e” denoting the “idiosyncratic error component.” For convenience, we let η0e=0 and ηK2+1e=n.

Let(2.3) Bc={η1c,,ηK1c}andBe={η1e,,ηK2e}.(2.3)

In this article, we assume that the minimum length of the subintervals separated by ηkc is of order n, see Assumption 4(ii) in Appendix A. Such a restriction is mainly to ensure that the PCA method is applicable to estimate the latent factors and factor loadings. In contrast, the minimum length of the subintervals separated by ηke is allowed to be of order smaller than n, see Assumption 4(iii). We do not impose any restriction on how the break points in Bc and those in Be are located with respect to each other.

2.2 Transformed Factor Models and PCA Estimation

Although the multiple structural breaks formulated in (2.1) and (2.2) may be caused by one or a combination of cases (i)–(iii) described in Section 2.1, we next show that multiple structural breaks in the factor loadings and factor number can be transformed to breaks in the factor covariance structure.

Proposition 2.1.

The approximate factor model (2.2) can be equivalently written as a transformed factor model like(2.4) Xt=ΛFt+ϵt,t=1,,n,(2.4) where Λ denotes the transformed factor loading matrix which is time invariant, and Ft denotes the transformed factors. In addition, the number of factors in the original factor model and that in the transformed factor model, satisfy the following inequalities,(2.5) max1kK1+1r¯kq0k=1K1+1r¯kk=1K1+1rk,(2.5) where q0 is the number of transformed factors, r¯k is the column rank of Λk0 and rk is the number of original factors Ft,k when ηk1c+1tηkc.

Remark 2.1.

A representation similar to Proposition 2.1 is also given in Han and Inoue (Citation2015) and Baltagi, Kao, and Wang (Citation2017, Citation2021). Note that construction of Λ in the transformation mechanism is not unique due to the factor model identification issue. When each of Λk0 is of full column rank, we have r¯k=rk, and consequently the inequalities in (2.5) would be simplified tomax1kK1+1rkq0k=1K1+1rk.

The upper bound for q0 can be achieved if Λk0,k=1,,K1+1, are linearly independent. Appendix B in the supplementary materials gives a simple motivating example for the above transformation. Proposition 3.1 in Section 3.1 will further explore the limiting behavior of 1nt=1nFtFt⋆⊺ as n, which is crucial to application of the classic PCA method to estimate the transformed factors and factor loadings. Through the above transformation, it is sufficient to construct the CUSUM statistic using the estimated transformed factors for all of cases (i)–(iii) discussed in Section 2.1.

As the common factors are latent, we have to obtain their estimates (subject to appropriate rotation) in practice. By Proposition 2.1, we consider the transformed factor model like (2.4) with time-invariant factor loadings, and apply the PCA estimation technique. Let Xn=(X1,,Xn) be an n × d matrix of observations. For the time being, we assume that the number of transformed factors q0 is known, and will discuss how to determine it later in this section. The estimated factors F̂t as well as the estimated idiosyncratic errors ϵ̂t will be used to construct the CUSUM statistics in Sections 2.3 and 2.4. The algorithm for PCA estimation is given as follows.

Algorithm 1

Standard PCA

Input: Xn, q0 1. Let F̂n=(F̂1,,F̂n) be the n×q0 matrix consisting of the q0 eigenvectors (multiplied by n) corresponding to the q0 largest eigenvalues of XnXn/(nd).

2. The transformed factor loading matrix is estimated as(2.6) Λ̂n=XnF̂n/n=:(λ̂1,,λ̂d).(2.6)

3. The idiosyncratic component ϵt is approximated by(2.7) ϵ̂t=XtΛ̂nF̂t=:(ϵ̂t1,,ϵ̂td)withϵ̂tj=Xtjλ̂jF̂t,1jd,(2.7) where Xtj is the jth element of Xt.

Output: F̂t,ϵ̂t for t=1,,n

A key issue in the PCA estimation is to appropriately choose the number of latent factors. As the above PCA method focuses on the transformed factor model, our aim is to estimate the number of transformed factors rather than that of the original ones. In general, the number of transformed factors q0 could be much larger than maximum of r¯k (or rk) over k=1,K1+1. However, when both (rk,k=1,K1+1) and K1 are assumed to be fixed, from (2.5) in Proposition 2.1, we have the true number of transformed factors to be upper bounded by a finite positive integer denoted by Q¯. Some existing criteria developed for stable factor models can be applied to the transformed factor model (2.4) to obtain a consistent estimate of q0. In this article, we use a commonly used information criterion proposed by Bai and Ng (Citation2002).

For any 1qQ¯, we let F̂n(q)=[F̂1(q),F̂n(q)] be the estimated factors obtained in Algorithm 1 replacing q0 by q. DefineVn(q)=minΛ(q)1ndj=1dt=1n[Xtjλj(q)F̂t(q)]2=minΛ(q)1ndt=1n[XtΛ(q)F̂t(q)][XtΛ(q)F̂t(q)]=j=q+1dμj(XnXn/(nd)),where Λ(q)=[λ1(q),,λd(q)] is a d × q factor loading matrix and μj(·) denotes the jth largest eigenvalue. Consequently, we can choose the following objective function:(2.8) IC(q)=log[Vn(q)]+q·(n+dnd)log(nd),(2.8) and obtain the estimate q̂ via(2.9) q̂=arg min0qQ¯IC(q),(2.9) where the common components disappear when q = 0. Alternative IC functions with different penalty terms in (2.8) can be found in Bai and Ng (Citation2002) and in Alessi, Barigozzi, and Capasso (Citation2010) for the case of large idiosyncratic disturbances. With the asymptotic results given in Theorem 2 of Bai and Ng (Citation2002), we may show that q̂ is a consistent estimator of the true number q0, indicating that q̂=q0 whose probability converges to one as the sample size increases.

Although accurate estimation of the factor number may be difficult in practice, slight over-estimation of the (transformed) factor number may be not risky in break detection (see Barigozzi, Cho, and Fryzlewicz Citation2018). In Appendix F of the supplementary materials, we provide a simulation to show that over-estimating q0 would not negatively affect accuracy of multiple break detection. It is possible to avoid precisely estimating the factor number by following Barigozzi, Cho, and Fryzlewicz’s (Citation2018) method, multiplying the estimated factors and factor loadings to obtain the estimated common components. However, this would lead to a high-dimensional break detection for the common components.

2.3 Break Detection in the Common Components

We start with the Binary Segmentation for Covariance (BS-Cov) technique to detect breaks in the common components and then introduce the WBS-Cov algorithm. As breaks in the factor loadings and factor number can be transformed to those in the covariance of transformed factors after reformulating the approximate factor model to the one with time-invariant factor loadings, we next define a CUSUM statistic using the estimated factors F̂t. For 1ls<un, define(2.10) Cl,uF̂(s)=(sl+1)(us)ul+1[1sl+1t=lsvech(F̂tF̂t)1ust=s+1uvech(F̂tF̂t)],(2.10)

which is a column vector with dimension q0(q0+1)/2. Maximizing Cl,uF̂(s)2 with respect to s, we obtain(2.11) η˜1c=arg maxls<uCl,uF̂(s)2,(2.11) the first candidate break point (which is not necessarily the estimate of the first break time point η1c). If the quantity Cl,uF̂(η˜1c)2 exceeds certain threshold, say ξnc, we split the closed interval [l,u] into two subintervals: [l,η˜1c] and [η˜1c+1,u]. Then, letting (l,u)=(l,η˜1c) or (l,u)=(η˜1c+1,u), we compute Cl,uF̂(s) with s in either of the two subintervals, maximize Cl,uF̂(s)2 with respect to s, estimate the next candidate break points, and examine whether a further split is needed. We continue this until no more split is needed.

Remark 2.2.

  1. It may be possible to extend (2.11) to a weighted version (see Hariz, Wylie, and Zhang Citation2007; Cho Citation2016; Yu and Chen Citation2021)η˜1c=arg maxls<u{[(sl+1)(us)(ul+1)2]υCl,uF̂(s)2},

with 1/2<υ1/2. A smaller υ may help detect breaks that are close to boundary. For simplicity, we only consider the unweighted version, that is, υ=0, otherwise υ would affect the asymptotic analysis.

  1. In the above BS-Cov, we use the l2-type aggregation of the CUSUM quantities. Alternatively, we may use different norms in the aggregation such as the l1-norm and l-norm. It may be also possible to consider a weighted aggregation and seek an optimal projection direction as recommended by Wang and Samworth (Citation2018). However, it is not clear how to implement the optimal direction method in a computationally efficient way for break detection in our model setting, so we do not pursue it in the present article. In Appendix F of the supplementary materials, we compare numerical performance among different types of aggregation on the CUSUM quantities and find that the l2-type aggregation produces the most accurate detection result.

  2. The l2-aggregation in (2.11) ignores possible correlation structure between elements of the vector CUSUM statistic. Such a simplification would not affect estimation consistency of the break number and locations. For break detection in fixed-dimensional covariance matrices, Aue et al. (Citation2009) propose a weighted quadratic form of the CUSUM vector using an estimated long-run covariance matrix. This refinement in the l2-aggregation may improve break estimation efficiency. However, consistent estimation of the long-run covariance structure for vech(F̂tF̂t) is difficult to obtain when there are unknown multiple breaks.

In order to accurately estimate the break time and delete possible spurious spikes in the CUSUM statistics, instead of using BS-Cov, we next extend the WBS algorithm introduced by Fryzlewicz (Citation2014) to detect the multiple structural breaks in the common components. Let η̂1c,,η̂K̂1c be the estimated break points (arranged in an increasing order) and K̂1 be the estimate of the unknown break number K1 obtained in the following WBS-Cov algorithm. A discussion on selection of the threshold ξnc can be found in Section 4. The method to draw random intervals in the WBS-Cov algorithm is the same as that in Fryzlewicz (Citation2014) (see also Algorithm 4 in Section 4). Fryzlewicz (Citation2020) provides an alternative recursive method of drawing random intervals in the WBS algorithm. In general, the intervals drawn in Algorithm 2 do not have to be random but can be taken over a fixed grid.

Algorithm 2

WBS-Cov algorithm

Input: ξnc,Mnc,F̂t,t=1,,n generate random intervals [lm,um] within [1, n] for m=1,,Mncset Wc={[1,n]} and B̂c= while Wc do take an element out of Wc and denote it as [l,u] Ml,uc={m:[lm,um][l,u]}(m0c,s0c)=arg maxmMl,uc,lms<umCl,uF̂(s)2 if Clm0c,um0cF̂(s0c)2>ξnc then add s0c to B̂c add [l,s0c] and [s0c+1,u] to Wc end if end while Output: B̂c

2.4 Break Detection in the Idiosyncratic Components

We next turn to detection of multiple structural breaks in the covariance structure of the idiosyncratic vector, which is much more challenging as the dimension involved can be ultra large and ϵt is unobservable. In the existing literature, to obtain reliable estimation of ultra-high dimensional covariance matrices (without breaks), it is often common to impose certain sparsity condition, limiting the number of nonzero entries. Consequently, the thresholding or other generalized shrinkage methods (Bickel and Levina Citation2008a; Rothman, Levina, and Zhu Citation2009; Cai and Liu Citation2011) have been developed to estimate high-dimensional covariance matrices. A similar thresholding technique is also introduced by Cho and Fryzlewicz (Citation2015) to detect multiple structural breaks in high-dimensional time series setting. We next generalize the latter to detect breaks in the large covariance structure of the idiosyncratic error vector, using the approximation of ϵt obtained in the PCA algorithm. Define the normalized CUSUM-type statistic using ϵ̂tj defined in (2.7):(2.12) cl,uϵ̂,σ̂(s;i,j)=1σ̂l,u(i,j)(sl+1)(us)(ul+1)(1sl+1t=lsϵ̂tiϵ̂tj1ust=s+1uϵ̂tiϵ̂tj)(2.12) for 1ls<un and 1i,jd, where σ̂l,u(i,j) is a properly chosen scaling factor. The value of the scaling factor depends on both the time interval [l,u] and the index pair (i, j). The scaling factors ensure that we can choose a common threshold for each index pair. Note that the scaling factor is not needed for the CUSUM statistic (2.10) to detect breaks in the common components since the estimated factors have been normalized. We choose σ̂l,u(i,j) as the differential error median absolute deviation using ϵ̂tj, whose detailed construction is given in Section 4. To derive the consistency result for break detection, we assume that the random scaling factors are uniformly bounded away from zero and infinity over 1i,jd and 1l<un with probability tending to one (see Assumption 5 in Appendix A). As the dimension d and might be much larger than the sample size n (see Assumption 4(i)), a direct aggregation of the CUSUM statistics cl,uϵ̂,σ̂(s;i,j) over i and j might not perform well in detecting the breaks in particular when the sparsity condition is imposed. Hence, we compare cl,uϵ̂,σ̂(s;i,j) with a thresholding parameter, say ξne, and delete the index pair (i, j) when maxlt<u|cl,uϵ̂,σ̂(t;i,j)| is smaller than ξne. For mMl,ue (to be defined in Algorithm 3) such that [lm,um] is a random subinterval of [l,u], we define(2.13) Clm,umϵ̂(s)=i=1dj=id|clm,umϵ̂,σ̂(s;i,j)|2I(maxlt<u|cl,uϵ̂,σ̂(t;i,j)|>ξne),(2.13) where I(·) is an indicator function. It is easy to see that the truncation component in (2.13) is independent of the random subintervals [lm,um]. Therefore, if maxlt<u|cl,uϵ̂,σ̂(t;i,j)|ξne, the algorithm will no longer search for breaks within the interval [l,u) for the index pair (i, j). The main difference between our method and that in Cho and Fryzlewicz (Citation2015) is that they sparsify the classical Binary Segmentation to locate the break points and estimate the break number, whereas we combine the sparsified CUSUM quantity with the WBS-Cov algorithm introduced as in Section 2.3. Thus, we call the proposed algorithm the Wild Sparsified Binary Segmentation for Covariance or WSBS-Cov. The detailed algorithm is given as follows.

Algorithm 3

WSBS-Cov algorithm

Input: ξne,Mne,ϵ̂t,t=1,,n generate random intervals [lm,um] within [1,n] for m=1,,Mneset We={[1,n]} and B̂e= while We do take an element out of We and denote it as [l,u] Ml,ue={m:[lm,um][l,u]}(m0e,s0e)=arg maxmMl,ue,lms<umClm,umϵ̂(s) if Clm0e,um0eϵ̂(s0e)>0 then add s0e to B̂e add [l,s0e] and [s0e+1,u] to We end if end while Output: B̂e

Note that the sparsified CUSUM statistic Clm,umϵ̂(s) cannot take any value between 0 and (ξne)2 andClm0e,um0eϵ̂(s0e)>0Clm0e,um0eϵ̂(s0e)>(ξne)2.

Hence, the thresholding parameter ξne in WSBS-Cov plays a similar role to ξnc in WBS-Cov. With the WSBS-Cov algorithm, we obtain the estimated break points denoted by η̂1e,,η̂K̂2e (arranged in an increasing order), where K̂2 is the estimate of the break number K2. Finally, the sets of break points Bc and Be defined in (2.3) are estimated byB̂c={η̂1c,,η̂K̂1c}andB̂e={η̂1e,,η̂K̂2e}.

3 Large-Sample Theory

In this section we establish the large-sample theory for the methods proposed in Section 2 under some technical assumptions listed in Appendix A. The asymptotic properties for the WBS-Cov and WSBS-Cov methods are given in Sections 3.1 and 3.2, respectively.

3.1 Asymptotic Theory of WBS-Cov for the Common Components

We start with the following proposition on some fundamental properties for the transformed factors Ft and factor loadings Λ in (2.4).

Proposition 3.1.

Suppose that Assumption 2 in Appendix A is satisfied and let κnc=min1kK1+1(ηkcηk1c). The transformed factor loading matrix Λ is of full column rank and the transformed factors Ft satisfy that t=1nFtFt⋆⊺F=OP(n) when κnc. If, in addition, κncn, there exists a q0×q0 positive definite matrix ΣF such that(3.1) 1nt=1nFtFt⋆⊺PΣFasn.(3.1)

The above proposition shows that the sample second-order moment of the transformed factors converges in probability under some regularity conditions in Assumption 2. The convergence result (3.1) is sometimes given directly as a high-level condition in the literature (e.g., Assumption A1 in Ma and Su Citation2018). The restriction κncn is crucial to ensure positive definiteness of ΣF. If the order of the minimum distance between breaks is smaller than n, the limit matrix ΣF may be singular.

We next study the asymptotic theory for the WBS-Cov algorithm in Section 2.3 to detect breaks in the common components. Define an infeasible CUSUM statistic using the latent transformed factors:(3.2) Cl,uHF(s)=(sl+1)(us)ul+1[1sl+1t=lsvech(HFtFtH)1ust=s+1uvech(HFtFtH)],(3.2) and H is a q0×q0 rotation matrix defined by(3.3) H=Ωq01(1nt=1nF̂tFt⋆⊺)(1dj=1dλjλj⋆⊺),(3.3) in which Ωq0 is a q0×q0 diagonal matrix with the diagonal elements being the first q0 largest eigenvalues of XnXn/(nd) arranged in a descending order, and Λ=(λ1,,λd) with λj being a q0-dimensional vector of transformed factor loadings.

Proposition 3.2.

Suppose that Assumptions 1–3 and 4(i) in Appendix A are satisfied. If κncn,(3.4) max(l,u):1l<unmaxs:ls<uCl,uF̂(s)Cl,uHF(s)2=OP(1).(3.4)

Proposition 3.2

plays a crucial role in our proofs, indicating that the CUSUM statistic Cl,uF̂(s) can be replaced by the infeasible CUSUM statistic Cl,uHF(s) in the asymptotic analysis. The following theorem establishes the convergence result for the estimates of break points and break number in the covariance structure of the common components.

Theorem 3.1.

Suppose that Assumptions 1–3 and 4(i)(ii) in Appendix A are satisfied, and there exist two positive constants c¯1 and c¯1 such that the threshold ξnc satisfies(3.5) c¯1log2nξncc¯1(κncω¯nc)1/2,(3.5) where κnc is defined in Proposition 3.1 and ω¯nc denotes the minimum break size (in the common components) defined in Assumption 4(ii). Then there exists a positive constant ιc such that(3.6) P(K̂1=K1;max1kK1|η̂kcηkc|<ιcφnc)1,(3.6) where φnc=log4n/ω¯nc.

Remark 3.1.

The convergence result (3.6) shows that the proposed estimator of the break points in the common components via WBS-Cov has the approximation rate of φnc=log4n/ω¯nc, which can be further simplified to the rate of log4n if the minimum break size ω¯nc is larger than a small positive constant. Note that, even if the factors are observable, the optimal estimation rate of the break points in their covariance structure is OP(1) (Korostelev Citation1988; Aue et al. Citation2009) when the binary segmentation is used in break detection. Hence, our approximation rate in Theorem 3.1 is nearly optimal up to a fourth-order logarithmic rate. Since we remove the Gaussian assumption on the observations and further allow temporal dependence in the data over time, our approximation rate is slightly slower than that in Theorem 3.2 of Fryzlewicz (Citation2014).

3.2 Asymptotic Theory of the WSBS-Cov for the Idiosyncratic Components

We next present the asymptotic theory of structural break estimation in the idiosyncratic error components. Similarly to that in Section 3.1, we define the following infeasible CUSUM-type statistics using the unobservable ϵtj,(3.7) cl,uϵ,σ̂(s;i,j)=1σ̂l,u(i,j)(sl+1)(us)(ul+1)(1sl+1t=lsϵtiϵtj1ust=s+1uϵtiϵtj)(3.7) for 1ls<un and 1i,jd, where σ̂l,u(i,j) is defined as in (2.12), and(3.8) Clm,umϵ(s)=i=1dj=id|clm,umϵ,σ̂(s;i,j)|2I(maxlt<u|cl,uϵ,σ̂(t;i,j)|>ξne)(3.8) for mMl,ue.

Proposition 3.3.

Suppose that Assumptions 1–3, 4(i), and 5 in Appendix A are satisfied. If κncn,(3.9) max(i,j):1i,jdmax(l,u):1l<unmaxs:ls<u|cl,uϵ̂,σ̂(s;i,j)cl,uϵ,σ̂(s;i,j)|=OP((logd)(logn)).(3.9)

Proposition 3.3

plays a crucial role in our asymptotic analysis, indicating that the CUSUM statistics cl,uϵ̂,σ̂(s;i,j) can be replaced by the infeasible ones cl,uϵ,σ̂(s;i,j), which are much easier to handle in the proofs. The following theorem establishes the convergence result for the estimates of the break points and break number in the covariance structure of the idiosyncratic components.

Theorem 3.2.

Suppose that Assumptions 1–3, 4(i)(iii), and 5 in Appendix A are satisfied, κncn, and there exist two positive constants c¯2 and c¯2 such that(3.10) c¯2log2(dn)ξnec¯2(κne/n)·(κneω¯ne)1/2,(3.10) where κne=min1kK2+1(ηkeηk1e) and ω¯ne denotes the minimum break size (in the idiosyncratic components) defined in Assumption 4(iii). Then there exists a positive constant ιe such that(3.11) P(K̂2=K2;max1kK2|η̂keηke|<ιeφn,de)1,(3.11) where φn,de=(n/κne)2·[log4(nd)/ω¯ne].

Remark 3.2.

The rate in (3.11) relies on both n and d, making it substantially different from the convergence result in Theorem 3.1. If the dimension diverges to infinity at a polynomial rate of n (see Barigozzi, Cho, and Fryzlewicz Citation2018) and assume κnen, the rate φn,de becomes log4n/ω¯ne, slightly faster than that obtained in Theorem 3 of Barigozzi, Cho, and Fryzlewicz (Citation2018) when the breaks are sparse in covariance of the idiosyncratic errors.

4 Practical Issues in the Detection Procedure

In this section, we first extend the so-called Strengthened Schwarz Information Criterion (SSIC) to our model setting to determine the number of breaks, and then discuss the choice of the random scaling quantity σ̂l,u(i,j) in (2.12) and the thresholding parameter ξne in the WSBS-Cov.

The SSIC is proposed by Fryzlewicz (Citation2014) in the context of univariate WBS, and some modifications are needed to make it applicable to our setting. For the break detection in the covariance structure of the common components, when the algorithm proceeds, we estimate the k candidate break points (arranged in an increasing order) denoted by η̂1|kc,,η̂k|kc with the convention of η̂0|kc=0 and η̂k+1|kc=n. LettingZtF^=vech(F^tF^t)=(Zt,1F^,,Zt,q0(q0+1)/2F^)andZ,jF^=(Z1,jF^,,Zn,jF^)for j=1,,q0(q0+1)/2, we define the SSIC objective function for the univariate process Z,jF^ by(4.1) SSICjc(k)=n2logσ̂j2(k)+kp(n),k=0,,K¯,(4.1) whereσ̂j2(k)=1nt=1n[Zt,jF^Z¯t,jF^(k)]2withZ¯t,jF^(k)=1η̂i+1|kcη̂i|kcs=η̂i|kc+1η̂i+1|kcZs,jF^ for η̂i|kc+1tη̂i+1|kc,i=0,,k,p(n)=n1/2, and K¯ is a positive constant which is a prespecified upper bound for the break number. If there exists a positive integer Kc such that(4.2) SSICjc(Kc+1)>SSICjc(Kc)j=1,,q0(q0+1)/2,(4.2) no further break exists in any dimension of the transformed factors. Let K̂1 be the smallest number such that (4.2) is satisfied, and stop the WBS-Cov algorithm in Section 2.3 when the number of breaks reaches K̂1. By using SSIC, we avoid choosing the tuning parameter ξnc in the WBS-Cov.

Similarly, we can also use the SSIC to obtain the estimated break number for the idiosyncratic error components. Let SSICi,je(k) be defined similarly to SSICjc(k) but with Zt,jF^ and Z¯t,jF^(k) replaced by(4.3) Zt,ijϵ̂:=ϵ̂itϵ̂jtandZ¯t,ijϵ̂(k)=1η̂i+1|keη̂i|kes=η̂i|ke+1η̂i+1|keZs,ijϵ̂(4.3)

for η̂i|ke+1tη̂i+1|ke,i=0,,k. Let K̂2 be the smallest number Ke such that(4.4) SSICi,je(Ke+1)>SSICi,je(Ke)(4.4) for all 1i,jd, and thus we obtain the estimated break number for the idiosyncratic error component. In the construction of the CUSUM statistic in the WSBS-Cov, we need to define the random scaling quantity σ̂l,u(i,j). In the numerical studies, we let σ̂l,u(i,j) be the differential error median absolute deviation which is defined byσ̂l,u(i,j)=medlt<u(|(ϵ̂t+1,iϵ̂t+1,jϵ̂tiϵ̂tj)medlt<u(ϵ̂t+1,iϵ̂t+1,jϵ̂tiϵ̂tj)|) in which medlt<u(·) denotes the sample median over the time interval [l,u) (e.g., Fryzlewicz Citation2014). An advantage of using the median absolute deviation is that it provides robust scale estimation.

When implementing the WBS-Cov and WSBS-Cov algorithms, we have to guarantee sufficient length of the random intervals to facilitate break detection. When l and u are too close, σ̂l,u(i,j) may be unbounded, leading to violation of Assumption 5 in Appendix A. Consequently, the CUSUM statistics on those small intervals would be smaller than the thresholding parameter so that no breaks could be detected. Therefore, we introduce a quantity Δn as in Barigozzi, Cho, and Fryzlewicz (Citation2018), to control the minimum length of the random intervals. Specifically, Algorithm 4 is used to produce a set of random intervals in WBS-Cov or WSBS-Cov.

Algorithm 4

Drawing random intervals

Input: n, Δn and Mn (Mn=Mnc or Mne)

Draw Mn pairs of random points uniformly from the set {1,2,,n4Δn};

For each pair, let lm be the smaller random point, and um be the larger one plus 4Δn.

Output: random intervals [lm,um] for m=1,,Mn

We set Mn=Mnc=Mne=400 in the numerical studies. By Algorithm 4, the random intervals are produced with length of at least 4Δn. In addition, we also trim each random interval to keep a distance of Δn to both boundaries, indicating that we only maximize the CUSUM statistics over the interval [lm+Δn,umΔn]. Note that Δn should be much smaller than κnc and κne. Thus, we choose Δn=(log2n)(0.25n6/7) following the discussion in Barigozzi, Cho, and Fryzlewicz (Citation2018).

A crucial issue in constructing the sparsified CUSUM statistic in the WSBS-Cov algorithm is the choice of the thresholding parameter ξne. Cho and Fryzlewicz (Citation2015) suggest choosing this parameter using the CUSUM statistic based on a stationary process (e.g., a simulated AR(1) process) under the null hypothesis of no breaks. However, in practical applications, for the high-dimensional time series data with latent breaks, it is often difficult to recover the underlying stationary structure. To address this problem, we propose an alternative approach to choose ξne: (i) pre-detect the breaks (e.g., combining the classical BS-Cov and SSIC without knowing ξne a priori) and remove the breaks from the high-dimensional covariance matrices through demeaning using the estimated breaks, that is, generate Zt,ijε^Z¯t,ijε^(K˜ne) as in (4.3), where K˜ne is the preliminary estimate of the break number obtained in the pre-detection of the idiosyncratic components; (ii) calculate the CUSUM statistics of the “stationary” process {Zt,ijε^Z¯t,ijε^(K˜ne)} for each pair (i, j) and then take the maximum as the chosen thresholding parameter ξne. This technique is more computationally intensive than any formula-based threshold selection procedures; however, we have found that it performs well in the numerical studies reported in Section 5.

5 Monte Carlo Simulation

In this section, we provide a simulation study to compare the finite-sample performance between the proposed methods and the method proposed by Barigozzi, Cho, and Fryzlewicz (Citation2018), and examine the performance of our methods in the weak factor structure. Additional simulation studies are given in Appendix F of the supplementary materials, where the proposed methods are compared with various other competing methods that we designed. Each of these methods differs from the method presented in our main text in only one step. Like ablation studies that are commonly conducted in deep learning research, they help to gain a better understanding of our methods. Since these alternative methods are not designed for practical use, nor are they theoretically justified, we keep them in the supplementary materials.

We consider the following factor model to generate data:(5.1) Xti=j=1rλij,tFtj+θϵti,i=1,,d,t=1,,n,(5.1) where θ>0 is used to control the signal-to-noise ratio. The replication number in each simulation cases is set to be R = 100. For the 100 simulated samples, we report the estimated number of break(s) as well as the accuracy measure for each break location ηk defined by(5.2) ACUk=100%×[1100·i=1100I(minl=1,,K̂(i)|η̂l(i)ηk|logn)],(5.2) for k=1,,K, where K denotes the break number K1 or K2, K̂(i) is the corresponding break number estimate in the ith simulated sample, ηk denotes either ηkc or ηke, and η̂l(i),l=1,,K̂(i), are the break point estimates in the ith simulated sample,

Example 5.1.

We use model (5.1) to generate the data in simulation, where θ=0.5, the number of factors is r = 5, the sample size is n = 400, and the dimension is d = 200. The factor process Ft=(Ft1,,Ft5) is generated from a multivariate normal distribution N5(0,ΣF) independently over t, where ΣF is a covariance matrix with one break specified as follows: for 1tη1c=133,ϕjF, the square root of the jth diagonal element of ΣF, is independently generated from a uniform distribution U(0.5,1.5), and the (i, j)-entry of ΣF is defined as ϕiFϕjF(0.5)|ij| for 1ij5; for η1c<tn, the (1,2) and (2,1)-entries of ΣF change from 0.5ϕ1Fϕ2F to 0.9ϕ1Fϕ2F, and ϕ5F is replaced by 1.3ϕ5F resulting in structural breaks in (5,j) and (j,5)-entries of ΣF. For 1tη2c=267, the factor loadings λij are independently generated from a uniform distribution U(1,1); whereas for η2c<tn, the factor loadings corresponding to the first two factors are regenerated by a new uniform distribution U(1,1). Therefore, there are two structural breaks in the covariance structure of the common components covering cases (i) and (ii) discussed in Section 2.1.

The idiosyncratic errors ϵt follow a multivariate normal distribution Nd(0,Σϵ) independently over t, where ϕj, the square root of the jth diagonal element of Σϵ, is generated from an independent uniform distribution U(0.5,1.5), and the (i, j)-entry of Σϵ is ϕiϕj(0.5)|ij| for 1ijd. There are three breaks, η1e=n/4=100,η2e=n/2=200 and η3e=3n/4=300, in the idiosyncratic error components. At each of the break points, we swap the orders of ϱ1ed/2 randomly selected pairs of elements of ϵt with ϱ1e chosen as 1, 0.5 and 0.1. Note that ϱ1e=0.1 indicates that the structural breaks are sparse in the high-dimensional error components, whereas ϱ1e=1 indicates that the breaks are dense.

reports the simulation result, where “W(S)BS-Cov” denotes our proposed methods and “BCF” denotes the method proposed by Barigozzi, Cho, and Fryzlewicz (Citation2018) which combines the wavelet-based transformation and the double-CUSUM method. We find that the proposed WBS-Cov method has better finite-sample performance in detecting breaks in the common components with more accurate estimated break number and the higher ACU, whereas “BCF” tends to neglect the first break and thus under-estimate the number of breaks. For the break detection in the idiosyncratic error components, when ϱ1e=1, the proposed WSBS-Cov method and “BCF” perform equally well. However, when ϱ1e=0.5 and 0.1, the WSBS-Cov method clearly outperforms “BCF.”

Table 1 Comparison of break detection results.

We next consider an example with breaks in the weak factor structure. When factor loadings for the transformed factor model in (2.4) satisfies μk(ΛΛ)=oP(d), we call the kth factor as a weak factor. Although our asymptotic framework (see Assumption 2(ii)) rules out this scenario, we provide the following simulation to examine the effect of weak factors to the proposed break detection methods.

Example 5.2.

We use model (5.1) with a setting similar to Freyaldenhoven (2021) to generate the data, where the number of factors is r = 9, θ=1.5, the sample size is n = 400, and the dimension is d = 200. The factor process Ft is generated from a multivariate normal distribution N9(0,I9) independently over t. The jth factor only affects dj elements of the d-dimensional observation vector, indicating that the factor loading matrix is sparse with dj nonzero entries in its jth column. The position of nonzero entries are randomly selected, and their values are drawn from N(1,1) independently. We set (d1,,d9)=(d,d0.85,d0.75,d2/3,d2/3,d0.6,d1/3,d1/4, log(d)).

We set two breaks for the common components: η1c=100 and η2c=300 and consider nine cases in the simulation. Case j represents a set-up in which breaks only occur on the jth column of the factor loading matrix. Specifically, independently for the three periods: 1t100,101t300, and 301t400, the position of nonzero entries in the jth factor loading vector is randomly selected with its value independently regenerated. For the idiosyncratic error components, as in Freyaldenhoven (2021), we let ϵt=Aϵet and generate et=(et1,,etd) byeti=ρeet1,i+(1ρe2)1/2vti,vti=ρvvt1,i+(1ρv2)1/2ue,ti,where ue,ti are independently drawn from N(0,1), and (ρe,ρv)=(0.3,0.1). We set two break points on Aϵ: η1e=150 and η2e=250. For 1t150, we first generate a tridiagonal matrix A˜ϵ with all diagonal entries being one and the other nonzero entries drawn from U(0.5,1.5), and then standardize A˜ϵ by letting each row multiply a constant so that the row sum of squares is one to define Aϵ. The matrix Aϵ is randomly regenerated in the same way for 151t250 and 251t400.

As shown by Freyaldenhoven (2021), the classic PCA estimators are consistent only for the “relevant” factors whose corresponding dj is of an order larger than d1/2. Hence, the number of relevant factors is 6 in the simulated weak factor model. By the factor model transformation mechanism in Proposition 2.1, we expect that the number of factors should be 8 for cases 1–6 (6 relevant factors plus 2 extra factors due to factor transformation accommodating breaks), and 6 in cases 7–9. reports the simulation results for the break detection. We find that the factor number is over-estimated in all cases. The break numbers and locations in both the common and idiosyncratic components are correctly estimated in general when breaks occur in the relevant factors which are relatively strong (cases 1–3). The break detection results are less accurate when there are breaks in loadings of relevant factors which are relatively weak (cases 4–6). In cases 7–9, since weak factors with breaks in the loadings are not selected as the relevant factors via the information criterion, the proposed WBS-Cov method cannot detect either of the two breaks in the common components.

Table 2 Detection results with breaks on weak factors.

6 An Empirical Application

In this section, we use the proposed methods to detect breaks in the contemporaneous covariance structure of the daily returns of S&P 500 constituents.

6.1 Break Detection in Covariances

The data are retrieved from Refinitiv Datastream database (formerly Thomson Reuters Datastream), covering the time period from January 01, 2000 to February 28, 2021. We consider 375 firms listed on the S&P 500 index over the entire time period. Let Xit denote the demeaned percentage price change (without reinvesting of dividends) of equity i at time t, where i=1,,375 and t=1,,5322. We fit the factor model (2.2) to the data with Xt=(X1t,,X375,t), allowing structural breaks in the covariance matrices of both the common and idiosyncratic components.

The information criterion introduced in Section 2.2 selects nine factors (with appropriate transformation), that is, q̂=9. We set Δn=20 and Mnc=Mne=1000 in the WBS-Cov and WSBS-Cov algorithms. Combined with SSIC, the WBS-Cov algorithm detects five breaks in the covariance matrix of the common components, and the WSBS-Cov algorithm detects three breaks in the idiosyncratic components. We plot the estimated break times on the series of the S&P 500 index and its returns in , where the red vertical lines denote breaks in the common components and the blue dotted vertical lines denote breaks in the idiosyncratic components. Among the five breaks in the common components, two occur during the period of a bearish market from 2000 to 2003, two occur in 2008 and 2009 during the global financial crisis, and one occurs recently due to the COVID-19 outbreak. Among the three breaks in the idiosyncratic errors, the first break occurs around the end of 2002, the second and third breaks occur in 2008 and 2009.

Fig. 1 The detected break times in the covariance structure of the 375 daily returns of the S&P 500 constituents from January 01, 2000 to February 28, 2021. The estimated break dates for common components are April 24, 2000, November 25, 2002, September 12, 2008, May 08, 2009 and March 06, 2020 (in red lines); and the estimated break dates for idiosyncratic components are August 06, 2002, September 11, 2008 and May 11, 2009 (in blue dotted lines).

Fig. 1 The detected break times in the covariance structure of the 375 daily returns of the S&P 500 constituents from January 01, 2000 to February 28, 2021. The estimated break dates for common components are April 24, 2000, November 25, 2002, September 12, 2008, May 08, 2009 and March 06, 2020 (in red lines); and the estimated break dates for idiosyncratic components are August 06, 2002, September 11, 2008 and May 11, 2009 (in blue dotted lines).

6.2 Calibration

We next provide a simulation using the factor model with calibrated parameters from the empirical study, and aim to further examine the proposed break detection methods in finite samples.

  • Generate a simulated dataset with n = 5322 and d = 375 as the real data.

  • Consider six sub-periods separated by the five estimated break dates in the common components (see the red lines in ) and use the PCA estimated factor loadings. The factors are drawn from a standard normal distribution independently over time and across dimension.

  • Divide the estimated idiosyncratic errors into four segments split by the three estimated breaks in idiosyncratic components (see the blue dotted lines in ), and separately estimate their sample covariance matrices. For each segment, generate idiosyncratic errors by a multivariate normal distribution independently over time with mean zero and covariance matrix being the sample covariance matrix.

The above generating mechanism guarantees the simulated covariance structure to be the same as that for the empirical data. As in the simulation study in Section 5, we repeat the simulation procedure 100 times and report the break detection results. For simplicity, we fix ξne=95.99 in WSBS-Cov, which is calculated by applying the method in Section 4 to the real data. reports the average break number estimates, the number of times that the estimated break is within a distance of log(n)8.6 to the true break, denoted as ACU, and the root-mean-squared distance of the closest estimated break to the true break, denoted as RMSE. Except for the first two breaks (April 24, 2000 and November 25, 2002) in the common components, ACUs are larger than 95% and RMSEs are smaller than 4, indicating that the break dates are accurately detected. The low ACUs and large RMSEs of the first two breaks may be due to the relatively weak break magnitude. The estimated break numbers reported in are slightly larger than the true break numbers (5 for the common components and 3 for the idiosyncratic components). In addition, we also fix q0=12 in the break detection procedures and obtain similar simulation results, showing that the proposed methods work when the factor number is over-estimated.

Table 3 Detection results on calibrated data.

6.3 Comparison with Observed Risk Factors

As a benchmark, we also download eight observable risk factors (over January 01, 2000–February 28, 2021) from Kenneth R. French’s data libraryFootnote1 and study breaks in their covariance structure. These risk factors include the excess return (MKT), the size factor (SMB), the value factor (HML), the profitability factor (RMW) and the investment factor (CMA) introduced by Fama and French (Citation1993, Citation2015); the Momentum factor (MOM) proposed by Carhart (Citation1997) in a four-factor model; and the short term reversal factor (STRev) and the long-term reversal factor (LTRev) which are less famous but equally important (see De Bondt and Thaler Citation1987; Lehmann Citation1990). They are often used as observable proxies of latent factors in the approximate factor model. We plot patterns of observed factors in , from which we can see that they vary drastically during the global financial crisis and the COVID-19 outbreak. Let Zt=(Z1t,,Z8t) denote the vector containing the eight risk factors, t=1,,5322. As the dimension of Zt is low, there is no need to impose the approximate factor model framework. Using the WBS-Cov algorithm (without any latent factor structure) in Section 2.3, we detect five break dates: April 19, 2001, July 01, 2008, September 12, 2008, May 13, 2009 and March 06, 2020. Note that the first two estimated break dates (April 19, 2001 and July 01, 2008) are quite different from the first two break dates (April 24, 2000 and November 25, 2002) in the common components obtained by applying the proposed WBS-Cov to the estimated factors, see . shows the canonical correlation between the estimated factors and observed factors for the entire sample and six subsamples divided by the estimated break dates in the common components. p-values are reported in the brackets using the F-approximations of Rao’s test statistic related to Wilks’ Lambda (see Rao Citation1973). Some of the minimum canonical correlations are small and insignificant, indicating that the space spanned by the eight observed factors may be not the same as that spanned by the nine PCA estimated factors.

Fig. 2 Variation of eight commonly used risk factors from January 01, 2000 to February 28, 2021. The estimated break dates are April 19, 2001, July 01, 2008, September 12, 2008, May 13, 2009 and March 06, 2020 (in red lines).

Fig. 2 Variation of eight commonly used risk factors from January 01, 2000 to February 28, 2021. The estimated break dates are April 19, 2001, July 01, 2008, September 12, 2008, May 13, 2009 and March 06, 2020 (in red lines).

Table 4 Canonical correlation between estimated factors and observed factors.

To further analyze whether breaks in the common components can be explained by the observed factors, we run regression of stock returns on the eight observed factors and detect whether breaks exist in covariances of the regression residuals. We apply the factor model (2.2) to the residual vectors and use the information criterion to select six factors. Using WBS-Cov, we detect six breaks in the common components (of the residual vectors): April 20, 2000, November 22, 2002, July 07, 2008, September 26, 2008, May 08, 2009 and March 06, 2020, and one break in the idiosyncratic components which is August 12, 2002. Note that the first two breaks in the common components April 20, 2000 and November 22, 2002 are very close to April 24, 2000 and November 25, 2002, respectively, which are reported in . Hence, we may conclude that break detection of a low-dimensional observed risk factor cannot fully capture structural breaks in covariance of price changes in the entire stock market.

7 Conclusion

In this article, we detect and estimate multiple structural breaks in the contemporaneous covariance structure of the high-dimensional time series vector. We allow high correlation between the time series variables by imposing the approximate factor model, a commonly-used framework in economics and finance. The structural breaks occur in either the common or idiosyncratic error components. When there are breaks in the factor loadings or factor numbers, we transform the factor model in order to make the classic PCA method applicable and subsequently obtain estimates of the transformed factors and approximation of the idiosyncratic errors. The construction of the CUSUM quantities is based on the second-order moments of the estimated factors and errors. The WBS algorithm introduced by Fryzlewicz (Citation2014), combined with SSIC, is extended to the multivariate setting to detect breaks in covariance structure of the common components, estimating the location and number of breaks. For the idiosyncratic error components, we combine the WBS-Cov algorithm and the sparsified CUSUM statistic to detect structural breaks (which can be either sparse or dense), extending the SBS algorithm proposed by Cho and Fryzlewicz (Citation2015). Under some regularity conditions, we show consistency of the estimated break numbers and derive the near-optimal rates (up to a fourth-order logarithmic factor) for the estimated breaks. Monte Carlo simulation studies are conducted to examine the numerical performance of the proposed WBS-Cov and WSBS-Cov methods in finite samples and its comparison with the method introduced by Barigozzi, Cho, and Fryzlewicz (Citation2018). In addition, we examine performance of the proposed methods when breaks are on weak factors. In the empirical application to the return series of S&P 500 constituent, we detect five break in the common components and three breaks in the idiosyncratic error components by the developed methods.

Supplemental material

Supplemental Material

Download PDF (416.4 KB)

Acknowledgments

The authors would like to thank an Associate Editor and two reviewers for the constructive comments, which helped to improve the article.

Supplementary Materials

The supplementary materials contains the detailed proofs of the main asymptotic results, a simple motivating example for the factor model transformation as well as additional simulation studies.

Additional information

Funding

The second author’s research was partly supported by the BA/Leverhulme Small Research Grant funded by the Leverhulme Trust (No. SRG1920/100603). The third author’s research was supported by the Engineering and Physical Sciences Research Council (No. EP/L014246/1).

Notes

References

  • Alessi, L., Barigozzi, M., and Capasso, M. (2010), “Improved Penalization for Determining the Number of Factors in Approximate Factor Models,” Statistics and Probability Letters, 80, 1806–1813. DOI: 10.1016/j.spl.2010.08.005.
  • Andrews, D. W. (1993), “Tests for Parameter Instability and Structural Change with Unknown Change Point,” Econometrica, 61, 821–856. DOI: 10.2307/2951764.
  • Anderson, T. W. (2003), An Introduction to Multivariate Statistical Analysis, Wiley Series in Probability and Statistics (3rd ed.), New York: Wiley.
  • Aston, J. A., and Kirch, C. (2018), “High Dimensional Efficiency with Applications to Change Point Tests,” Electronic Journal of Statistics, 12, 1901–1947. DOI: 10.1214/18-EJS1442.
  • Aue, A., Hörmann, Horváth, L., and Reimherr, M. (2009), “Break Detection in the Covariance Structure of Multivariate Time Series Models,” Annals of Statistics, 37, 4046–4087.
  • Bai, J. (1997), “Estimating Multiple Breaks on at a Time,” Econometric Theory, 13, 315–352. DOI: 10.1017/S0266466600005831.
  • Bai, J., Han, X., and Shi, Y. 2020, “Estimation and Inference of Change Points in High-Dimensional Factor Models,” Journal of Econometrics, 219, 66–100 DOI: 10.1016/j.jeconom.2019.08.013.
  • Bai, J., and Li, K. (2012), “Statistical Analysis of Factor Models of High Dimension,” Annals of Statistics, 40, 436–465.
  • Bai, J., and Ng, S. (2002), “Determining the Number of Factors in Approximate Factor Models,” Econometrica, 70, 191–221. DOI: 10.1111/1468-0262.00273.
  • Bai, J., and Ng, S- (2006), “Confidence Intervals for Diffusion Index Forecasts and Inference for Factor-Augmented Regressions,” Econometrica, 74, 1135–1150.
  • Baltagi, B., Kao, C., and Wang, F. (2017), “Identification and Estimation of a Large Factor Model with Structural Instability,” Journal of Econometrics, 197, 87–100. DOI: 10.1016/j.jeconom.2016.10.007.
  • Baltagi, B., Kao, C., and Wang, F. (2021), “Estimating and Testing High Dimensional Factor Models with Multiple Structural Changes,” Journal of Econometrics, 220, 349–365.
  • Barigozzi, M., Cho, H., and Fryzlewicz, P. (2018), “Simultaneous Multiple Change-Point and Factor Analysis for High-Dimensional Time Series,” Journal of Econometrics, 206, 187–225. DOI: 10.1016/j.jeconom.2018.05.003.
  • Bickel, P., and Levina, E. (2008a), “Covariance Regularization by Thresholding,” Annals of Statistics, 36, 2577–2604.
  • Bickel, P., and Levina, E(2008b), “Regularized Estimation of Large Covariance Matrices,” Annals of Statistics, 36, 199–227.
  • Breitung, J., and Eickmeier, S. (2011), “Testing for Structural Breaks in Dynamic Factor Models,” Journal of Econometrics, 163, 71–84. DOI: 10.1016/j.jeconom.2010.11.008.
  • Cai, T. T., and Liu, W. (2011), “Adaptive Thresholding for Sparse Covariance Matrix Estimation,” Journal of the American Statistical Association, 106, 672–684. DOI: 10.1198/jasa.2011.tm10560.
  • Cai, T. T., Ren, Z., and Zhou, H. (2016), “Estimating Structured High-Dimensional Covariance and Precision Matrices: Optimal Rates and Adaptive Estimation,” Electronic Journal of Statistics, 10, 1–59.
  • Carhart, M. M. (1997), “On Persistence in Mutual Fund Performance,” Journal of Finance, 52, 57–82. DOI: 10.1111/j.1540-6261.1997.tb03808.x.
  • Chamberlain, G., and Rothschild, M. (1983), “Arbitrage, Factor Structure and Mean-Variance Analysis in Large Asset Markets,” Econometrica, 51, 1305–1324. DOI: 10.2307/1912276.
  • Chen, J., Li, D., Linton, O., and Lu, Z. (2018), “Semiparametric Ultra-High Dimensional Model Averaging of Nonlinear Dynamic Time Series,” Journal of the American Statistical Association, 113, 919–932. DOI: 10.1080/01621459.2017.1302339.
  • Chen, L., Dolado, J. J., and Gonzalo, J. (2013), “Detecting Big Structural Breaks in Large Factor Models,” Journal of Econometrics, 180, 30–48. DOI: 10.1016/j.jeconom.2014.01.006.
  • Chen, X., Xu, M., and Wu, W. (2013), “Covariance and Precision Matrix Estimation for High-Dimensional Time Series,” Annals of Statistics, 41, 2994–3021.
  • Chen, Z., and Leng, C. (2016), “Dynamic Covariance Models,” Journal of the American Statistical Association, 111, 1196–1207. DOI: 10.1080/01621459.2015.1077712.
  • Cheng, X., Liao, Z., and Schorfheide, F. (2016), “Shrinkage Estimation of High-Dimensional Factor Models with Structural Instabilities,” Review of Economic Studies, 83, 1511–1543. DOI: 10.1093/restud/rdw005.
  • Cho, H. (2016), “Change-Point Detection in Panel Data via Double CUSUM Statistic,” Electronic Journal of Statistics, 10, 2000–2038. DOI: 10.1214/16-EJS1155.
  • Cho, H., and Fryzlewicz P. (2012), “Multiscale and Multilevel Technique for Consistent Segmentation of Nonstationary Time Series,” Statistica Sinica, 22, 207–229. DOI: 10.5705/ss.2009.280.
  • Cho, H., and Fryzlewicz P. (2015), “Multiple Change-Point Detection for High-Dimensional Time Series via Sparsified Binary Segmentation,” Journal of the Royal Statistical Society, Series B, 77, 475–507.
  • Cho, H. (2016), “Change-Point Detection in Panel Data via Double CUSUM Statistic,” Electronic Journal of Statistics, 10, 2000–2038. DOI: 10.1214/16-EJS1155.
  • De Bondt, W. F., and Thaler, R. H. (1987), “Further Evidence on Investor Overreaction and Stock Market Seasonality,” Journal of Finance, 42, 557–581. DOI: 10.1111/j.1540-6261.1987.tb04569.x.
  • Duan, J., Bai, J., and Han, X. (2022), “Quasi-Maximum Likelihood Estimation of Break Point in High-Dimensional Factor Mmodels,” Journal of Econometrics, forthcoming. DOI: 10.1016/j.jeconom.2021.12.011.
  • El Karoui, N. (2008a), “Operator Norm Consistent Estimation of Large-Dimensional Sparse Covariance Matrices,” Annals of Statistics, 36, 2717–2756.
  • El Karoui, N. - (2008b), “Spectrum Estimation for Large Dimensional Covariance Matrices Using Random Matrix Theory,” Annals of Statistics, 36, 2757–2790.
  • Fama, E. F., and French, K. (1992), “The Cross-Section of Expected Stock Returns,” Journal of Finance, 47, 427–465. DOI: 10.1111/j.1540-6261.1992.tb04398.x.
  • Fama, E. F., and French, K. (1993), “Common Risk Factors in the Returns on Stocks and Bonds,” Journal of Financial Economics, 33, 3–56.
  • Fama, E. F., and French, K. (2015), “A Five-Factor Asset Pricing Model,” Journal of Financial Economics, 116, 1–22.
  • Fan, J., Liao, Y., and Liu, H. (2016), “An Overview on the Estimation of Large Covariance and Precision Matrices,” Econometrics Journal, 19, 1–32.
  • Fan, J., Liao, Y., and Mincheva, M. (2013), “Large Covariance Estimation by Thresholding Principal Orthogonal Complements,” (with discussion), Journal of the Royal Statistical Society, Series B, 75, 603–680. DOI: 10.1111/rssb.12016.
  • Freyaldenhoven, S. (2022), “Factor Models with Local Factors – Determining the Number of Relevant Factors,” Journal of Econometrics, 229, 80–102. DOI: 10.1016/j.jeconom.2021.04.006.
  • Fryzlewicz, P. (2014), “Wild Binary Segmentation for Multiple Change-Point Detection,” Annals of Statistics, 42, 2243–2281.
  • Fryzlewicz, P (2020), “Detecting Possibly Frequent Change-Points: Wild Binary Segmentation 2 and Steepest-Drop Model Selection,” Journal of the Korean Statistical Society, 49, 1027–1070.
  • Han X., and Inoue, A. (2015), “Tests for Parameter Instability in Dynamic Factor Models,” Econometric Theory, 31, 1117–1152. DOI: 10.1017/S0266466614000486.
  • Hariz, S. B., Wylie, J. J., and Zhang, Q. (2007), “Optimal Rate of Convergence for Nonparametric Change-Point Estimators for Nonstationary Sequences,” Annals of Statistics, 35, 1802–1826.
  • Huang, N., and Fryzlewicz, P. (2019), “NOVELIST Estimator of Large Correlation and Covariance Matrices and their Inverses,” Test, 28, 694–727. DOI: 10.1007/s11749-018-0592-4.
  • Jirak, M. (2015), “Uniform Change Point Tests in High Dimension,” Annals of Statistics, 43, 2451–2483.
  • Killick, R., Fearnhead, P., and Eckley, I. (2012), “Optimal Detection of Changepoints with a Linear Computational Cost,” Journal of the American Statistical Association, 107, 1590–1598. DOI: 10.1080/01621459.2012.737745.
  • Korkas, K. K., and Fryzlewicz, P. (2017), “Multiple Change-Point Detection for Non-stationary Time Series Using Wild Binary Segmentation,” Statistica Sinica, 27, 287–311. DOI: 10.5705/ss.202015.0262.
  • Korostelev, A.P. (1988), “On Minimax Estimation of a Discontinuous Signal,” Theory of Probability & Its Applications, 32, 727–730.
  • Lam, C., and Fan, J. (2009), “Sparsity and Rates of Convergence in Large Covariance Matrix Estimation,” Annals of Statistics, 37, 4254–4278.
  • Ledoit, O., and Wolf, M. (2004), “A Well-Conditioned Estimator for Large-Dimensional Covariance Matrices,” Journal of Multivariate Analysis, 88, 365–411. DOI: 10.1016/S0047-259X(03)00096-4.
  • Lehmann, B. N. (1990), “Fads, Martingales, and Market Efficiency,” Quarterly Journal of Economics, 105, 1–28. DOI: 10.2307/2937816.
  • Li, J., Todorov, V., Tauchen, G., and Lin, H. (2019), “Rank Tests at Jump Events,” Journal of Business & Economic Statistics, 37, 312–321.
  • Ma, S., and Su, L. (2018), “Estimation of Large Dimensional Factor Models with an Unknown Number of Breaks,” Journal of Econometrics, 207, 1–29. DOI: 10.1016/j.jeconom.2018.06.019.
  • Pourahmadi, M. (2013), High-Dimensional Covariance Estimation: With High-Dimensional Data, Hoboken, NJ: Wiley.
  • Rao, C. R (1973), Linear Statistical Inference and Its Applications, (2nd ed.), New York: Wiley.
  • Rothman, A. J., Levina, E., and Zhu, J. (2009), “Generalized Thresholding of Large Covariance Matrices,” Journal of the American Statistical Association, 104, 177–186. DOI: 10.1198/jasa.2009.0101.
  • Safikhani, A., and Shojaie, A. (2022), “Joint Structural Break Detection and Parameter Estimation in High-Dimensional Non-stationary VAR Models,” Journal of the American Statistical Association, 537, 251–264. DOI: 10.1080/01621459.2020.1770097.
  • Stock, J. H., and Watson, M. W. (2002), “Forecasting Using Principal Components from a Large Number of Predictors,” Journal of the American Statistical Association 97, 1167–1179. DOI: 10.1198/016214502388618960.
  • Venkatraman, E. S. (1992), “Consistency Results in Multiple Change-Point Problems,” Technical Report No. 24, Department of Statistics, Stanford University.
  • Vostrikova L. Y. (1981), “Detecting Disorder in Multidimensional Random Processes,” Soviet Mathematics Doklady, 24, 55–59.
  • Wang, D., Yu, Y., and Rinaldo, A. (2021), “Optimal Covariance Change Point Localization in High Dimensions,” Bernoulli, 27, 554–575. DOI: 10.3150/20-BEJ1249.
  • Wang, D., Yu, Y., Rinaldo, A., and Willett, R. (2019), “Localizing Changes in High-Dimensional Vector Autoregressive Processes,” Working paper. Available at https://arxiv.org/abs/1909.06359.
  • Wang, T., and Samworth, R. J. (2018), “High Dimensional Change Point Estimation via Sparse Projection,” Journal of the Royal Statistical Society, Series B, 80, 57–83. DOI: 10.1111/rssb.12243.
  • Wu, W., and Pourahmadi, M. (2003), “Nonparametric Estimation of Large Covariance Matrices of Longitudinal Data,” Biometrika, 90, 831–844. DOI: 10.1093/biomet/90.4.831.
  • Yu, M., and Chen, X. (2021), “Finite Sample Change Point Inference and Identification for High-Dimensional Mean Vectors,” Journal of the Royal Statistical Society, Series B, 83, 247–270. DOI: 10.1111/rssb.12406.

Appendix A:

Technical Assumptions

The following assumptions are needed to establish the asymptotic theorems in Sections 3.1 and 3.2. Some of the conditions (such as the moment conditions) may be not the weakest possible but can be relaxed at the cost of more lengthy proofs.

Assumption 1.

  1. For each k=1,,K1+1, the process {(Ft,k,ϵt)} is α-mixing dependent with the mixing coefficient αk(·) satisfyingmax1kK1+1αk(s)ρs,0<ρ<1.

  2. For k=1,,K1+1 and j=1,,d,E[Ft,k]=0,E[ϵt]=0 and E[ϵtjFt,k]=0. In addition,max1kK1+1maxηk1c+1tηkcE[||Ft,k||2δF]<,max1tnmax1jdE[|ϵtj|δϵ]<,

where δF>2 and δϵ>2.

Assumption 2.

  1. For k=1,,K1+1, the covariance matrix cov(Ft,k)=ΣF,k of the latent factors when ηk1c+1tηkc, is rk×rk positive definite. The positive integers K1 and rk, k=1,,K1+1, are all bounded.

  2. Letting λk,j0 be the jth factor loading vector of Λk0 defined in (2.2), ||λk,j0||2 is bounded from above uniformly over j=1,,d and k=1,,K1+1. In addition, letting Λ¯=[Λ10,,ΛK1+10] which is a d×k=1K1+1rk matrix, the column rank of Λ¯ is q0 and there exist positive constants μ¯ and μ¯ such that

0<μ¯μq0(1dΛ¯Λ¯)μ1(1dΛ¯Λ¯)μ¯<.(A.1)

Assumption 3.

  1. There exists a positive constant ι0 such that

max1kK1+1max1tnE[j=1dλk,j0ϵtj2δϵ]ι0·dδϵ/2(A.2)and

max1s,tnE[|j=1d{ϵsjϵtjE[ϵsjϵtj]}|δϵ]ι0·dδϵ/2.(A.3)

  1. There exists a positive constant ι1>0 such that

max1jdmax1tnE[exp{ι1ϵtj2}]<,max1kK1+1maxηk1c+1tηkcE[exp{ι1||Ft,k||22}]<.(A.4)

Assumption 4.

  1. There exist positive constants ι2, ι3, and ϑ such thatι2log5+ϑ(d)nι3dδ/(δ+4),

where δ=δFδϵ.

  1. Letting κnc=min1kK1+1(ηkcηk1c),ω¯nc=1d2·min1kK1Σk+10(Λ,F)Σk0(Λ,F)F2

and ω¯nc=1d2·max1kK1Σk+10(Λ,F)Σk0(Λ,F)F2, we haveκncn,ω¯ncω¯nc·κncω¯nclog4n,Mnc, where Mnc is the number of random intervals drawn in the WBS-Cov algorithm.
  1. Letting κne=min1kK2+1(ηkeηk1e),ω¯ne=min1kK2min(i,j)Ik|σk+1|i,jeσk|i,je|2,ω¯ne=max1kK2max(i,j)Ik|σk+1|i,jeσk|i,je|2,

where Ik={(i,j):σk+1|i,jeσk|i,je,1i,jd} and σk|i,je is the (i, j)-entry of Σk0(ϵ), we haveω¯neω¯ne,κneω¯nelog4(nd)·(κnen)4,[1(κne/(3n))2]Mne0,

where Mne is the number of random intervals drawn in the WSBS-Cov algorithm.

Assumption 5.

There exist 0<σ¯<σ¯< such thatσ¯min(l,u):1l<unmin1i,jdσ̂l,u(i,j)max(l,u):1l<unmax1i,jdσ̂l,u(i,j)σ¯with probability approaching one, where σ̂l,u(i,j) is defined in (2.12). In addition, σ̂l,u(i,j)=σ̂l,u(j,i).

The assumption of mixing dependence on the latent common factor and idiosyncratic error process in Assumption 1(i) is very mild and covers some commonly-used vector time series models. Note that the α-mixing dependence condition allows the underlying process to be nonstationary with time-varying covariance structure (e.g., the piece-wise stationary time series process with breaks in the covariance matrix). The moment conditions in Assumptions 1(ii) and 3 are crucial to consistently estimate the factors and factor loadings (with rotation) in the transformed factor model (2.4).

Assumption 2 contains some fundamental conditions for the approximate factor model (2.2), which can be seen as a generalized version of those for the stable factor model without breaks (e.g., Bai and Ng Citation2002; Fan, Liao, and Mincheva Citation2013). These conditions, together with κncn in Assumption 4(ii), indicate that there exists a nonsingular limiting matrix for 1nt=1nFtFt⋆⊺ and Λ is of full column rank (see Proposition 3.1), which are essential to derive the convergence result for the PCA estimated factors and factor loadings. In addition, Assumption 2 also guarantees that ω¯nc=O(1).

The moment conditions (A.2) and (A.3) in Assumption 3(i) imply that the idiosyncratic errors ϵtj are allowed to be weakly dependent over j. Similar high-level moment conditions can be found in Assumption 4 in Bai and Ng (Citation2002) and Assumption 3.4 in Fan, Liao, and Mincheva (Citation2013). Assumption 3(ii) is similar to the exponential-tail condition commonly used in high-dimensional data analysis and is needed to allow the dimension d to be divergent at an exponential rate of n.

Assumption 4(i) indicates that both d and n diverge to infinity jointly, which is a typical large panel data setting necessary for consistent estimates of the rotated factors and factor loadings. A very similar assumption is also used in Theorem 1 of Fan, Liao, and Mincheva (Citation2013). In particular, Assumption 4(i) shows that d can be divergent to infinity at an exponential rate of n, further relaxing the corresponding condition assumed in Barigozzi, Cho, and Fryzlewicz (Citation2018) and Wang, Yu, and Rinaldo (Citation2021). Although the restriction nι3dδ/(δ+4) indicates that n is of smaller asymptotic order than d, our methodology is still applicable if n is much larger than d by switching the role of n and d and the role of the factor loadings and common factors in the PCA estimation (see sec. 3 in Bai and Li 2012). Assumption 4(ii) is mainly used for the asymptotics of WBS-Cov in the common components, and would be automatically satisfied when ι4ω¯ncω¯ncι5 and Mnc=logn, where ι4<ι5 are two positive constants. Assumption 4(iii) is mainly used to derive the WSBS-Cov asymptotic theory in the idiosyncratic error components. The restriction ω¯neω¯ne is only imposed to facilitate the proofs. The restrictions on ω¯nc and ω¯ne are consistent with the argument in Andrews (Citation1993) and Bai (Citation1997) that the break points can be identified only if the size of breaks, that is, (ω¯nc)1/2 and (ω¯ne)1/2 in our article, is asymptotically larger than n1/2. Assumption 4(iii) also indicates that there might be more frequent breaks in the idiosyncratic error components than in the common components, and the restriction κneω¯nelog4(nd)·(κnen)4 would be automatically satisfied when κnenΘ and (ω¯nen5Θ4)/log4(nd) with Θ(4/5,1]. In contrast, the minimum length κnc of the subintervals separated by break points in the common components is of order n. Assumption 5 requires the random normalizers in the WSBS-Cov to be uniformly bounded away from zero and infinity with probability approaching one.

In Assumption 2(i), we assume that K1 and rk, k=1,,K1, are all bounded, which indicates that q0, the number of transformed factors in (2.4), is upper bounded by Proposition 2.1. This assumption facilitates proofs of the consistent estimation theory for the transformed factors and factor loadings and consistent break estimation theory. It would be an interesting extension by allowing the break numbers K1 and K2 to grow slowly as n. In this case, q0 would be divergent as well, and the condition of κncn would be violated. Consequently, we need to amend the estimation and break detection techniques, and modify some technical assumptions (such as Assumption 4) and the asymptotic proofs. We will further explore this in our future study.