Spatial-sign-based high-dimensional white noises test

Abstract

In this study, we explore the problem of hypothesis testing for white noise in high-dimensional settings, where the dimension of the random vector may exceed the sample sizes. We introduce a test procedure based on spatial-sign for high-dimensional white noise testing. This new spatial-sign-based test statistic is designed to emulate the test statistic proposed by Paindaveine and Verdebout [(2016). On high-dimensional sign tests. Bernoulli, 22(3), 1745–1769.], but under a more generalized scatter matrix assumption. We establish the asymptotic null distribution and provide the asymptotic relative efficiency of our test in comparison with the test proposed by Feng et al. [(2022). Testing for high-dimensional white noise. arXiv:2211.02964.] under certain specific alternative hypotheses. Simulation studies further validate the efficiency and robustness of our test, particularly for heavy-tailed distributions.

1. Introduction

In this paper, we consider testing for white noise, or serial correlation, which is a fundamental problem in statistical inference. For univariate time series, the well-known Box-Pierce portmanteau test and its variations are popular because they are convenient in practical applications (Li, 2003; Lütkepohl, 2005). Much effort has been devoted to extending these methods to multivariate time series, for example Hosking (1980) and Li and McLeod (1981). High-dimensional time series data now appear frequently in many applications, including finance and econometrics, biological and environmental research, etc., where the dimension of the time series is comparable to or even larger than its observed length. In this case, the traditional white noise tests above cannot be applied directly.

Recently, two types of omnibus tests have been proposed for the high-dimensional white noise testing problem. One is the max-type test. Chang et al. (2017) proposed a test statistic using the maximum absolute autocorrelations and cross-correlations of the component series. Tsay (2020) proposed a rank-based max-type test using the Spearman rank correlation. Chen et al. (2022) extended the work of Tsay (2020) to other types of rank-based correlations, such as Kendall's tau and Hoeffding's D statistic. It is well known that max-type tests perform well under sparse alternatives, where only a few autocorrelations are nonzero and large, but are less powerful under dense alternatives, where there are many small nonzero autocorrelations. Researchers have therefore proposed sum-type tests for the high-dimensional white noise problem. Li et al. (2019) proposed a test statistic using the sum of the squared singular values of several lagged sample autocovariance matrices. Feng et al. (2022) proposed a new sum-type test statistic by excluding some terms from the statistic of Li et al. (2019) and showed that it has better size performance. However, both sum-type tests are based on the independent component model, which only allows the underlying distribution of the time series to be light-tailed. Unfortunately, the light-tailed assumption may not be appropriate for many applications, such as stock returns. Thus, we need to construct a robust high-dimensional white noise test procedure for heavy-tailed distributions.

The classic spatial-sign-based procedures are very robust and efficient in traditional multivariate analysis; see Oja (2010) for an overview. Recent work shows that spatial-sign-based procedures also perform very well in high-dimensional settings. Wang et al. (2015), Feng and Sun (2016) and Feng et al. (2021) proposed spatial-sign-based test procedures for the high-dimensional one-sample location problem. Feng et al. (2016) and Huang et al. (2023) considered the high-dimensional two-sample location problem. Zou et al. (2014) and Feng and Liu (2017) extended the spatial-sign-based method to the high-dimensional sphericity test. Spatial-sign-based test procedures for the high-dimensional alpha test in factor pricing models were proposed by Liu et al. (2023), Zhao et al. (2022) and Zhao (2023). In an important work, Paindaveine and Verdebout (2016) proposed a spatial-sign-based test for i.i.d.-ness against serial dependence. However, they assume that the random vectors have independent spherical directions, which is too restrictive in applications. In practice, there is usually some correlation among the components of the random vectors. We therefore propose a new spatial-sign-based test procedure for the high-dimensional white noise test in this article. Under the elliptically symmetric distribution assumption, we establish the asymptotic normality of the proposed test statistic under the null hypothesis and under a special alternative hypothesis. We also show that the asymptotic relative efficiency of our method with respect to the test proposed by Feng et al. (2022) equals the corresponding asymptotic relative efficiency of spatial-sign-based methods with respect to least-squares-based procedures in high-dimensional settings (Feng & Sun, 2016; Liu et al., 2023; Wang et al., 2015; Zhao, 2023). Simulation studies also demonstrate the superiority of our method for heavy-tailed distributions.

This paper is organized as follows. In Section 2, we introduce our proposed spatial-sign-based test procedure for the high-dimensional white noise test and establish the theoretical results. Simulation studies are presented in Section 3. An application involving real data is presented in Section 4. The conclusion and additional discussion are provided in Section 5. All technical details are collected in the Appendix.

2. Test procedure

Let $ε_{1}, \dots, ε_{n}$ be a p-dimensional weakly stationary time series with mean zero. We consider the following testing problem: $H_{0} : \{ε_{t}\} \text{ is white noise} \quad \text{vs.} \quad H_{1} : \{ε_{t}\} \text{ is not white noise},$ (1) where the dimension p of the time series is comparable to or even greater than the sample size n. In this context, we define a time series $\{ε_{t}\}$ as white noise if the elements $ε_{t}$ are all independent and identically distributed. This definition differs from those provided in Shekhar et al. (2003) and Cai and Kwan (2022). Under the null hypothesis, $E (ε_{t} ε_{t + k}^{⊤}) = 0$ for any lag $k \geq 1$. So, Li et al. (2019) proposed the following test statistic: $G_{H} = \sum_{h = 1}^{H} tr ({\hat{S}}_{h}^{⊤} {\hat{S}}_{h}), \quad {\hat{S}}_{h} = \frac{1}{n} \sum_{t = h + 1}^{n} ε_{t} ε_{t - h}^{⊤} .$ They established the asymptotic normality of $G_{H}$ by random matrix theory. Feng et al. (2022) removed the diagonal elements $ε_{i}^{⊤} ε_{i} ε_{i + h}^{⊤} ε_{i + h}$ from the summation and proposed the following test statistic: $T_{FLM} = \frac{1}{n (n - 1)} \sum_{h = 1}^{H} \underset{s \neq t}{\sum \sum} ε_{t}^{⊤} ε_{s} ε_{t + h}^{⊤} ε_{s + h} .$ They also established the asymptotic normality of $T_{FLM}$ by the martingale central limit theorem. Both tests require the assumption of an independent component model, that is, $ε_{t} = S^{1 / 2} z_{t}$, where $z_{t} = (z_{t 1}, \dots, z_{tp})^{⊤}$ is a sequence of independent random vectors of dimension p with independent components. However, a common drawback of the independent component model is its inability to handle many well-known heavy-tailed distributions, such as the multivariate Student t and mixtures of multivariate normal distributions. Thus, we need to propose a robust and efficient test procedure for heavy-tailed distributions.

Under the assumption that $ε_{t}$ have independent spherical directions, Paindaveine and Verdebout (2016) proposed a standardized test statistic $T_{PV} = \frac{\sqrt{2 p^{2}}}{\sqrt{H}} \sum_{h = 1}^{H} \frac{1}{n - h} \sum_{h + 1 \leq s < t \leq n} U_{s - h}^{⊤} U_{t - h} U_{s}^{⊤} U_{t},$ (2) where $U_{t} = U (ε_{t})$ and $U (x) = \frac{x}{\| x \|} I (x \neq 0)$ . They showed that $T_{PV} \overset{d}{\to} N (0, 1)$ as $n, p \to \infty$ under the null hypothesis. However, the assumption of independent spherical directions often does not hold in practice. In addition, they did not give the power function of $T_{PV}$ under the alternative hypothesis. So we need to establish the theoretical results of the spatial-sign-based test statistic under a more general scatter matrix assumption.

We consider the following test statistic: $T_{S} = \sum_{h = 1}^{H} \frac{1}{n - h} \sum_{h + 1 \leq s < t \leq n} U_{s - h}^{⊤} U_{t - h} U_{s}^{⊤} U_{t},$ (3) which mimics the test statistic (2). Next, we will show that the asymptotic variance of $T_{S}$ under the null hypothesis is $\frac{H}{2} {tr}^{2} (Ω^{2})$ , where $Ω = E (U_{t} U_{t}^{⊤})$ . If $ε_{t}$ have independent spherical directions, then $Ω = p^{- 1} I_{p}$ and $tr (Ω^{2}) = p^{- 1}$ , so this variance equals $\frac{H}{2 p^{2}}$ , which matches the standardization in (2).

We need the following conditions.

(C1)	(Error Distribution) The error vectors $ε_{1}, \dots, ε_{n}$ are i.i.d. from the p-variate mean zero elliptical distribution with probability density function: $det (Ξ)^{- 1 / 2} g (‖ Ξ^{- 1 / 2} ε ‖), ε \in R^{p},$ where $Ξ$ is a positive-definite scatter matrix.
(C2)	(Covariance Matrix) $tr (Σ^{4}) = o ({tr}^{2} (Σ^{2}))$ and $\frac{{tr}^{4} (Σ)}{{tr}^{2} (Σ^{2})} \exp \{- \frac{{tr}^{2} (Σ)}{128 p λ_{max}^{2} (Σ)}\} \to 0$ , where $Σ = Cov (ε_{t}) ≐ (σ_{ij})_{1 \leq i, j \leq p}$ and $λ_{max} (Σ)$ is the largest eigenvalue of $Σ$ .

Under Condition (C1), $ε_{i}$ can be decomposed as $Ξ^{1 / 2} R_{i} u_{i}$ , where $u_{i}$ is a random vector uniformly distributed on the unit sphere in $R^{p}$ and $R_{i}$ is a nonnegative random variable independent of $u_{i}$ . The covariance matrix can be written as $Σ = p^{- 1} E (R_{i}^{2}) Ξ$ . Condition (C2) is the same as Conditions (C1) and (C2) in Wang et al. (2015). If the eigenvalues of $Σ$ are all bounded, Condition (C2) holds.
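The stochastic representation under Condition (C1) can be illustrated numerically. The following is a minimal sketch, not the authors' code: the function name `sample_elliptical` and the `radius_sampler` argument are hypothetical, and the symmetric square root of $Ξ$ is computed by an eigendecomposition.

```python
import numpy as np

def sample_elliptical(n, xi, radius_sampler, rng):
    """Draw n vectors eps = Xi^{1/2} R u with u uniform on the unit sphere.

    xi             : (p, p) positive-definite scatter matrix.
    radius_sampler : hypothetical callable (n, rng) -> n nonnegative radii R,
                     drawn independently of the directions u.
    """
    p = xi.shape[0]
    # A normalized standard normal vector is uniform on the unit sphere.
    g = rng.standard_normal((n, p))
    u = g / np.linalg.norm(g, axis=1, keepdims=True)
    r = np.asarray(radius_sampler(n, rng), dtype=float).reshape(n, 1)
    # Symmetric square root of Xi from its eigendecomposition.
    w, v = np.linalg.eigh(xi)
    root = (v * np.sqrt(w)) @ v.T
    return (r * u) @ root.T
```

With radii distributed as the square root of a chi-square variable with p degrees of freedom, $E (R_{i}^{2}) = p$ and the sample covariance should be close to $Ξ$ itself, in line with $Σ = p^{- 1} E (R_{i}^{2}) Ξ$ .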

Theorem 2.1

Under Conditions (C1)-(C2), we have $T_{S} / σ_{S} \overset{d}{\to} N (0, 1)$ where $σ_{S}^{2} = \frac{H}{2} {tr}^{2} (Ω^{2})$ .

We then estimate $tr (Ω^{2})$ as follows: $\widehat{tr (Ω^{2})} = \frac{2}{n (n - 1)} \sum_{1 \leq s < t \leq n} (U_{s}^{⊤} U_{t})^{2} .$ By Proposition 1 of Zhao (2023), we have $\widehat{tr (Ω^{2})} / tr (Ω^{2}) \overset{p}{\to} 1$ under the null hypothesis as $n, p \to \infty$ . So, by Theorem 2.1, we reject the null hypothesis if $T_{S} / {\hat{σ}}_{S} > z_{α}$ , where ${\hat{σ}}_{S}^{2} = \frac{H}{2} \{\widehat{tr (Ω^{2})}\}^{2}$ and $z_{α}$ is the upper α quantile of the standard normal distribution.
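The rejection rule above can be sketched in a few lines of NumPy. This is an illustrative implementation under the stated definitions of $T_{S}$ , the estimator of $tr (Ω^{2})$ and ${\hat{σ}}_{S}$ ; the function names `spatial_sign` and `ss_white_noise_test` are assumptions and do not come from the paper.

```python
import numpy as np
from math import erfc, sqrt

def spatial_sign(X):
    """Row-wise spatial sign U(x) = x / ||x||, with U(0) = 0."""
    norms = np.linalg.norm(X, axis=1, keepdims=True)
    safe = np.where(norms > 0, norms, 1.0)
    return np.where(norms > 0, X / safe, 0.0)

def ss_white_noise_test(X, H=1):
    """Return (T_S, T_S / sigma_hat, one-sided p-value) for an (n, p) array X."""
    n = X.shape[0]
    U = spatial_sign(X)
    G = U @ U.T  # G[s, t] = U_s' U_t
    T_S = 0.0
    for h in range(1, H + 1):
        # Entry [a, b] is (U_a' U_b)(U_{a+h}' U_{b+h}); keep pairs with a < b.
        prod = G[:-h, :-h] * G[h:, h:]
        T_S += np.triu(prod, k=1).sum() / (n - h)
    # Estimator of tr(Omega^2): average of squared inner products over s < t.
    iu = np.triu_indices(n, k=1)
    tr_omega2_hat = 2.0 / (n * (n - 1)) * np.sum(G[iu] ** 2)
    sigma_hat = sqrt(H / 2.0) * tr_omega2_hat
    z = T_S / sigma_hat
    p_value = 0.5 * erfc(z / sqrt(2.0))  # upper-tail standard normal probability
    return T_S, z, p_value
```

The null hypothesis is rejected at level α when the returned p-value is below α. Because only the directions $U_{t}$ enter the statistic, rescaling each observation by a positive constant leaves the result unchanged.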

Next, we consider the power function of our test procedure. Specifically, we consider the following alternative hypothesis: $H_{1} : ε_{t} = A_{0} r_{t} u_{t} + A_{1} r_{t - 1} u_{t - 1},$ (4) where $u_{t}$ is a random vector uniformly distributed on the unit sphere in $R^{p}$ and $r_{t}$ is a nonnegative random variable independent of $u_{t}$ . Let $Σ_{0} = A_{0}^{⊤} A_{0}$ , $Σ_{1} = A_{1}^{⊤} A_{1}$ and $Σ_{01} = A_{0}^{⊤} A_{1}$ . We also assume the following condition for $A_{0}$ and $A_{1}$ .

(C3)	The eigenvalues of $Σ_{0}$ are all bounded and $tr (Σ_{1}) = O (p / n)$ , $tr (Σ_{1}^{2}) = O (p / n)$ , $tr (Σ_{0} Σ_{1}) = O (p / n)$ .

Theorem 2.2

Under $H_{1}$ in (4) with Condition (C3) holding, if $p / n \to γ \in (0, \infty)$ , we have, for H = 1, $\frac{T_{S} - \frac{1}{2} c_{1}^{2} ω^{4} n p^{- 2} tr (Σ_{0} Σ_{1})}{\sqrt{1 / 2} p^{- 2} ω^{4} tr (Σ_{0}^{2})} \overset{d}{\to} N (0, 1),$ where $c_{1} = E (r_{t}) E (r_{t}^{- 1})$ and $ω = p^{- 1 / 2} {tr}^{1 / 2} (Σ_{0})$ .

Note that under Condition (C3), we have $tr (Ω^{2}) = \frac{tr (Σ_{0}^{2})}{{tr}^{2} (Σ_{0})} (1 + o (1))$ under the null hypothesis. So, by Theorem 2.2, the power function of $T_{S}$ is $β_{S} = \lim_{n, p \to \infty} Φ (- z_{α} + \frac{c_{1}^{2} n tr (Σ_{0} Σ_{1})}{\sqrt{2} tr (Σ_{0}^{2})}) .$ In addition, according to Theorem 5 in Feng et al. (2022), the power function of the sum-type test proposed by Feng et al. (2022) is $β_{FLM} = \lim_{n, p \to \infty} Φ (- z_{α} + \frac{n E^{2} (r_{t}) tr (Σ_{0} Σ_{1})}{\sqrt{2} E (r_{t}^{2}) tr (Σ_{0}^{2})})$ under Condition (C3). Thus, the asymptotic relative efficiency of our SS test with respect to the FLM test is $ARE (SS, FLM) = \lim_{p \to \infty} E^{2} (r_{t}^{- 1}) E (r_{t}^{2}) \geq \lim_{p \to \infty} \{E (r_{t}) E (r_{t}^{- 1})\}^{2} \geq 1$ by the Cauchy inequality. Next, we consider three special distributions for $ε_{t}$ .

$ε_{t} \sim N (0, I_{p})$ . So, $r_{t} E (r_{t}^{- 1}) \overset{p}{\to} 1$ and then $ARE (SS, FLM) = 1$ .
$ε_{t} \sim t_{p} (0, I_{p}, v)$ . So $E (r_{t}^{- 1}) = \frac{Γ \{(v + 1) / 2\}}{v^{1 / 2} Γ (v / 2)} \frac{Γ \{(p - 1) / 2\}}{Γ (p / 2)}, \quad E (r_{t}^{2}) = \frac{pv}{v - 2}$ and then $ARE (SS, FLM) = \frac{2}{v - 2} \{\frac{Γ ((v + 1) / 2)}{Γ (v / 2)}\}^{2} > 1.$ For v = 3, this value is about 2.54; for v = 4, it is about 1.76; as $v \to \infty$ (the multivariate normal distribution), it converges to one.
$ε_{t} \sim (1 - v) N (0, I_{p}) + vN (0, σ^{2} I_{p})$ . So $E (r_{t}^{- 1}) = \frac{\{v + (1 - v) / σ\} \{v + (1 - v) σ^{2}\}^{1 / 2}}{2^{1 / 2}} \frac{Γ \{(p - 1) / 2\}}{Γ (p / 2)}, \quad E (r_{t}^{2}) = p (1 - v + v σ^{2})$ and then $ARE (SS, FLM) = \frac{1 + v (1 - v) (σ - σ^{- 1})^{2}}{1 + v (1 - v) (1 - σ^{- 1})^{2}} > 1.$
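The closed-form efficiency for the multivariate t case can be evaluated directly. The sketch below only codes the expression $\frac{2}{v - 2} \{Γ ((v + 1) / 2) / Γ (v / 2)\}^{2}$ stated above; the function name `are_t` is illustrative, and log-gamma values are used so that large degrees of freedom do not overflow.

```python
from math import exp, lgamma

def are_t(v):
    """ARE(SS, FLM) for the multivariate t distribution with v > 2 degrees of freedom."""
    if v <= 2:
        raise ValueError("the expression requires v > 2 (finite second moment)")
    # Ratio Gamma((v + 1) / 2) / Gamma(v / 2), computed on the log scale.
    log_ratio = lgamma((v + 1) / 2.0) - lgamma(v / 2.0)
    return 2.0 / (v - 2.0) * exp(2.0 * log_ratio)

# Evaluate the efficiency for a few degrees of freedom.
for v in (3, 4, 10, 100):
    print(v, are_t(v))
```

The printed values decrease towards one as v grows, which reflects the fact that the two tests are asymptotically equally efficient in the Gaussian limit.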

3. Simulations

We compare our method with the max-type test proposed by Chang et al. (2017) (abbreviated as MAX), and with the sum-type test (abbreviated as FLM) and Fisher's combined probability test (abbreviated as FC) proposed by Feng et al. (2022). First, we consider the null hypothesis. To verify the robustness of the proposed testing method, we consider the following three scenarios for $ε_{t}$ :

(I) Multivariate normal distribution. $ε_{t} \overset{iid}{\sim} N (0, Σ)$ ;
(II) Multivariate t-distribution. $ε_{t} \overset{iid}{\sim} t (0, Σ, 3)$ ;
(III) Multivariate mixture normal distribution. The $ε_{t}$ 's are independently generated from $γ f_{p} (0, Σ) + (1 - γ) f_{p} (0, 9 Σ)$ , denoted by ${MN}_{p, γ, 9} (0, Σ)$ , where $f_{p} (\cdot, \cdot)$ is the density function of the p-variate normal distribution and γ is chosen to be 0.8.

Here, $Σ = (σ_{ij})_{1 \leq i, j \leq p}$ with $σ_{ii} = 1$ for $i = 1, \dots, p$ and $σ_{ij} = \frac{1}{2} (i - j)^{- 2}$ for $i \neq j$ . Table 1 reports the empirical sizes of the SS, MAX, FLM and FC tests with n = 100, 200 and p = 40, 80, 100. From Table 1, we observe that both the FLM and SS tests control the empirical sizes in most cases. However, the empirical sizes of the MAX test are a little conservative under the multivariate normal distribution and a little larger than the nominal level under the multivariate t-distribution. The empirical sizes of the FC test behave in the same way as those of the MAX test.
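The three null scenarios can be generated as in the following sketch. The function names `make_sigma` and `sample_errors` and the scenario labels are assumptions made for illustration. The multivariate t draws use the usual normal over root chi-square construction, and the mixture rescales a normal draw by 3 with probability 1 − γ, which corresponds to the covariance $9 Σ$ .

```python
import numpy as np

def make_sigma(p):
    """Sigma with unit diagonal and sigma_ij = 0.5 * (i - j)^(-2) off the diagonal."""
    idx = np.arange(p)
    d = np.abs(idx[:, None] - idx[None, :]).astype(float)
    sigma = np.ones((p, p))
    off = d > 0
    sigma[off] = 0.5 / d[off] ** 2
    return sigma

def sample_errors(n, p, scenario, rng, sigma=None, gamma=0.8):
    """Draw an (n, p) i.i.d. sample; scenario is 'normal', 't3' or 'mixture'."""
    if sigma is None:
        sigma = make_sigma(p)
    chol = np.linalg.cholesky(sigma)
    Z = rng.standard_normal((n, p)) @ chol.T  # rows are N(0, Sigma)
    if scenario == "normal":
        return Z
    if scenario == "t3":
        # Multivariate t with 3 degrees of freedom: normal / sqrt(chi2_3 / 3).
        w = rng.chisquare(3, size=(n, 1)) / 3.0
        return Z / np.sqrt(w)
    if scenario == "mixture":
        # With probability gamma keep N(0, Sigma); otherwise scale to N(0, 9 Sigma).
        scale = np.where(rng.random((n, 1)) < gamma, 1.0, 3.0)
        return Z * scale
    raise ValueError("scenario must be 'normal', 't3' or 'mixture'")
```

A size experiment would call `sample_errors` repeatedly for each (n, p) pair and record how often a test rejects at the nominal level.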

Table 1. Size performance of different tests.

Next, we compare the empirical power performance of the above four tests. We consider three models for $ε_{t}$ :

VAR(1) model: $ε_{t} = A ε_{t - 1} + z_{t}$ ;
VMA(1) model: $ε_{t} = z_{t} + A z_{t - 1}$ ;
VARMA(1) model: $ε_{t} = 0.5 A ε_{t - 1} + z_{t} + 0.5 A z_{t - 1}$ .

Here, ‘VAR(1)’, ‘VMA(1)’ and ‘VARMA(1)’ abbreviate the first-order vector autoregressive process, vector moving average process and vector autoregressive moving average process, respectively. The innovations $z_{t}$ are generated from Scenarios (I)–(III) with $Σ = I_{p}$ . Let $A = (a_{ij})_{1 \leq i, j \leq p}$ . We consider the alternative hypothesis with $a_{ij} \neq 0$ for $1 \leq i, j \leq m$ and $a_{ij} = 0$ otherwise. Here, m controls the signal strength and sparsity of $A$ . We consider two cases for m : (1) dense case: $m = [0.8 p]$ and $a_{ij} \sim U (- \frac{1}{4 \sqrt{m}}, \frac{1}{4 \sqrt{m}})$ for $1 \leq i, j \leq m$ ; (2) sparse case: $m = [0.05 p]$ and $a_{ij} \sim U (- \frac{3}{4 \sqrt{m}}, \frac{3}{4 \sqrt{m}})$ for $1 \leq i, j \leq m$ . Table 2 reports the empirical power of the four tests with n = 200, p = 80. Under the multivariate normal distribution, the performance of the SS test is similar to that of the FLM test, which is consistent with the theoretical result. In the dense case, the sum-type tests, SS and FLM, are more powerful than the MAX and FC tests. However, the MAX and FC tests outperform the SS and FLM tests in the sparse case. For the heavy-tailed distributions, our SS test performs better than the FLM test, which shows the advantage of the spatial-sign-based method. In addition, we found that the powers of the SS and FLM tests with H = 1 are larger than those with H = 2, 3. This is not surprising because the alternatives considered here are of order one. How to choose the best H in the general case deserves further study.
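The alternatives described above can be simulated along the following lines. This is a sketch under stated assumptions rather than the authors' simulation code: the names `make_coefficient_matrix` and `simulate_series` are hypothetical, $[x]$ is read as the integer part of x, and a burn-in period (not mentioned in the text) is discarded so that the autoregressive recursions start close to stationarity.

```python
import numpy as np

def make_coefficient_matrix(p, case, rng):
    """A with a nonzero m x m leading block; 'dense' or 'sparse' case."""
    if case == "dense":
        frac, c = 0.8, 1.0 / 4.0
    elif case == "sparse":
        frac, c = 0.05, 3.0 / 4.0
    else:
        raise ValueError("case must be 'dense' or 'sparse'")
    m = int(np.floor(frac * p + 1e-9))  # assumption: [x] denotes the integer part
    if m < 1:
        raise ValueError("p is too small for a nonzero block")
    bound = c / np.sqrt(m)
    A = np.zeros((p, p))
    A[:m, :m] = rng.uniform(-bound, bound, size=(m, m))
    return A

def simulate_series(model, A, Z, burn_in=50):
    """Build the series from innovations Z (rows z_0, z_1, ...); drop a burn-in."""
    total, p = Z.shape
    out = np.empty((total - 1, p))
    prev = np.zeros(p)
    for t in range(1, total):
        if model == "VAR":
            cur = A @ prev + Z[t]
        elif model == "VMA":
            cur = Z[t] + A @ Z[t - 1]
        elif model == "VARMA":
            cur = 0.5 * (A @ prev) + Z[t] + 0.5 * (A @ Z[t - 1])
        else:
            raise ValueError("model must be 'VAR', 'VMA' or 'VARMA'")
        out[t - 1] = cur
        prev = cur
    return out[burn_in:]
```

To obtain n observations, pass an innovation array with n + burn_in + 1 rows; for example, n = 200 with the default burn-in requires 251 rows of innovations.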

Table 2. Power performance of different tests with n = 200, p = 80.

4. Application

In this section, we are interested in testing whether the error series $\{ε_{t}\}$ under the Sharpe-Lintner Capital Asset Pricing Model (CAPM) (Lintner, 1975; Sharpe, 1964) is white noise, i.e. $H_{0} : \{ε_{t}\} \text{ is white noise} \quad \text{vs.} \quad H_{1} : \{ε_{t}\} \text{ is not white noise},$ (5) where $ε_{t} = (ε_{t 1}, \dots, ε_{tp})^{⊤}$ and p is the number of securities. The CAPM is one of the most popular factor pricing models in finance and has the explicit form $Y_{ti} = r_{ti} - r_{ft} = α_{i} + β_{i 1} (r_{mt} - r_{ft}) + ε_{ti}$ for $t \in \{1, \dots, n\}$ and $i \in \{1, \dots, p\}$ , where $r_{ti}$ is the return of the ith security at time t, $r_{ft}$ is the risk-free rate at time t, $Y_{ti} = r_{ti} - r_{ft}$ is the excess return of the ith security at time t and $r_{mt}$ is the market return at time t. We gathered return data for the securities listed in the S&P 500 index. Initially, we compiled the monthly returns of all the securities in the S&P 500 index from January 2005 to November 2018. As the composition of the index changes over time, we focused on the p = 374 securities that remained in the S&P 500 index throughout the entire period. This resulted in a total of T = 165 consecutive observations. The time series data for the risk-free rate of return and the market factor were sourced from Ken French's data library web page. We selected the one-month US treasury bill rate as the risk-free rate ( $r_{ft}$ ). The value-weighted return on all NYSE, AMEX, and NASDAQ stocks from CRSP served as a proxy for the market return ( $r_{mt}$ ).

Specifically, we let ${\hat{ε}}_{ti} ≐ Y_{ti} - {\hat{α}}_{i} - {\hat{β}}_{i 1} (r_{mt} - r_{ft}),$ where ${\hat{α}}_{i}$ and ${\hat{β}}_{i 1}$ are the ordinary least-squares (OLS) estimators of $α_{i}$ and $β_{i 1}$ , respectively. To demonstrate the usefulness of the proposed test, we treat ${\hat{ε}}_{t} = ({\hat{ε}}_{t 1}, \dots, {\hat{ε}}_{tp})^{⊤}$ as the observation of $ε_{t}$ , instead of considering the testing problem within the CAPM model. First, we applied the SS, FLM, MAX and FC tests to the full sample. All the tests reject the null hypothesis at significance level 0.05. We also apply the Box-Pierce test, a conventional autocorrelation test, to the residuals of each security under the CAPM model using the full sample. The histogram of p-values from the Box-Pierce test for the U.S. datasets is shown in Figure 1. We observe autocorrelation for some securities as well. For the following application, we therefore employ the sliding window method. With a predetermined length n, we carry out each of the tests on the data from the period spanning τ to $τ + n - 1$ for every τ in the range $\{1, \dots, T - n\}$ . Here, $\{τ, \dots, τ + n - 1\}$ represents the sliding window of length n. Figure 2 displays the p-values for each test with a length of n = 36 (equivalent to three years). Our findings indicate that the MAX test is unable to reject the null hypothesis when the sample size is not sufficiently large. In most instances, the p-values of our proposed SS test are lower than those of the FLM test. Furthermore, we recorded how often the FLM and SS tests reject the null hypothesis across the T−n sliding windows, which gives rejection rates of 0.74 and 0.84, respectively. This indicates that our proposed SS test outperforms the other three tests in this application, primarily due to the heavy-tailed nature of the residuals. Figure 3 provides some Q-Q plots of these residuals.
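The residual construction and the sliding-window scheme can be sketched as follows. The function names `capm_residuals` and `sliding_windows` are illustrative, and the inputs are assumed to be a T × p array of excess returns and a length-T vector of market excess returns; no data files or loading routines from the study are implied.

```python
import numpy as np

def capm_residuals(Y, market_excess):
    """OLS residuals of Y[:, i] on an intercept and the market excess return."""
    T = Y.shape[0]
    X = np.column_stack([np.ones(T), market_excess])
    # One least-squares fit per security, solved jointly for all columns of Y.
    coef, *_ = np.linalg.lstsq(X, Y, rcond=None)
    return Y - X @ coef

def sliding_windows(resid, n):
    """Yield (start, block) for the T - n windows of length n."""
    T = resid.shape[0]
    for tau in range(T - n):  # 0-based start, matching tau = 1, ..., T - n
        yield tau, resid[tau:tau + n]
```

With T = 165 and n = 36 this produces 129 windows. A white noise test would be applied to each block, and the rejection rate is the share of windows whose p-value falls below the chosen level.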

Figure 1. Histogram of p-values of the Box-Pierce test for the residuals of the CAPM model for the U.S. datasets.

Figure 2. p-values of white noise tests for the U.S. datasets with the CAPM model.

Figure 3. Q–Q plots of the residuals of three securities with CAPM model.

5. Conclusion

In this study, we develop a new test procedure for high-dimensional white noise testing based on spatial signs. Unlike the method proposed by Feng et al. (2022), which constructs the test statistic from sample autocovariances, our test statistic is constructed from auto-spatial-sign covariances. We provide theoretical results for our proposed method under the assumption of an elliptical distribution. Furthermore, we establish the asymptotic relative efficiency of our test in comparison with the sum-type test proposed by Feng et al. (2022). Our simulation studies and application to real data demonstrate the superiority of our method over the sum-type test proposed by Feng et al. (2022), particularly for heavy-tailed distributions.

As revealed in our simulation studies, our proposed test does not perform well against sparse alternatives. However, the max-type test proposed by Feng et al. (2022) cannot control the empirical sizes under heavy-tailed distributions. This makes it worthwhile to develop a max-type test procedure based on spatial signs for high-dimensional white noise testing. Moreover, Feng et al. (2022) demonstrated the asymptotic independence between their sum-type test statistic and the max-type test statistic. It would be interesting to investigate whether a spatial-sign-based max-type test statistic is also asymptotically independent of our proposed spatial-sign-based sum-type test statistic. If this is the case, then we could construct a new combined test procedure based on this finding.

Disclosure statement

No potential conflict of interest was reported by the author(s).

Additional information

Funding

The research of Dachuan Chen is supported by the National Natural Science Foundation of China (Grants 12101335 and 12271271), the Natural Science Foundation of Tianjin (Grant 21JCQNJC00020), the Fundamental Research Funds for the Central Universities, Nankai University (Grants 63211088, 63221050, and 63231013) and Wukong Investment Research Funds. Zhao and Wang's research is partially supported by the China National Key R&D Program under Grant Nos. 2022YFA1003703, 2022YFA1003800, and 2019YFC1908502, the National Natural Science Foundation of China under Grant Nos. 12226007, 12271271, 11925106, 12231011, 11931001 and 11971247, the Fundamental Research Funds for the Central Universities under Grant No. ZB22000105 and Shenzhen Wukong Investment Management Co. Ltd.

References

Cai, J., & Kwan, M. P. (2022). Detecting spatial flow outliers in the presence of spatial autocorrelation. Computers, Environment and Urban Systems, 96, 101833. https://doi.org/10.1016/j.compenvurbsys.2022.101833
Chang, J., Yao, Q., & Zhou, W. (2017). Testing for high-dimensional white noise using maximum cross-correlations. Biometrika, 104(1), 111–127. https://doi.org/10.1093/biomet/asw066
Chen, D., Song, F., & Feng, L. (2022). Rank based tests for high dimensional white noise. arXiv:2204.08402.
Feng, L., & Liu, B. (2017). High-dimensional rank tests for sphericity. Journal of Multivariate Analysis, 155, 217–233. https://doi.org/10.1016/j.jmva.2017.01.003
Feng, L., Liu, B., & Ma, Y. (2021). An inverse norm sign test of location parameter for high-dimensional data. Journal of Business & Economic Statistics, 39(3), 807–815. https://doi.org/10.1080/07350015.2020.1736084
Feng, L., Liu, B., & Ma, Y. (2022). Testing for high-dimensional white noise. arXiv:2211.02964.
Feng, L., & Sun, F. (2016). Spatial-sign based high-dimensional location test. Electronic Journal of Statistics, 10(2), 2420–2434. https://doi.org/10.1214/16-EJS1176
Feng, L., Zou, C., & Wang, Z. (2016). Multivariate-sign-based high-dimensional tests for the two-sample location problem. Journal of the American Statistical Association, 111(514), 721–735. https://doi.org/10.1080/01621459.2015.1035380
Hall, P., & Heyde, C. C. (1980). Martingale limit theory and its application. Academic Press.
Hosking, J. R. (1980). The multivariate portmanteau statistic. Journal of the American Statistical Association, 75(371), 602–608. https://doi.org/10.1080/01621459.1980.10477520
Huang, X., Liu, B., Zhou, Q., & Feng, L. (2023). A high-dimensional inverse norm sign test for two-sample location problems. Canadian Journal of Statistics, 51(4), 1004–1033. https://doi.org/10.1002/cjs.v51.4
Li, W. K. (2003). Diagnostic checks in time series. Chapman and Hall/CRC.
Li, Z., Lam, C., Yao, J., & Yao, Q. (2019). On testing for high-dimensional white noise. The Annals of Statistics, 47(6), 3382–3412. https://doi.org/10.1214/18-AOS1782
Li, W. K., & McLeod, A. I. (1981). Distribution of the residual autocorrelations in multivariate ARMA time series models. Journal of the Royal Statistical Society Series B: Statistical Methodology, 43(2), 231–239. https://doi.org/10.1111/j.2517-6161.1981.tb01175.x
Lintner, J. (1975). The valuation of risk assets and the selection of risky investments in stock portfolios and capital budgets. In Stochastic optimization models in finance (pp. 131–155). Elsevier.
Liu, B., Feng, L., & Ma, Y. (2023). High-dimensional alpha test of linear factor pricing models with heavy-tailed distributions. Statistica Sinica, 33, 1389–1410.
Lütkepohl, H. (2005). New introduction to multiple time series analysis. Springer.
Oja, H. (2010). Multivariate nonparametric methods with R: An approach based on spatial signs and ranks. Springer Science & Business Media.
Paindaveine, D., & Verdebout, T. (2016). On high-dimensional sign tests. Bernoulli, 22(3), 1745–1769. https://doi.org/10.3150/15-BEJ710
Sharpe, W. F. (1964). Capital asset prices: A theory of market equilibrium under conditions of risk. The Journal of Finance, 19(3), 425–442.
Shekhar, S., Lu, C. T., & Zhang, P. (2003). A unified approach to detecting spatial outliers. GeoInformatica, 7(2), 139–166. https://doi.org/10.1023/A:1023455925009
Tsay, R. S. (2020). Testing serial correlations in high-dimensional time series via extreme value theory. Journal of Econometrics, 216(1), 106–117. https://doi.org/10.1016/j.jeconom.2020.01.008
Wang, L., Peng, B., & Li, R. (2015). A high-dimensional nonparametric multivariate test for mean vector. Journal of the American Statistical Association, 110(512), 1658–1669. https://doi.org/10.1080/01621459.2014.988215
Zhao, P. (2023). Robust high-dimensional alpha test for conditional time-varying factor models. Statistics, 57(2), 444–457. https://doi.org/10.1080/02331888.2023.2180003
Zhao, P., Chen, D., & Zi, X. (2022). High-dimensional non-parametric tests for linear asset pricing models. Stat, 11(1), e490. https://doi.org/10.1002/sta4.v11.1
Zou, C., Peng, L., Feng, L., & Wang, Z. (2014). Multivariate sign-based high-dimensional tests for sphericity. Biometrika, 101(1), 229–236. https://doi.org/10.1093/biomet/ast040

Appendix

Proof of theorems

A.1. Proof of Theorem 2.1

Define $V_{nj} = \sum_{l = 1}^{H} \frac{1}{n - l} \sum_{i = l + 1}^{j - 1} U_{i - l}^{⊤} U_{j - l} U_{i}^{⊤} U_{j},$ for $j \in \{3, \dots, n\}$ and $W_{nk} = \sum_{i = 3}^{k} V_{ni}$ , $k \in \{3, \dots, n\}$ . Let $F_{i} ≐ σ \{U_{1}, \dots, U_{i}\}$ be the σ-field generated by $\{U_{j}\}_{j \leq i}$ . It is easy to show that $E (V_{ni} \mid F_{i - 1}) = 0$ and it follows that $\{W_{nk}, F_{k} : 3 \leq k \leq n\}$ is a zero mean martingale. Let $v_{ni} = E (V_{ni}^{2} \mid F_{i - 1})$ , $3 \leq i \leq n$ and $V_{n} = \sum_{i = 3}^{n} v_{ni}$ . The martingale central limit theorem (Hall & Heyde, 1980) will hold if we can show $\frac{V_{n}}{Var (W_{nn})} \overset{p}{\to} 1,$ (A1) and for any $ϵ > 0$ , $\sum_{i = 3}^{n} σ_{S}^{- 2} E [V_{ni}^{2} I \{| V_{ni} | > ϵ σ_{S}\} \mid F_{i - 1}] \overset{p}{\to} 0.$ (A2) It can be shown that $\begin{aligned} v_{ni} & = \sum_{h, g = 1}^{H} \frac{1}{(n - h) (n - g)} \sum_{s = h + 1}^{i - 1} \sum_{t = g + 1}^{i - 1} E (U_{s - h}^{⊤} U_{i - h} U_{s}^{⊤} U_{i} U_{t - g}^{⊤} U_{i - g} U_{t}^{⊤} U_{i} \mid F_{i - 1}) \\ & = \sum_{h = 1}^{H} \frac{1}{(n - h)^{2}} \sum_{s = h + 1}^{i - 1} (U_{s - h}^{⊤} U_{i - h})^{2} U_{s}^{⊤} Ω U_{s} \\ & \quad + \sum_{h = 1}^{H} \frac{2}{(n - h)^{2}} \sum_{h + 1 \leq s < t \leq i - 1} U_{s - h}^{⊤} U_{i - h} U_{t - h}^{⊤} U_{i - h} U_{s}^{⊤} Ω U_{t} . \end{aligned}$ So $\begin{aligned} \frac{V_{n}}{Var (W_{nn})} & = σ_{S}^{- 2} \sum_{i = 3}^{n} \sum_{h = 1}^{H} \frac{1}{(n - h)^{2}} \sum_{s = h + 1}^{i - 1} (U_{s - h}^{⊤} U_{i - h})^{2} U_{s}^{⊤} Ω U_{s} \\ & \quad + σ_{S}^{- 2} \sum_{i = 3}^{n} \sum_{h = 1}^{H} \frac{2}{(n - h)^{2}} \sum_{h + 1 \leq s < t \leq i - 1} U_{s - h}^{⊤} U_{i - h} U_{t - h}^{⊤} U_{i - h} U_{s}^{⊤} Ω U_{t} \\ & ≐ C_{n 1} + C_{n 2} . \end{aligned}$ Simple algebra leads to $E (C_{n 1}) = \frac{2}{H} \sum_{h = 1}^{H} \frac{1}{(n - h)^{2}} \sum_{i = h + 2}^{n} (i - h - 1) = \frac{1}{H} \sum_{h = 1}^{H} \frac{n - h - 1}{n - h} \to 1$ as $n \to \infty$ . Moreover, $\begin{aligned} Var (C_{n 1}) & \leq H σ_{S}^{- 4} \sum_{h = 1}^{H} \frac{1}{(n - h)^{4}} Var [\sum_{i = 3}^{n} \sum_{s = h + 1}^{i - 1} (U_{s - h}^{⊤} U_{i - h})^{2} U_{s}^{⊤} Ω U_{s}] \\ & \leq \frac{H}{(n - H)^{4} σ_{S}^{4}} \sum_{h = 1}^{H} Var [\sum_{i = 3}^{n} \sum_{s = h + 1}^{i - 1} (U_{s - h}^{⊤} U_{i - h})^{2} U_{s}^{⊤} Ω U_{s}] \\ & = O (n^{- 1} \frac{E^{2} ((U_{t}^{⊤} Ω U_{t})^{2})}{{tr}^{4} (Ω^{2})}) \to 0 \end{aligned}$ by Lemma 1 in Wang et al. (2015). Thus, we have $C_{n 1} \overset{p}{\to} 1$ . Similarly, $E (C_{n 2}) = 0$ and $Var (C_{n 2}) = O \{\frac{E^{2} \{(U_{1}^{⊤} Ω U_{2})^{2}\} + n^{- 1} E^{2} \{(U_{1}^{⊤} Ω U_{1})^{2}\}}{{tr}^{4} (Ω^{2})}\} \to 0.$ So $C_{n 2} \overset{p}{\to} 0$ . Consequently, (A1) holds.

To show (A2), we only need to prove that $\sum_{i = 3}^{n} E (V_{ni}^{4}) = o (σ_{S}^{4}) .$ By Lemma 1 in Wang et al. (2015), we can show that $\sum_{i = 3}^{n} E (V_{ni}^{4}) = O (n^{- 2} E^{2} (U_{1}^{⊤} U_{2})^{4} + n^{- 1} E^{2} ((U_{1}^{⊤} U_{2})^{2} (U_{1}^{⊤} U_{3})^{2})) = o ({tr}^{4} (Ω^{2})) .$ This completes the proof.

A.2. Proof of Theorem 2

Note that $\begin{aligned} U (ε_{t}) & = U (A_{0} r_{t} u_{t} + A_{1} r_{t - 1} u_{t - 1}) = \frac{A_{0} r_{t} u_{t} + A_{1} r_{t - 1} u_{t - 1}}{\| A_{0} r_{t} u_{t} + A_{1} r_{t - 1} u_{t - 1} \|} \\ & = \frac{A_{0} r_{t} u_{t} + A_{1} r_{t - 1} u_{t - 1}}{r_{t} p^{- 1 / 2} {tr}^{1 / 2} (Σ_{0})} \frac{r_{t} p^{- 1 / 2} {tr}^{1 / 2} (Σ_{0})}{\| A_{0} r_{t} u_{t} + A_{1} r_{t - 1} u_{t - 1} \|} \\ & = (A_{0} p^{- 1 / 2} {tr}^{1 / 2} (Σ_{0}) u_{t} + A_{1} r_{t - 1} r_{t}^{- 1} p^{- 1 / 2} {tr}^{1 / 2} (Σ_{0}) u_{t - 1}) (1 + γ_{t})^{- 1 / 2}, \end{aligned}$ where $\begin{aligned} γ_{t} & = \frac{\| A_{0} r_{t} u_{t} + A_{1} r_{t - 1} u_{t - 1} \|^{2}}{r_{t}^{2} p^{- 1} tr (Σ_{0})} - 1 \\ & = \frac{r_{t}^{2} u_{t}^{⊤} Σ_{0} u_{t} + 2 r_{t} r_{t - 1} u_{t}^{⊤} Σ_{01} u_{t - 1} + r_{t - 1}^{2} u_{t - 1}^{⊤} Σ_{1} u_{t - 1}}{r_{t}^{2} p^{- 1} tr (Σ_{0})} - 1 \\ & = (\frac{u_{t}^{⊤} Σ_{0} u_{t}}{p^{- 1} tr (Σ_{0})} - 1) + \frac{2 r_{t - 1} u_{t}^{⊤} Σ_{01} u_{t - 1}}{r_{t} p^{- 1} tr (Σ_{0})} + \frac{r_{t - 1}^{2} u_{t - 1}^{⊤} Σ_{1} u_{t - 1}}{r_{t}^{2} p^{- 1} tr (Σ_{0})} \\ & = G_{1} + G_{2} + G_{3} . \end{aligned}$ By Lemma 4 in Zou et al. (2014) and Condition (C3), we have $E (G_{1}^{2}) = O (tr (Σ_{0}^{2}) / p^{2}) = O (p^{- 1})$, $E (G_{2}^{2}) = O (tr (Σ_{0} Σ_{1}) / p^{2}) = O (p^{- 1} n^{- 1})$ and $E (G_{3}^{2}) = O (p^{- 2} ({tr}^{2} (Σ_{1}) + tr (Σ_{1}^{2}))) = O (n^{- 1} p^{- 1} + n^{- 2})$. So $γ_{t} = O_{p} (p^{- 1 / 2})$.

Thus, following the same procedure as in the proof of Theorem 1 in Zhao et al. (2022), we have $\begin{aligned} T_{S} & = \frac{1}{n - 1} \sum_{2 \leq s < t \leq n} U_{s - 1}^{⊤} U_{t - 1} U_{s}^{⊤} U_{t} \\ & = \frac{1}{n - 1} \sum_{2 \leq s < t \leq n} (A_{0} p^{- 1 / 2} {tr}^{1 / 2} (Σ_{0}) u_{s - 1} + A_{1} r_{s - 2} r_{s - 1}^{- 1} p^{- 1 / 2} {tr}^{1 / 2} (Σ_{0}) u_{s - 2})^{⊤} \\ & \quad \times (A_{0} p^{- 1 / 2} {tr}^{1 / 2} (Σ_{0}) u_{t - 1} + A_{1} r_{t - 2} r_{t - 1}^{- 1} p^{- 1 / 2} {tr}^{1 / 2} (Σ_{0}) u_{t - 2}) \\ & \quad \times (A_{0} p^{- 1 / 2} {tr}^{1 / 2} (Σ_{0}) u_{s} + A_{1} r_{s - 1} r_{s}^{- 1} p^{- 1 / 2} {tr}^{1 / 2} (Σ_{0}) u_{s - 1})^{⊤} \\ & \quad \times (A_{0} p^{- 1 / 2} {tr}^{1 / 2} (Σ_{0}) u_{t} + A_{1} r_{t - 1} r_{t}^{- 1} p^{- 1 / 2} {tr}^{1 / 2} (Σ_{0}) u_{t - 1}) + o_{p} (p^{- 1}) . \end{aligned}$ Let $ω = p^{- 1 / 2} {tr}^{1 / 2} (Σ_{0})$ and $δ_{t} = r_{t - 1} r_{t}^{- 1}$. We can decompose $T_{S}$ as $\begin{aligned} T_{S} & = \frac{ω^{4}}{n - 1} \sum_{2 \leq s < t \leq n} u_{s}^{⊤} A_{0}^{⊤} A_{0} u_{t} u_{t - 1}^{⊤} A_{0}^{⊤} A_{0} u_{s - 1} \\ & \quad + \frac{ω^{4}}{n - 1} \sum_{2 \leq s < t \leq n} δ_{t - 1} δ_{s - 1} u_{s - 1}^{⊤} A_{1}^{⊤} A_{1} u_{t - 1} u_{t - 1}^{⊤} A_{0}^{⊤} A_{0} u_{s - 1} \\ & \quad + D_{1} + D_{2} + D_{3} + o_{p} (p^{- 1}), \end{aligned}$ where $\begin{aligned} D_{1} & = \frac{ω^{4}}{n - 1} \sum_{2 \leq s < t \leq n} (δ_{s - 1} δ_{s - 2} δ_{t - 1} δ_{t - 2} u_{s - 1}^{⊤} A_{1}^{⊤} A_{1} u_{t - 1} u_{t - 2}^{⊤} A_{1}^{⊤} A_{1} u_{s - 2} \\ & \quad + δ_{s - 2} δ_{t - 2} u_{s}^{⊤} A_{0}^{⊤} A_{0} u_{t} u_{t - 2}^{⊤} A_{1}^{⊤} A_{1} u_{s - 2}), \\ D_{2} & = \frac{ω^{4}}{n - 1} \sum_{2 \leq s < t \leq n} (δ_{t - 1} u_{s}^{⊤} A_{0}^{⊤} A_{1} u_{t - 1} u_{t - 1}^{⊤} A_{0}^{⊤} A_{0} u_{s - 1} + δ_{s - 1} u_{s - 1}^{⊤} A_{1}^{⊤} A_{0} u_{t} u_{t - 1}^{⊤} A_{0}^{⊤} A_{0} u_{s - 1} \\ & \quad + δ_{s - 1} δ_{t - 1} u_{s - 1}^{⊤} A_{1}^{⊤} A_{1} u_{t - 1} u_{t - 1}^{⊤} A_{0}^{⊤} A_{1} u_{s - 2} + δ_{s - 1} δ_{t - 1} δ_{t - 2} u_{s - 1}^{⊤} A_{1}^{⊤} A_{1} u_{t - 1} u_{t - 2}^{⊤} A_{1}^{⊤} A_{0} u_{s - 1} \\ & \quad + δ_{t - 2} u_{s}^{⊤} A_{0}^{⊤} A_{0} u_{t} u_{t - 2}^{⊤} A_{1}^{⊤} A_{0} u_{s - 1} + δ_{s - 2} u_{s}^{⊤} A_{0}^{⊤} A_{0} u_{t} u_{t - 1}^{⊤} A_{0}^{⊤} A_{1} u_{s - 2} \\ & \quad + δ_{t - 1} δ_{t - 2} δ_{s - 2} u_{s}^{⊤} A_{0}^{⊤} A_{1} u_{t - 1} u_{t - 2}^{⊤} A_{1}^{⊤} A_{1} u_{s - 2} + δ_{s - 1} δ_{t - 2} δ_{s - 2} u_{s - 1}^{⊤} A_{1}^{⊤} A_{0} u_{t} u_{t - 2}^{⊤} A_{1}^{⊤} A_{1} u_{s - 2}), \\ D_{3} & = \frac{ω^{4}}{n - 1} \sum_{2 \leq s < t \leq n} (δ_{t - 1} δ_{s - 2} u_{s}^{⊤} A_{0}^{⊤} A_{1} u_{t - 1} u_{t - 1}^{⊤} A_{0}^{⊤} A_{1} u_{s - 2} + δ_{s - 1} δ_{t - 2} u_{s - 1}^{⊤} A_{1}^{⊤} A_{0} u_{t} u_{t - 2}^{⊤} A_{1}^{⊤} A_{0} u_{s - 1} \\ & \quad + δ_{t - 1} δ_{t - 2} u_{s}^{⊤} A_{0}^{⊤} A_{1} u_{t - 1} u_{t - 2}^{⊤} A_{1}^{⊤} A_{0} u_{s - 1} + δ_{s - 1} δ_{s - 2} u_{s - 1}^{⊤} A_{1}^{⊤} A_{0} u_{t} u_{t - 1}^{⊤} A_{0}^{⊤} A_{1} u_{s - 2}) . \end{aligned}$

After some tedious algebra, we have $E (D_{1}^{2}) = o (p^{- 2})$, $E (D_{2}^{2}) = o (p^{- 2})$ and $E (D_{3}^{2}) = o (p^{- 2})$ by Condition (C3). Following the same procedure as in the proof of Theorem 2.1, we can show that $\frac{1}{(n - 1) \sqrt{p^{- 4} {tr}^{2} (Σ_{0}^{2}) / 2}} \sum_{2 \leq s < t \leq n} u_{s}^{⊤} A_{0}^{⊤} A_{0} u_{t} u_{t - 1}^{⊤} A_{0}^{⊤} A_{0} u_{s - 1} \overset{d}{\to} N (0, 1) .$ Moreover, $\begin{aligned} E (\frac{ω^{4}}{n - 1} \sum_{2 \leq s < t \leq n} δ_{t - 1} δ_{s - 1} u_{s - 1}^{⊤} A_{1}^{⊤} A_{1} u_{t - 1} u_{t - 1}^{⊤} A_{0}^{⊤} A_{0} u_{s - 1}) & = \frac{1}{2} c_{1}^{2} ω^{4} n p^{- 2} tr (Σ_{0} Σ_{1}), \\ Var (\frac{ω^{4}}{n - 1} \sum_{2 \leq s < t \leq n} δ_{t - 1} δ_{s - 1} u_{s - 1}^{⊤} A_{1}^{⊤} A_{1} u_{t - 1} u_{t - 1}^{⊤} A_{0}^{⊤} A_{0} u_{s - 1}) & = c_{1}^{2} ω^{4} n p^{- 2} tr (Σ_{0} Σ_{1}) = o (p^{- 2}) . \end{aligned}$ Thus, we have $\frac{T_{S} - \frac{1}{2} c_{1}^{2} ω^{4} n p^{- 2} tr (Σ_{0} Σ_{1})}{\sqrt{1 / 2} p^{- 2} ω^{4} tr (Σ_{0}^{2})} \overset{d}{\to} N (0, 1) .$
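The splitting of $γ_{t}$ into $G_{1} + G_{2} + G_{3}$ is an exact algebraic identity, which can be verified on simulated draws. The sketch below assumes $Σ_{0} = A_{0}^{⊤} A_{0}$, $Σ_{1} = A_{1}^{⊤} A_{1}$ and $Σ_{01} = A_{0}^{⊤} A_{1}$, as suggested by the expansion of the squared norm in the proof; the Gaussian coefficient matrices, the chi-distributed radii and the function names are hypothetical choices made only for illustration.

```python
import numpy as np


def unit(v):
    """Spatial sign U(v) = v / ||v||."""
    return v / np.linalg.norm(v)


def gamma_terms(p=40, seed=1):
    rng = np.random.default_rng(seed)
    # Hypothetical coefficient matrices; A1 is kept small relative to A0.
    A0 = rng.standard_normal((p, p))
    A1 = 0.1 * rng.standard_normal((p, p))
    # Assumed definitions, inferred from the squared-norm expansion.
    S0, S1, S01 = A0.T @ A0, A1.T @ A1, A0.T @ A1
    u_t, u_prev = unit(rng.standard_normal(p)), unit(rng.standard_normal(p))
    r_t, r_prev = np.sqrt(rng.chisquare(p, size=2))  # illustrative radii
    eps = A0 @ (r_t * u_t) + A1 @ (r_prev * u_prev)
    omega2 = np.trace(S0) / p  # p^{-1} tr(Sigma_0), i.e. omega squared
    gamma = eps @ eps / (r_t**2 * omega2) - 1.0
    G1 = u_t @ S0 @ u_t / omega2 - 1.0
    G2 = 2.0 * r_prev * (u_t @ S01 @ u_prev) / (r_t * omega2)
    G3 = r_prev**2 * (u_prev @ S1 @ u_prev) / (r_t**2 * omega2)
    return eps, r_t, omega2, gamma, (G1, G2, G3)


if __name__ == "__main__":
    eps, r_t, omega2, gamma, G = gamma_terms()
    print(np.isclose(gamma, sum(G)))
    # U(eps) equals eps / (r_t * omega), rescaled by (1 + gamma)^{-1/2}.
    rescaled = eps / (r_t * np.sqrt(omega2)) / np.sqrt(1.0 + gamma)
    print(np.allclose(unit(eps), rescaled))
```

The second check corresponds to the second line of the display for $U (ε_{t})$: dividing by $r_{t} p^{- 1 / 2} {tr}^{1 / 2} (Σ_{0})$ and then by $(1 + γ_{t})^{1 / 2}$ recovers the unit-norm vector.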
