
A Heteroskedasticity-Robust F-Test Statistic for Individual Effects


Abstract

We derive the asymptotic distribution of the standard F-test statistic for fixed effects, in static linear panel data models, under both non-normality and heteroskedasticity of the error terms, when the cross-section dimension is large but the time series dimension is fixed. It is shown that a simple linear transformation of the F-test statistic yields asymptotically valid inferences and that, under local fixed (or correlated) individual effects, this heteroskedasticity-robust F-test enjoys higher asymptotic power than a suitably robustified Random Effects test. Wild bootstrap versions of these tests are considered which, in a Monte Carlo study, provide more reliable inference in finite samples.


1. INTRODUCTION

In an earlier article, Orme and Yamagata (Citation2006) added to the already large literature on the analysis of variance testing, by establishing that, in a static linear panel data model, the standard F-test for individual effects remains asymptotically valid (large N, fixed T) under non-normality of the error term. Moreover, their (local) asymptotic analysis, supported by Monte Carlo evidence, showed that under (pure) local random effects both the F-test and Random Effects test (RE-test) will have similar power whilst under local fixed effects, or random effects which are correlated with the regressors, the RE-test procedure will have lower asymptotic power than the F-test procedure.

The key result in the above article (Proposition 1, p. 409) is, essentially, the asymptotic equivalence of the appropriately centred F-test statistic and the numerator (test indicator) in the RE-test statistic, under homoskedastic, but not necessarily normally distributed, errors. However, it is straightforward to verify (Proposition 1 in Section 3.2 below) that this asymptotic equivalence continues to hold under general heteroskedasticity of the errors.Footnote 1 The analysis which produces this result also predicts that, under certain forms of neglected heteroskedasticity, the standard (homoskedastic-based) F and RE tests will be either asymptotically under- or over-sized. For example: (i) under cross-sectional heteroskedasticity only, both tests will be asymptotically oversized; (ii) under time series heteroskedasticity and serial independence of the errors, both tests will be asymptotically undersized, but under symmetric time series conditional heteroskedasticity such as GARCH, where the squared error terms exhibit positive correlation, both tests will be asymptotically oversized; and (iii) furthermore, in the singular case of independently and identically distributed (i.i.d.) data, over both the cross-section and time dimensions, even if the errors are conditionally heteroskedastic, the standard F and RE tests remain asymptotically valid. The assumptions in this article explicitly allow for independently but not identically distributed data and, therefore, unconditional heteroskedasticity in the errors.

Given the result of Proposition 1, below, Wooldridge's (2010, p. 299) heteroskedastic-robust RE-test suggests a number of possible transformations of the standard F-test statistic which will recover its asymptotic validity under general heteroskedasticity of unknown form. Moreover, this transformation, or correction, involves simple functions of the pooled model's residuals (i.e., the restricted residuals). Following the literature on heteroskedasticity robust inference, restricted residuals are employed as advocated, for example, by Davidson and MacKinnon (Citation1985) and Godfrey and Orme (Citation2004), who report reliable sampling performance of tests of linear restrictions in the linear model when employing restricted residuals in the construction of heteroskedasticity robust standard errors.Footnote 2

Importantly, though, the F and RE heteroskedastic-robust tests, so constructed, retain the qualitative properties that were reported by Orme and Yamagata (Citation2006). Specifically: (i) under (pure) local random effects, both tests have the same asymptotic power; and, (ii) under local fixed effects, or random effects which are correlated with the regressors, the RE-test procedure will have lower asymptotic power than the F-test procedure.

The plan of this article is as follows. In order to make the current article self-contained, Section 2 reproduces Orme and Yamagata (Citation2006, Section 2) and introduces the notation and standard test statistics as discussed widely in econometric texts; for example, Baltagi (Citation2008). Section 3 details the assumptions and asymptotic analysis. The latter provides a description of the asymptotic behaviour of the F-test statistic, its heteroskedasticity robust transformation, its relationship with the RE-test statistic (under both the null and local alternatives), and predictions concerning the asymptotic significance levels of the unadjusted F-test under certain forms of neglected heteroskedasticity. All proofs of the main results are relegated to the Appendix. Section 4 illustrates the main findings by reporting the results of a small Monte Carlo study. This also includes an evaluation of a wild bootstrap scheme, based on Mammen (Citation1993) and Davidson and Flachaire (Citation2008), which might be employed in order to provide closer agreement between the desired nominal and the empirical significance level of the proposed test procedures. Section 5 concludes.

2. THE NOTATION, MODEL, AND TEST STATISTICS

We consider the static linear panel data model

where y i  = (y i1,…, y iT )′, u i  = (u i1,…, u iT )′, ι T is a (T × 1) vector of ones, and X i  = (x i1,…, x iT )′ a (T × K) matrix. The innovations, u it , have zero mean and uniformly bounded variances and the α i are the individual effects. By stacking the N equations of (Equation1), the model for all individuals becomes
where and are both (NT × 1) vectors, α = (α1,…, α N )′ is a (N × 1) vector, D = [I N  ⊗ ι T ] is a (NT × N) matrix, is a (NT × K) matrix, and [D, X] has full column rank. Thus, for the purposes of the current exposition, x it  = (x it1,…, x itK )′, (K × 1), contains no time invariant regressors and, in particular, no constant term corresponding to an overall intercept. In the context of fixed effects this allows estimation of β 1, as follows.

In general, define the projection matrices, P B = B(B′ B)−1 B′ and M B = I NT  − P B, for any (NT × S) matrix B of full column rank, with being the residual matrix from a multivariate least squares regression of B on D which is, of course, the within transformation. Then the fixed effects (least squares dummy variable) estimator of β 1 in (Equation2) is given by

The null model of no individual effects is the pooled regression model of

where , and Z i has rows z it ′ = (1, x it1,…, x itK ) = {z itj }, j = 1,…, K + 1. The (pooled) regression of y on Z delivers the Ordinary Least Squares (OLS) estimator .
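For concreteness, here is a minimal sketch (in Python; the array shapes and function name are purely illustrative) of the two regressions just described: the pooled OLS regression of y on Z and the within (LSDV) regression obtained by applying the within transformation.

```python
import numpy as np

def pooled_and_within(y, X):
    """y: (N, T) array of outcomes; X: (N, T, K) array of regressors (no constant).

    Returns the pooled OLS estimator (intercept plus slopes) from the regression
    of y on Z = [iota, X], and the fixed effects (within / LSDV) estimator of the
    slope vector, obtained by demeaning y and X within each cross-section unit
    (the within transformation) and regressing the demeaned y on the demeaned X.
    """
    N, T, K = X.shape
    Z = np.column_stack([np.ones(N * T), X.reshape(N * T, K)])
    gamma_hat, *_ = np.linalg.lstsq(Z, y.reshape(-1), rcond=None)     # pooled OLS
    y_w = (y - y.mean(axis=1, keepdims=True)).reshape(-1)             # within-transformed y
    X_w = (X - X.mean(axis=1, keepdims=True)).reshape(N * T, K)       # within-transformed X
    beta1_hat, *_ = np.linalg.lstsq(X_w, y_w, rcond=None)             # within (LSDV) estimator
    return gamma_hat, beta1_hat
```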

The standard F-test for fixed effects requires estimation of both (Equation2), treating the α i as unknown parameters, and (Equation4) whilst the standard RE-test only requires estimation of (Equation4). In order to provide a framework in which to investigate the limiting behaviour of the F-test and RE-test statistics, under both fixed and random effects, the individual effects are assumed to have the form α = β0 ι N  + δ, δ = (δ1,…, δ N )′. Fixed effects correspond to the α i , i = 1,…, N, being fixed unknown parameters (or, equivalently, δ1 ≡ 0 with β0 and δ i , i = 2,…, N, being the fixed unknown parameters). The case of random effects is accommodated when the δ i , i = 1,…, N, are random variables. Equations (Equation1) and (Equation2) will be employed to characterise the data generation process, with the restrictions of H 0: δ = δ1 ι N providing the null model of no individual effects (notice that δ = 0 belongs to this set of restrictions). Specifically, when considering the alternative of fixed effects, the (N − 1) restrictions placed on (Equation2) are H 0: Hα = 0, where H = [ι N−1, − I N−1], whilst for random effects the null is H 0: var(δ i ) = 0.

The standard F and RE test statistics are defined as follows.

F -Test Statistic

This is constructed as

where is the restricted sum of squares (from the pooled regression (Equation4)) with , and is the unrestricted sum of squares (from the fixed effects regression (Equation2)) with , the residual vector from regressing on . If normality, homoskedasticity, and strong exogeneity were imposed such that, conditional on X, u i  ∼ N(0, σ2 I T ), i = 1,…, N, then a standard F-test would be exact. In the case of non-normal, but homoskedastic, errors Orme and Yamagata (Citation2006) demonstrated that a standard F-test would be asymptotically valid.
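A minimal sketch of this statistic, assuming the usual ratio-of-mean-squares form with degrees of freedom n1 = N − 1 and n2 = N(T − 1) − K (the values quoted in Sections 3.2 and 4.2):

```python
import numpy as np
from scipy.stats import f

def standard_F_test(y, X, alpha=0.05):
    """Standard F-test for individual effects.

    RSS_R comes from the pooled regression of y on Z = [iota, X]; RSS_U comes
    from the within (fixed effects) regression. Degrees of freedom are
    n1 = N - 1 and n2 = N(T - 1) - K.
    """
    N, T, K = X.shape
    Z = np.column_stack([np.ones(N * T), X.reshape(N * T, K)])
    g, *_ = np.linalg.lstsq(Z, y.reshape(-1), rcond=None)
    rss_r = np.sum((y.reshape(-1) - Z @ g) ** 2)                  # restricted (pooled) RSS
    y_w = (y - y.mean(axis=1, keepdims=True)).reshape(-1)
    X_w = (X - X.mean(axis=1, keepdims=True)).reshape(N * T, K)
    b, *_ = np.linalg.lstsq(X_w, y_w, rcond=None)
    rss_u = np.sum((y_w - X_w @ b) ** 2)                          # unrestricted (within) RSS
    n1, n2 = N - 1, N * (T - 1) - K
    F_N = ((rss_r - rss_u) / n1) / (rss_u / n2)
    reject = F_N > f.ppf(1 - alpha, n1, n2)   # exact only under normality and homoskedasticity
    return F_N, reject
```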

RE-Test Statistic

The usual RE-test statistic isFootnote 3

where , so that

R N has a limit standard normal distribution, as N → ∞, under H 0 and homoskedasticity but not necessarily normality of the errors.
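For reference, a sketch assuming the usual one-sided Honda (1985) form of R N built from pooled residuals (see Footnote 3); the article's exact normalisation may differ slightly.

```python
import numpy as np

def honda_RE_statistic(u_hat):
    """u_hat: (N, T) array of pooled (restricted) OLS residuals.

    Assumed standard Honda (1985) form:
      R_N = sqrt(NT / (2(T-1))) * (u' D D' u / u' u - 1),
    where u' D D' u = sum_i (sum_t u_it)^2; compared with N(0,1) critical values.
    """
    N, T = u_hat.shape
    quad = np.sum(u_hat.sum(axis=1) ** 2)    # u' D D' u
    ssq = np.sum(u_hat ** 2)                 # u' u
    return np.sqrt(N * T / (2.0 * (T - 1))) * (quad / ssq - 1.0)
```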

3. ASYMPTOTIC PROPERTIES OF F N

In this section we describe the properties of F N , under both local fixed and random effects, by (i) deriving its asymptotic distribution, and (ii) establishing its asymptotic relationship with R N . In the subsequent analysis asymptotic theory is employed in which N → ∞ and T is fixed. To facilitate this, the following sections detail the assumptions that are made, which are of the sort found in, for example, White (Citation2001, p. 120).

3.1. Assumptions

A1: (i) is an independent sequence;

(ii) E(u it  | X i , u i, t−1, u i, t−2,…) = 0, almost surely, for all i and t.

A2: (i) E(|z isj u it |2+η) ≤ Δ < ∞ for some η > 0, all s, t = 1,…, T, j = 1,…, K + 1, and all i = 1,…, N;

(ii) E(|z itj |4+η) ≤ Δ < ∞ for some η > 0, all t = 1,…, T, j = 1,…, K + 1, and all i = 1,…, N;

(iii) E(Z′ Z/N) is uniformly positive definite;

(iv) is uniformly positive definite;

(v) is uniformly positive definite;

(vi) is uniformly positive definite.

Assumption A1 imposes independent sampling of cross-section units and also, A1(ii), a strong exogeneity assumption on X i , so that ; thus ruling out (for example) lagged dependent variables. Assumption A1(ii) also constrains the u it to be conditionally serially uncorrelated, and thus serially uncorrelated, but not necessarily serially independent. In particular, this resembles a martingale difference assumption, but is more direct (see, for example, White (Citation2001, p. 54)) and accommodates most models of heteroskedasticity (including time series conditional heteroskedasticity such as GARCH and its relatives). If it were strengthened to that of u it being serially independent, conditionally on X i , GARCH processes, for example, would be ruled out. Together with Assumption A2, which explicitly allows for rather general heteroskedasticity in the disturbances, we obtain consistency and asymptotic normality of both the pooled and fixed effects least squares regression estimators ( and , respectively), and also consistency of the corresponding heteroskedasticity-robust covariance matrix estimators.Footnote 4 These results follow for the fixed effects estimator because Assumption A2(i) and (ii) also imply that and are both uniformly bounded. Thus, in particular, , , and are all O p (1), with and , as N → ∞, T fixed. If Assumption A1(ii) is weakened to , or even E(x it u it ) = 0 (zero contemporaneous correlation), is not guaranteed to be consistent and, when it is inconsistent, the F-test is asymptotically invalid anyway, even under normality; for example, in the presence of lagged dependent variables—see the discussion in (Wooldridge (Citation2010), Sections 10.5 and 11.6). Furthermore, note that Assumptions A1(ii) and A2(v) imply that is uniformly positive.

For the purposes of this article, in addition, we assume as follows:

A3: (i) E|u it |4+η ≤ Δ < ∞ for some η > 0, all t = 1,…, T, and all i = 1,…, N;

(ii) is uniformly positive.

A4: (i) , i = 1,…, N;

(ii) the δ i are independent, satisfying E[u it δ i ] = 0 and E i |4+η ≤ Δ < ∞, for all i = 1,…, N;

(iii) is uniformly positive, where δ′ = (δ1,…, δ N ).

Assumption A3 justifies the limit distribution obtained in Proposition 1 below, and as a consequence also that of R N . (In fact, Assumption A3(i) and Assumption A2(ii) actually imply Assumption A2(i), using the Cauchy–Schwartz inequality.) Assumption A4 characterizes the alternative data generation process and permits the investigation of asymptotic power, under local individual effects, by restricting the test criteria under consideration to be O p (1) with well defined limit distributions. Together with Assumptions A3(i) and A2(ii), Assumption A4(ii) implies E|u it δ i |2+η ≤ Δ < ∞ and E|z itj δ i |2+η ≤ Δ < ∞, for some η > 0, and all i = 1,…, N, t = 1,…, T, j = 1,…, K + 1. As well as fixed effects (with the δ i being nonstochastic) it also accommodates local heteroskedastic random effects, but which are uncorrelated with u i . If the δ i are also distributed independently of X i , then we have “pure” random effects whilst if the δ i are correlated with X i then we have “correlated” random effects. (As pointed out by Wooldridge (Citation2010, p. 287), in microeconometric applications of panel data models with individual effects, the term fixed effect is generally used to mean correlated random effects, rather than α i being strictly nonstochastic.)

3.2. The Asymptotic Distribution of F N

The results concerning the limiting behavior of both the F-test and RE-test are driven by the following lemma, which also substantiates the asymptotic validity of Wooldridge's (2010, p. 299) heteroskedasticity-robust test for unobserved effects; see Section 3.4.

Lemma 1

Define

and

Then under Assumptions A1 and A3,

for fixed T, as N → ∞.

The expression for κ N , whilst correct, is quite general as it simply exploits the fact that the u it are serially uncorrelated. Assumption A1(ii), however, implies something a little stronger and this affords a more refined expression for κ N which is discussed in Section 3.3. Before that discussion, however, the asymptotic distribution of F N , under non-normality and heteroskedasticity, is given by the following proposition.

Proposition 1

Define .

i.

Under model (Equation2) and Assumptions A1 to A4, , with

where H N is given in Lemma 1 and λ N  = O(1) is defined by

Σ N  = E[Z′ Z/N], ρ N  = E[Z′ Dδ/N], μ N  = E[δ′ D′ Dδ/N].

ii.

Furthermore, if , where κ N is defined in Lemma 1, then

Given our assumptions, note that both ω N and λ N are O(1) satisfying

and
respectively, with ω N uniformly positive by Assumption, although neither ω N nor λ N need necessarily converge. The special case of no individual effects, with δ = δ1 ι N , yields λ N  ≡ 0, as it should (this includes the case of δ = 0).

As exploited by Orme and Yamagata (Citation2006), it is easy to show that if ξ N has an F distribution with n 1 = N − 1 and n 2 = N(T − 1) − K degrees of freedom, then , or approximately for large N, . Therefore, by Proposition 1, we can employ the following approximation, under the null,

for any choice of satisfying , implying that F ω can be used in an asymptotically valid “standard” F-test procedure.
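A brief reconstruction of that limit, under n1 = N − 1 and n2 = N(T − 1) − K with T and K fixed, shows where the 2T/(T − 1) scaling comes from:

```latex
% \xi_N \sim F(n_1, n_2) written as a ratio of independent chi-squares:
\xi_N = \frac{\chi^2_{n_1}/n_1}{\chi^2_{n_2}/n_2}, \qquad
\operatorname{Var}\!\left[\sqrt{N}\,(\xi_N - 1)\right]
  \;\approx\; \frac{2N}{n_1} + \frac{2N}{n_2}
  \;\longrightarrow\; 2 + \frac{2}{T-1} \;=\; \frac{2T}{T-1},
\qquad\text{so}\qquad
\sqrt{N}\,(\xi_N - 1) \;\xrightarrow{\;d\;}\; N\!\left(0,\ \tfrac{2T}{T-1}\right).
```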

Before proceeding to derive a suitable , note that under pure local random effects, with E[δ i  | X i ] = 0 and , with so that . In this case, we immediately obtain the following Corollary to Proposition 1 (the proof is omitted).

Corollary 1

Under the alternative of (pure) local random effects, and under the assumptions of Proposition 1,

for any choice of satisfying .

Therefore, a robust F-test, based on F ω, will have nontrivial asymptotic local power against pure random effects. In fact, and analogous to Orme and Yamagata (Citation2006), a stronger result will be established in Section 3.4. There it is shown that, under (pure) local random effects, a robust F-test procedure based on F ω will possess the same asymptotic power as a suitably “robustified” RE-test, of the sort proposed by Wooldridge (Citation2010, p. 299) or Häggström and Laitila (Citation2002). However, under “correlated” local random effects a robust F-test will possess higher asymptotic power than a robust RE-test.

3.3. Asymptotically Valid F -Test Statistics

As noted above, an asymptotically valid F-test can be constructed if there is a available satisfying . Using restricted OLS (i.e., pooled) residuals, a natural choice for might be

where and

Indeed, this choice is justified in Proposition 2 below; c.f., Wooldridge (Citation2010, p. 299).

However, another (perhaps more efficient) choice for , and thus , emerges if we exploit Assumption A1(ii).Footnote 5 To see this, first note that , where , so that κ N can equivalently be expressed as

Now, from Assumption A1(ii), E[w it w itm ] = 0, for all t ≥ 3 and m = 1,…, t − 1, so that (Equation8) becomes

where

A further simplification arises if, in addition to A1(ii), we can assume as follows:

A1(iii): , for t > s > r.

In this case (Equation8) is

This assumption, however, restricts the admissibility of certain forms of time series conditional heteroskedasticity, as it rules out an asymmetric GARCH process for .Footnote 6

The same expression for κ N emerges if A1(iii) is strengthened to the following assumption:

A1(iii)′: , almost surely, for all i and t.

This implies A1(iii) because by iterative expectations, and for t > s > r,

and, for the subsequent analysis in Section 3.5, it will be useful to note that in this case the are conditionally serially uncorrelated and κ N can also be expressed as

Whilst still allowing general forms of heteroskedasticity, A1(iii)′ does rule out time series conditional heteroskedasticity processes.

Finally, consider a strengthening of A1(iii) to the following assumption:

A1(iii)′′: is a sequence of serially independent random variables, for all i = 1,…, N.

Then (Equation8) becomes

The preceding discussion suggests differing possible consistent estimators for κ N , and thus for ω N , according to: (i) whether, or not, Assumption A1(ii) is fully exploited; or, (ii) whether one of the additional A1(iii), A1(iii)′, or A1(iii)′′ is adopted. These are described in the following proposition.

Proposition 2

Define , , and

Under model (Equation2) and Assumptions A1 to A4, we have the following situation:

1.

and , j = 1, 2.

Under model (Equation2), Assumptions A1–A4 and either A1(iii), A1(iii)′, or A1(iii)′′, we have the following situation:

2.

and , j = 1, 2, 3.

From this analysis, it follows that asymptotically valid choices for include , j = 1, 2, 3, where, specifically,

depending on assumptions made about the u it , t = 1,…, T. Robust F-test statistics can then be constructed as , m = 1, 2, 3, and approximate inferences obtained based on (Equation7). Note that is very general, whereas is tailored to the main assumptions of the article. Thus we might expect better sampling behavior from using the latter, rather than the former, under the maintained assumptions A1–A4. Finally, is only valid under rather more restrictive assumptions.

3.4. The Relationship between F N and R N

Under the null of no individual effects, it is straightforward to show that

From (Equation6), Lemma 1, and Proposition 1, therefore, we can write

under the null, so that
for any choice of satisfying ; for example, under assumptions A1–A4 of this article. Moreover, this also substantiates Wooldridge's (2010, p. 299) suggestion for a heteroskedasticity-robust RE test statistic constructed as ; or, under the more restrictive assumptions A1(iii), A1(iii)′, or A1(iii)′′, as proposed by Häggström and Laitila (Citation2002).
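A sketch of a Wooldridge (2010, p. 299)-type construction, assuming the statistic standardises the per-unit indicator a_i = Σ_{t>s} û_it û_is by its sample second moment; the exact scaling used for the robust RE statistic in the article may differ, so treat the details below as illustrative assumptions.

```python
import numpy as np

def robust_RE_statistic(u_hat):
    """u_hat: (N, T) array of pooled (restricted) OLS residuals.

    Heteroskedasticity-robust RE test in the spirit of Wooldridge (2010, p. 299):
    the per-unit indicator a_i = sum_{t>s} u_it u_is is computed from restricted
    residuals and self-normalised, so no homoskedasticity assumption is needed.
    Compared with N(0,1) critical values.
    """
    tot = u_hat.sum(axis=1)
    a = 0.5 * (tot ** 2 - np.sum(u_hat ** 2, axis=1))   # sum over t > s of u_it u_is
    return a.sum() / np.sqrt(np.sum(a ** 2))
```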

The following proposition extends this result to the case of local individual effects (fixed or random).

Proposition 3

Under model (Equation2) and Assumptions A1 to A4,

for any choice of satisfying , where γ N  = O(1) defined by
and the limit distribution of is given by Proposition 1.

Again, γ N need not converge, but it is O(1) and . As with Proposition 1, γ N  ≡ 0 obtains under H 0: δ = δ1 ι N , as it should, since (Z′ Z)−1 Z′ Dδ = (δ1, 0′)′ and the top-left, (1, 1), element of is 0. As discussed above, under the alternative of (pure) local random effects ρ N  = 0, and we obtain the following Corollary, which is immediate from Corollary 1 given Proposition 3.

Corollary 2

Under the alternative of (pure) local random effects, and under the assumptions of Proposition 1,

for any choice of satisfying .

Thus, since under (pure) local random effects, , both the robust RE and robust F-test procedures, based on (Equation13) and (Equation7), respectively, will have identical asymptotic power functions. However, under local fixed effects or random effects which are correlated with X i , the robust F-test can have greater asymptotic power. In particular, when individual effects are correlated with the mean values of the regressors, ρ N  ≠ 0 and is O(1), implying γ N  > 0 so that a test based on R N (but suitably robust to heteroskedasticity) should have lower asymptotic local power than one based on F N . This makes intuitive sense, since F N is designed to test for individual effects which are correlated with , whereas R N is constructed on the assumption that the individual effects are uncorrelated with all regressor values. The importance of distinguishing between individual effects which are correlated or uncorrelated with regressors, rather than simply labelling them fixed or random, is discussed by Wooldridge (Citation2010, Section 10.2).

3.5. Analysis of the Standard F -Test and RE-test

Given the analysis above certain predictions can be made concerning the asymptotic behaviour, under the null hypothesis, of both the standard F-test, based on F N , and RE-test, based on R N , under specific assumptions about the data and/or forms of heteroskedasticity.

Serial Independence

Suppose are serially independent, or assumption A1(iii)′ holds, with . Consider, first, the case of E[h it ] = σ2 < ∞, so that the errors are unconditionally homoskedastic. Then, κ N  = 2σ4 and ω N  = 1. In this very restricted case, then, both the F-test and RE-test, based on F N and R N , respectively, remain asymptotically valid without any adjustment. In particular, this result is true if the are i.i.d., but the u it are conditionally heteroskedastic.

Cross-Section Heteroskedasticity

In this case, we rule out time series heteroskedasticity and adopt assumption A1(iii)′ with

so that is the unconditional variance and . Here, both the F-test based on F N and RE-test based on R N , without adjustment, will be asymptotically oversized (in that, asymptotically, both will reject a correct null of no individual effects too often for any given nominal significance level).Footnote 7 To demonstrate the result, one need only show that ω N  < 1, which is evidently true because

The same prediction is true in the case of unconditional heteroskedasticity, by which we mean , since the first (weak) inequality in (Equation15) can be replaced with equality.

Time Series Heteroskedasticity

Here we consider two scenarios which afford tractable results. Under the first scenario, assumption A1(iii)′ is, again, adopted. The second scenario allows for a GARCH process, but under the symmetry assumption of A1(iii).

i.

Consider unconditional time series heteroskedasticity, so that A1(iii)′ holds and

are constants. Here
because

This implies that both the F and RE-test procedures, without adjustment, will be asymptotically undersized. To obtain a similar result for conditional time series heteroskedasticity, with , for all i and t, and , A1(iii)′ needs to be strengthened to A1(iii)′′ (serial independence) so that .

ii.

In order to provide a succinct analysis for the conditional time series heteroskedastic case, we restrict u it to be a stationary time series, for all i, such that A1(iii) holds either by implication of A1(iii)′ or by direct supposition. Thus, (symmetric) ARCH/GARCH specifications are allowed for but certain asymmetric ARCH/GARCH models with leverage are not. Exploiting stationarity, and heteroskedasticity in the time series dimension only, we express the unconditional variance and covariances as , , say, so that

where

Thus, if the are (serially) positively correlated, γ j  − σ4 > 0 and ω N  < 1 so that both the F and RE-test procedures, without adjustment, will be asymptotically oversized. The converse is true if the are (serially) negatively correlated. In the particular case of symmetric ARCH/GARCH processes, and with the usual positivity constraints on the parameters, the will be (serially) positively correlated,Footnote 8 so that the unadjusted F and RE-test procedures will be asymptotically oversized.

In order to shed light on the relevance of the preceding asymptotic analysis, the next section reports the results of a small Monte Carlo experiment which illustrates the asymptotic robustness of the F-test to non-normality/heteroskedasticity and its power properties relative to the RE-test.

4. MONTE CARLO STUDY

The Monte Carlo study investigates the sampling behavior of the test statistics considered above, (Equation7) and (Equation13), for differing choices of , including . Since our analytical results suggest that the tests are justified when N → ∞ with T fixed, we consider (N, T) = (20, 5), (50, 5), (100, 5), (50, 10), (50, 20).

4.1. Monte Carlo Design

The model employed is

where z it, 1 = 1, z it, 2 is drawn from a uniform distribution on (1, 31) independently for i and t, and z it, 3 is generated following Nerlove (Citation1971), such that
z it, 3 = 0.1t + 0.5z it−1, 3 + υ it , where the value z i0, 3 is chosen as 5 + 10υ i0, and υ it (and υ i0) is drawn from the uniform distribution on (−0.5, 0.5) independently for i and t, in order to avoid any normality in regressors. These regressor values are held fixed over replications. Also, observe that the regression design is not quadratically balanced.Footnote 9 Without loss of generality, the coefficients are set as β j  = 1 for j = 1, 2, 3. The i.i.d. standardised errors for ϵ it are drawn from: the standard normal distribution (SN); the t distribution with five degrees of freedom (t 5); and, the chi-square distribution with six degrees of freedom .
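The regressor design can be generated as follows (a sketch; the seed and function name are illustrative, with the time index starting at t = 1):

```python
import numpy as np

def make_regressors(N, T, seed=0):
    """z_{it,1} = 1; z_{it,2} ~ U(1, 31); z_{it,3} = 0.1 t + 0.5 z_{i,t-1,3} + v_it,
    with z_{i0,3} = 5 + 10 v_{i0} and v_it ~ U(-0.5, 0.5), following Nerlove (1971)."""
    rng = np.random.default_rng(seed)
    z2 = rng.uniform(1.0, 31.0, size=(N, T))
    z3 = np.empty((N, T))
    prev = 5.0 + 10.0 * rng.uniform(-0.5, 0.5, size=N)        # z_{i0,3}
    for t in range(1, T + 1):
        prev = 0.1 * t + 0.5 * prev + rng.uniform(-0.5, 0.5, size=N)
        z3[:, t - 1] = prev
    z1 = np.ones((N, T))                                      # intercept column
    return z1, z2, z3
```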

We consider the following specifications for σ it (an illustrative code sketch follows the list):Footnote 10

1.

Homoskedasticity (HET0)

2.

Cross-sectional one-break-in-volatility heteroskedasticity (HET1)

with N 1 = ⌈ N/2 ⌉, where ⌈ A ⌉ is the smallest integer not less than A, σ1 = 0.5, and σ2 = 1.5.

3.

Time series one-break-in-volatility heteroskedasticity (HET2)

with T 1 = ⌈ T/2 ⌉, σ1 = 0.5, and σ2 = 1.5.

4.

Conditional heteroskedasticity depending on a regressor (HET3)

σ it  = η c [(z it, 2 − 1)/30]/c, where η c [·] is the inverse of the cumulative distribution function of the chi-squared distribution with c degrees of freedom. Since z it, 2 is drawn from a uniform distribution on (1, 31), σ it has mean 1 and variance 2/c, so it is easy to control the degree of heteroskedasticity through the choice of c. We employ c = 1.

5.

Time Series conditional heteroskedasticity, GARCH(1,1) (HET4)

where

The parameter values are chosen to be φ0 = 0.5, φ1 = 0.25, and φ2 = 0.25, and u i, −50 = 0 with the first 50 observations being discarded, so that the unconditional variance is unity.

6.

Time series conditional heteroskedasticity, GJR-GARCH(1,1) (HET5)

where

The parameter values are chosen to be φ0 = 0.3, φ1 = 0.5, φ2 = 0.2, and φ3 = 0.23, and u i, −50 = 0 with the first 50 observations being discarded.Footnote 11
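A sketch of the σ it schemes HET1–HET4 above (HET5 is analogous). The GARCH recursion and its initialisation are written in the standard form σ²_it = φ0 + φ1 u²_{i,t−1} + φ2 σ²_{i,t−1}, which is consistent with the stated parameters and unit unconditional variance but should be read as an assumption rather than the article's exact display.

```python
import numpy as np
from scipy.stats import chi2

def sigma_het1(N, T, s1=0.5, s2=1.5):
    """HET1: cross-sectional break; first ceil(N/2) units have sd s1, the rest s2."""
    sig = np.full((N, T), s2)
    sig[: int(np.ceil(N / 2)), :] = s1
    return sig

def sigma_het2(N, T, s1=0.5, s2=1.5):
    """HET2: time-series break; first ceil(T/2) periods have sd s1, the rest s2."""
    sig = np.full((N, T), s2)
    sig[:, : int(np.ceil(T / 2))] = s1
    return sig

def sigma_het3(z2, c=1):
    """HET3: sigma_it = eta_c[(z_{it,2} - 1)/30] / c, eta_c the chi2(c) quantile function."""
    return chi2.ppf((z2 - 1.0) / 30.0, df=c) / c

def u_het4(eps, phi0=0.5, phi1=0.25, phi2=0.25, burn=50):
    """HET4: u_it = sigma_it * eps_it with an (assumed standard) GARCH(1,1) recursion;
    eps has shape (N, burn + T) and the first `burn` columns are discarded."""
    N, TT = eps.shape
    u = np.zeros((N, TT))
    sig2 = np.full(N, phi0 / (1.0 - phi1 - phi2))   # start at the unconditional variance
    for t in range(TT):
        u[:, t] = np.sqrt(sig2) * eps[:, t]
        sig2 = phi0 + phi1 * u[:, t] ** 2 + phi2 * sig2
    return u[:, burn:]
```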

For power comparisons, the individual effects are generated according to

where the ϕ i are i.i.d. N(0, 1), with ι 3 = (1, 1, 1)′, being the overall average of z it , s being the standard deviation of , and the R 2 is from the regression of (Equation18). With this set up, the variance of the term inside the square brackets is always unity across designs. We consider two combinations of (τ i , R 2): (i) (τ i , R 2) = (0, 0), which is a simple null model specification, with α i  ≡ 0, and (ii) (τ i , R 2) = (v α, 1), which is a simple fixed effects specification (given that the z it are fixed over replications).Footnote 12 To control the power, we consider .

4.2. Asymptotic Tests

Four versions of the FE and RE test statistics are considered, constructed using and , m = 1, 2, 3, as defined at (Equation10)–(Equation12), and all are based on the restricted estimator, .Footnote 13

1.

F-test statistics (denoted F ω in the Tables)

where
is the standard F-test statistic. The corresponding test procedure, for each separate statistic (Equation19), employs critical values from an F distribution with n 1 and n 2 degrees of freedom, respectively, where n 1 = N − 1 and n 2 = N(T − 1) − K. That is, for each m = 0, 1, 2, 3, reject H 0 if , where Pr(ξ > c N, α) = α, for chosen α, and ξ ∼ F(n 1, n 2).

2.

One sided (positive) RE-test statistics (denoted R ω in the tables)

where
is the one-sided (positive) standard RE-test statistic. The corresponding test procedure, for each separate statistic (Equation20), employs critical values from a N(0, 1) distribution. That is, for each m = 0, 1, 2, 3, reject H 0 if , where Pr(Z > z α) = α, for chosen α, and Z ∼ N(0, 1).

4.3. Bootstrap Tests

As is well known, asymptotic theory can provide a poor approximation to actual finite sample behaviour, and bootstrap procedures often lead to more reliable inferences.Footnote 14 Therefore, we also consider a simple wild bootstrap scheme, based on Mammen (Citation1993) and Davidson and Flachaire (Citation2008), which might be employed in order to provide closer agreement between the desired nominal and the empirical significance level of the proposed test procedures and which has proved effective in previous studies; see, for example, Godfrey and Orme (Citation2004). The wild bootstrap is implemented using the following steps:

1.

Estimate the models (Equation2) and (Equation4) to get , i = 1,…, N, and construct test statistics and , m = 0, 1, 2, 3;

2.

Repeat the following B times:

a.

Generate , where the v it are i.i.d., over i and t, taking the discrete values ±1 with an equal probability of 0.5;

b.

Construct

obtain restricted and unrestricted OLS residuals and , respectively, and the restricted and unrestricted residual sums of squares ( and , respectively);

c.

Construct the bootstrap test statistics

and
where , m = 1, 2, 3 is constructed as in (Equation10)–(Equation12) but using , and ;

3.

Calculate the proportion of bootstrap test statistics, (respectively, ), from the B repetitions of Step 2c that are at least as large as the actual value of (respectively, ). Let this proportion be denoted by and the desired significance level be denoted by α. The asymptotically valid rejection rule, for each m, is that H 0 is rejected if .

The sampling behavior of all the above tests is investigated using 5000 replications of sample data and B = 200 bootstrap samples, employing a nominal 5% significance level.Footnote 15
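A compact sketch of Steps 1–3. Here stat_fn stands in for whichever of the statistics in (Equation19)–(Equation20) is being bootstrapped (and is assumed to handle the panel structure internally); the bootstrap data are built from the restricted (pooled) fit, and the rejection rule in Step 3 is taken to be p̂* ≤ α.

```python
import numpy as np

def wild_bootstrap_pvalue(stat_fn, y, Z, B=200, seed=0):
    """stat_fn(y, Z) returns a test statistic (e.g., one of the F_omega variants);
    y is the (N*T,) stacked outcome and Z the (N*T, K+1) pooled regressor matrix."""
    rng = np.random.default_rng(seed)
    gamma_hat, *_ = np.linalg.lstsq(Z, y, rcond=None)      # Step 1: restricted (pooled) OLS
    u_hat = y - Z @ gamma_hat
    stat = stat_fn(y, Z)                                   # statistic on the actual data
    count = 0
    for _ in range(B):                                     # Step 2
        v = rng.choice([-1.0, 1.0], size=u_hat.shape)      # Rademacher weights, prob 0.5 each
        y_star = Z @ gamma_hat + u_hat * v                 # bootstrap data under the null
        if stat_fn(y_star, Z) >= stat:                     # Step 2c / 3: count exceedances
            count += 1
    p_star = count / B
    return p_star                                          # reject H0 if p_star <= alpha
```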

Observe that the wild bootstrap scheme imposes symmetry on the . Because of this, it is readily shown that , in probability, m = 1, 2, 3, signifying that, for any δ > 0, as N → ∞, T fixed, where P* is the probability measure induced by the wild bootstrap conditional on the sample data. It can also be established that, for example, , in probability, implying that , where 𝒟 T (x) denotes the distribution function of a random variable. Combining these results, we obtain

which justifies the asymptotic validity of the wild bootstrap scheme for , m = 1, 2, 3, notwithstanding the fact that the u it may not be symmetrically distributed. This will not be the case, however, for the unadjusted F-test statistic . Thus, it will be useful to investigate how the wild bootstrap performs in finite samples when the true errors are asymmetric.Footnote 16

4.4. Results

Before looking at the results from the Monte Carlo study, and drawing on the discussion in Godfrey et al. (Citation2006), it is important to define criteria to evaluate the performance of the different tests considered. Given the large number of replications performed, the standard asymptotic test for proportions can be used to test the null hypotheses that the true significance level is equal to its nominal value. In practice, however, what is important is not that the significance level of the test is identical to the chosen nominal level, but rather that the true and nominal rejection frequencies stay reasonably close, even when the test is only approximately valid. Following Cochran's (1952) suggestion, we shall regard a test as being robust, relative to a nominal value of 5%, if its actual significance level is between 4.5% and 5.5%. Considering the number of replications used in these experiments, estimated rejection frequencies within the range 3.9% to 6.1% are viewed as providing evidence consistent with the robustness of the test, according to our definition.Footnote 17
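One calculation that reproduces bounds of 3.9% and 6.1%, assuming the usual normal approximation for a proportion estimated from 5000 replications, is sketched below.

```python
import numpy as np

R = 5000  # Monte Carlo replications
lower = 0.045 - 1.96 * np.sqrt(0.045 * 0.955 / R)   # approx 0.039
upper = 0.055 + 1.96 * np.sqrt(0.055 * 0.945 / R)   # approx 0.061
print(f"robustness bounds: [{lower:.3f}, {upper:.3f}]")
```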

Under the null, with homoskedastic standard normal errors (reported in Table 1, H 0: α i  = 0), the rejection frequencies of both the asymptotic and tests are close to the nominal significance level of 5%. The asymptotic F-test based on , however, tends to under-reject the null when T is relatively large, whilst suffers from large size distortion with empirical significance levels being considerably smaller than the nominal 5%. The size properties of the R ω tests, for different , are qualitatively similar to those of the F ω tests, but tend to have empirical significance levels that are smaller than those of the corresponding F ω tests. Turning our attention to the bootstrap tests, all the modified fixed and random effects tests control the empirical significance levels very well. The results are qualitatively similar for t 5 and errors and, confirming the analysis of Orme and Yamagata (Citation2006), appears quite robust to non-normality, whilst in these cases as well the bootstrap tests provide very close agreement between nominal and empirical significance levels, even for when the errors are asymmetric. Given these results, we now just compare the power of the bootstrap tests. All bootstrap F ω tests have very similar power, as do the bootstrap R ω tests. However, the power of the bootstrap F ω tests is uniformly higher than that of the corresponding bootstrap R ω tests, which is as expected, given the analysis in Section 3.4, because of the correlation between regressors and individual effects.

TABLE 1 Rejection frequencies of the asymptotic and wild-bootstrap modified F-tests and modified random effects tests under homoskedastic errors (HET0)

The above results indicate that, even when the errors are homoskedastic, a wild bootstrap procedure still offers reliable finite sample inference for all variants of the FE and RE tests considered. Now let us look at the results under various heteroskedastic schemes. Table 2 reports the results under the cross-sectional one-break-in-volatility scheme (HET1). First, and as predicted by the analysis in Section 3.5, both the and tests reject the correct null too often. On the other hand, the empirical significance levels of the other F ω and R ω tests are very similar to those presented in the homoskedastic case. As before, however, the bootstrap and tests provide close agreement between nominal and empirical significance levels, across all error distributions, so again it is sensible to focus only on the power properties of these tests. In contrast to the power properties under homoskedastic errors, under the HET1 scheme the powers of the bootstrap tests appear to differ across variants. For example, and have similar power, slightly lower than that of , which in turn is slightly exceeded by that of . This feature is qualitatively similar for the tests, but is less striking. Finally, the results confirm again that has higher power than that of .

TABLE 2 Rejection frequencies of the asymptotic and wild-bootstrap modified F-tests and modified random effects tests under cross-sectional one-break-in-volatility heteroskedastic scheme (HET1)

Table 3 reports the test results under the time-series one-break-in-volatility scheme (HET2). In contrast to the results with the HET1 scheme, but still consistent with the prediction of Section 3.5, both the and tests reject the null too infrequently, especially for N = 20, 50, 100 and T = 5. As before, the bootstrap versions control the size very well, and, interestingly, the power ranking of the bootstrap tests is different from that obtained under HET1. In fact, the and tests (respectively, and tests) still have similar powers but they are now slightly higher than those of the and tests (respectively, and tests), which are in this case comparable.

TABLE 3 Rejection frequencies of the asymptotic and wild-bootstrap modified F-tests and modified random effects tests under time-series one-break-in-volatility heteroskedastic scheme (HET2)

Based on the analysis in Section 3.5 it is possible to derive approximate null rejection frequencies of the test analytically, under the simple heteroskedastic schemes of HET1 and HET2. Given the "population" value of ω N , and a nominal significance level of α × 100%, the rejection frequency of the F N test is, approximately, Pr[F N  > c α, n1, n2], where Pr[F n1, n2 > c α, n1, n2] = α and F n1, n2 ∼ F(n 1, n 2). But this is identical to Pr[F n1, n2 > q], where q = ω N (c α, n1, n2 − 1) +1. More precisely, consider first the case of HET1 where a little calculation shows that, since N is always even in our experiments, ω N  = 0.781. Using α = 0.05, it is then straightforward to obtain q and Pr[F n1, n2 > q]. Similar calculations can be undertaken for the case HET2 but, here, ω N varies according to whether T is even (ω N  = 1.02) or odd (ω N  = 1.13). From these calculations we obtain the following (approximate) significance levels for our choices of (N, T):
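The calculation is easily reproduced; in the sketch below, K = 2 (the two non-constant regressors of the Monte Carlo design) is an assumption.

```python
from scipy.stats import f

alpha, K = 0.05, 2
omega = {"HET1": 0.781, "HET2, T even": 1.02, "HET2, T odd": 1.13}
for N, T in [(20, 5), (50, 5), (100, 5), (50, 10), (50, 20)]:
    n1, n2 = N - 1, N * (T - 1) - K
    c = f.ppf(1 - alpha, n1, n2)                       # nominal 5% critical value
    levels = {k: f.sf(w * (c - 1.0) + 1.0, n1, n2)     # Pr[F(n1,n2) > q], q = w*(c-1)+1
              for k, w in omega.items()}
    print((N, T), {k: round(p, 3) for k, p in levels.items()})
```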

As can be seen, the obtained empirical significance levels, for F N , are qualitatively very similar to these predicted values.

Table 4 summarises the results under conditional heteroskedasticity depending on a regressor z it, 2 (HET3), where σ it  = η1[(z it, 2 − 1)/30], i = 1,…, N, t = 1,…, T, and η1[·] is the inverse of the cumulative distribution function of the chi-squared distribution with one degree of freedom. Since the z it, 2 are initially i.i.d. draws from a uniform distribution on (1, 31), the values of σ it (z it, 2) are realisations from a distribution. This means that even though for a given N (and T) σ it will be held fixed for each replication of data, possibly yielding a realisation of ω N  ≠ 1, as N increases a Law of Large Numbers implies that the given realisation of ω N will converge to unity. For example, when N = 20 and T = 5, ω N  = 1.36, yielding a predicted (approximate) significance level for F N of 1.9%, which explains the under-rejection of this test in our experiments. For larger sample sizes, the value of ω N does, indeed, tend to unity, and the empirical significance level of F N converges to the nominal level, as expected. Due to the larger average error variance encountered here than under the other heteroskedastic schemes, the power of the tests is lower although, qualitatively, the results are very similar to those under HET0 but with and (respectively, and ) enjoying a slight power advantage and the tests being more powerful than their counterparts.

TABLE 4 Rejection frequencies of the asymptotic and wild-bootstrap modified F-tests and modified random effects tests under conditional heteroskedasticity depending on a regressor (HET3)

The results under symmetric conditional heteroskedasticity, GARCH(1,1) (HET4), are reported in Table 5. Similar to the results obtained under HET1, and as predicted by the analysis of Section 3.5, the test rejects a correct null too frequently but the empirical significance levels of the other variants of the F ω tests are very similar to those presented in the homoskedastic case. Again, all the bootstrap tests control the empirical significance levels very well, and the power rankings are, from the lowest, and , followed by , then . The same comments apply to the bootstrap tests, which again exhibit lower power than their counterparts. The results under asymmetric conditional heteroskedasticity, GJR-GARCH(1,1) (HET5), are summarised in Table 6. In contrast to the GARCH model, GJR-GARCH is an asymmetric model of heteroskedasticity with leverage, in general rendering inconsistent, whilst and remain consistent. Despite this, the experimental results are qualitatively very similar to those under the GARCH model. All the bootstrap tests, including , control the empirical significance levels very well, and the power rankings of the and tests are very similar to those obtained under the symmetric GARCH model.

TABLE 5 Rejection frequencies of the asymptotic and wild-bootstrap modified F-tests and modified random effects tests under conditional heteroskedasticity, GARCH(1,1) (HET4)

TABLE 6 Rejection frequencies of the asymptotic and wild-bootstrap modified F-tests and modified random effects tests under conditional heteroskedasticity, GJR-GARCH(1,1) (HET5)

5. CONCLUSIONS

This article has provided an asymptotic analysis of the sampling behaviour of the standard F-test statistic for fixed effects, in a static linear panel data model, under both non-normality and heteroskedasticity of the error terms, when the number of cross-sections, N, is large and T, the number of time periods, is fixed. First, it has been shown that a linear transformation of the commonly cited F and RE tests (using a simple function of restricted residuals) provides asymptotically valid test procedures, when employed in conjunction with the usual F and standard normal critical values (respectively). Second, it has been shown that the asymptotic relationship between the heteroskedastic robust F-test and the RE-test statistics carries over from the homoskedastic case. That is, under (pure) local random effects, they share the same asymptotic power, whilst under local fixed (or correlated) individual effects the heteroskedastic robust F-test enjoys higher asymptotic power. Third, we have provided qualitative predictions about the approximate true significance levels of the standard F and RE tests in the presence of certain forms of heteroskedasticity. These theoretical findings are supported by Monte Carlo evidence. Finally, although asymptotic theory does not always provide a good approximation to finite sample behaviour, our experiments show that all the wild bootstrap versions of these tests, employing the resampling scheme advocated by Davidson and Flachaire (Citation2008), yield reliable inferences in the sense of close agreement between nominal and actual significance levels. There are slight differences in the power properties of these tests, although none dominates across the different models of heteroskedasticity considered. Thus, for example, the wild bootstrap version of the unadjusted F-test appears to behave quite favourably under homoskedasticity and general heteroskedasticity both in terms of finite sample significance levels and power, and even under asymmetric errors for which it is not asymptotically justified.

ACKNOWLEDGMENTS

We would like to thank two anonymous referees for their constructive comments on an earlier version of this article. Thanks also to Ralf Becker, Alastair Hall, and Andreea Halunga for useful discussions.

This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/3.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited. The moral rights of the named author(s) have been asserted.

Notes

Orme and Yamagata (Citation2006) did not cover the case of heteroskedastic errors in the linear model, although their analysis did allow for heteroskedastic individual effects.

As Wooldridge (Citation2010, p. 300) points out, standard tests for individual effects essentially test for non zero correlation in the errors; thus, constructing autocorrelation robust procedures would appear to be counter productive.

See, for example, Breusch and Pagan (Citation1980) or Honda (Citation1985).

See, for example, (White (Citation2001), Exercises 3.14, 5.12 and Chapter 6). Assumption A2(ii) is also required to obtain a heteroskedasticity robust F-test.

We shall not, here, consider alternative estimators of , although this is possible.

See, for example, Goncalves and Kilian (Citation2004).

Indeed, this particular conclusion explains some of the finite sample Monte Carlo results obtained by Häggström and Laitila (Citation2002).

He and Terasvirta (Citation1999) establish that the autocorrelation function of the squared process is positive.

See the discussion in Orme and Yamagata (Citation2006).

Note that the parameters chosen for specifications 5 and 6 ensure that E|u it |4+η exists for all error distributions; see Ling and McAleer (Citation2002).

We also considered an ARCH(1) specification. However, the associated results are not reported since they are qualitatively similar to the results for the GARCH(1,1) specification, which are presented below.

We also considered a pure random effects specification, τ i  = v α, R 2 = 0, and the results show that the power properties of the modified fixed effects test and the modified random effects test are very similar.

The estimator , based on the unrestricted estimator (i.e., fixed effects estimator), is also considered, but the finite sample performance of the tests considered is monotonically inferior to that based on the estimator of

See Godfrey (Citation2009) for an excellent guide to bootstrap test procedures for regression models.

It is often advocated that (B + 1)/100 should be an integer. However, running the experiments with B = 199 does not change the results.

Similarly to Goncalves and Kilian (Citation2004), this derives from the asymptotic invalidity of the wild bootstrap scheme when employed to estimate asymptotic standard errors associated with nonpivotal statistics.

Employing a standard asymptotic test these bounds are calculated as and

Notes: The model employed is , u it  = σ it ϵ it , where z it, 1 = 1, z it, 2 is drawn from a uniform distribution on (1, 31) independently for i and t, and z it, 3 is generated following Nerlove (1971), such that z it, 3 = 0.1t + 0.5z it−1, 3 + υ it , where the value z i0, 3 is chosen as 5 + 10υ i0, and υ it (and υ i0) is drawn from the uniform distribution on (−0.5, 0.5) independently for i and t, in order to avoid any normality in regressors. These regressor values are held fixed over replications. β j  = 1 for j = 1, 2, 3. The i.i.d. standardized errors for ϵ it are drawn from: the standard normal distribution (SN); the t distribution with five degrees of freedom (t 5); and, the chi-square distribution with six degrees of freedom . For estimating the size of the tests, α i  = 0 and power is investigated using where g i (z i ) is the standardised value of , so that the regressors and α i are correlated. F ω is the modified F-test and R ω is the modified random effects test, and and are their wild bootstrap tests, with different choices of , m = 0, 1, 2, 3, with ; see Sections 4.2 and 4.3. Here σ it  = 1. The sampling behaviour of the tests is investigated using 5000 replications of sample data and 200 bootstrap samples, employing a nominal 5% significance level.

Notes: See notes to Table 1. The DGP is identical to that for Table 1 except σ it  = σ1, i = 1,…, N 1, t = 1,…, T, and σ it  = σ2, i = N 1 + 1,…, N, t = 1,…, T with N 1 = ⌈ N/2 ⌉, where ⌈ A ⌉ is the smallest integer not less than A, σ1 = 0.5 and σ2 = 1.5.

Notes: See notes to Table 1. The DGP is identical to that for Table 1 except σ it  = σ1, i = 1,…, N, t = 1,…, T 1, σ it  = σ2, i = 1,…, N, t = T 1 + 1,…, T with T 1 = ⌈ T/2 ⌉, σ1 = 0.5, and σ2 = 1.5.

Notes: See notes to Table 1. The DGP is identical to that for Table 1 except σ it  = η c [(z it, 2 − 1)/30]/c, i = 1,…, N, t = 1,…, T, where η c [·] is the inverse of the cumulative distribution function of the chi-squared distribution with c degrees of freedom. Since z it, 2 is drawn from a uniform distribution on (1, 31), σ it has mean 1 and variance 2/c, so it is easy to control the degree of heteroskedasticity through the choice of c. We employ c = 1.

Notes: See notes to Table 1. The DGP is identical to that for Table 1 except u it  = σ it ϵ it , t = −49,…, T, i = 1,…, N, where . The parameter values are chosen to be φ0 = 0.5, φ1 = 0.25 and φ2 = 0.25.

Notes: See notes to Table 1. The DGP is identical to that for Table 1 except u it  = σ it ϵ it , t = −49,…, T, i = 1,…, N, where . The parameter values are chosen to be φ0 = 0.3, φ1 = 0.5, φ2 = 0.2 and φ3 = 0.23.

REFERENCES

  • Baltagi, B. H. (2008). Econometric Analysis of Panel Data. 4th ed. Chichester, UK: Wiley.
  • Breusch, T. S., Pagan, A. R. (1980). The Lagrange multiplier test and its applications to model misspecification in econometrics. Review of Economic Studies 47:239–253.
  • Cochran, W. (1952). The chi-square test of goodness of fit. Annals of Mathematical Statistics 23:315–345.
  • Davidson, R., Flachaire, E. (2008). The wild bootstrap, tamed at last. Journal of Econometrics 146:162–169.
  • Davidson, R., MacKinnon, J. G. (1985). Heteroskedasticity-robust tests in regression directions. Annales de l'INSEE 59/60:183–218.
  • Godfrey, L. G. (2009). Bootstrap Tests for Regression Models. Basingstoke, UK: Palgrave Macmillan.
  • Godfrey, L. G., Orme, C. D. (2004). Controlling the finite sample significance levels of heteroskedasticity-robust tests of several linear restrictions on regression coefficients. Economics Letters 82:281–287.
  • Godfrey, L. G., Orme, C. D., Santos-Silva, J. M. C. (2006). Simulation-based tests for heteroskedasticity in linear regression models: Some further results. Econometrics Journal 9:76–97.
  • Goncalves, S., Kilian, L. (2004). Bootstrapping autoregressions with conditional heteroskedasticity of unknown form. Journal of Econometrics 123:89–120.
  • Häggström, E., Laitila, T. (2002). Test of random subject effects in heteroskedastic linear models. Biometrical Journal 7:825–834.
  • He, C., Terasvirta, T. (1999). Fourth moment structure of the GARCH(p, q) process. Econometric Theory 15:824–846.
  • Honda, Y. (1985). Testing the error components model with non-normal disturbances. Review of Economic Studies 52:681–690.
  • Ling, S., McAleer, M. (2002). Stationarity and the existence of moments of a family of GARCH processes. Journal of Econometrics 106:109–117.
  • Mammen, E. (1993). Bootstrap and wild bootstrap for high dimensional linear models. Annals of Statistics 21:255–285.
  • Nerlove, M. (1971). Further evidence on the estimation of dynamic economic relations from a time-series of cross-sections. Econometrica 39:359–382.
  • Orme, C. D., Yamagata, T. (2006). The asymptotic distribution of the F-test statistic for individual effects. Econometrics Journal 9:404–422.
  • White, H. (2001). Asymptotic Theory for Econometricians. Revised ed. San Diego, CA: Academic Press.
  • Wooldridge, J. M. (2010). Econometric Analysis of Cross Section and Panel Data. 2nd ed. Cambridge, MA: MIT Press.

APPENDIX

In what follows denotes the Euclidean norm of a matrix C = {c ij }.

Proof of Lemma 1

Write , which are independent, so that and E[W i ] = 0, by Assumption A1(ii). Since , . Thus, by Minkowski's inequality and Assumption A3(i), for some η > 0,

so that . With Assumption A3(ii), a standard (Liapounov) Central Limit Theorem yields .

Proof of Proposition 1

The method of proof is nearly identical to that of (Orme and Yamagata (Citation2006), Proposition 1) but where, now, our assumptions allow for heteroskedasticity.

(i) Let S N  = (RSS R  − RSS U )/(N − 1) and  = RSS U /(N(T − 1) − K), so that

We first show that , so that (since is uniformly positive by Assumption A2(v)) . Following (Orme and Yamagata (Citation2006), Proof of Proposition 1), we can write

because , , and are all O p (1) and . Therefore,
because, by Assumption A2(i) and A1(ii), both terms inside the {·} above are o p (1). Thus, provided , (Equation22) yields
but from exactly the same argument employed by Orme and Yamagata (Citation2006, pp. 418–419) with

Thus, (Equation22) can be expressed as

(ii) By Lemma 1,

and the result follows. This completes the proof.

Proof of Proposition 2

1. First, for , by the Triangle Inequality, , since, as previously noted, and by the arguments of Orme and Yamagata (Citation2006, p. 422).

Second, for , from the proof of Lemma 1, we have that

Therefore, by the Triangle Inequality, it remains to show that . Since, , where , we can write

so that

Now, , and we shall show that so that, by Cauchy–Schwartz, we are done.

Again by Cauchy–Schwartz, if it can be shown that (i) ; and (ii) , and we take each of these in turn:

(i) First, by repeated application of Cauchy–Schwartz, noting that ‖ A‖2 = T(T − 1),

Now, E‖ u i ‖4 is uniformly bounded, by Assumption A3(i), so by Markov's Inequality, , and it suffices to show that .

Now,

so that, by Cauchy–Schwartz, if , for m = 1, 2, 3. Clearly, , by Assumption A4(ii) and, by repeated use of Cauchy–Schwartz,
because , , by an application of Markov's Inequality, Cauchy–Schwartz, and Assumptions A2(ii) and A4(ii). Finally,
where and an application of Markov's Inequality, Minkowski's Inequality, Cauchy–Schwartz, and Assumption A2(ii) yields and . Thus, .

(ii) It immediately follows that , and we are done.

Third, for , by Assumption A3(i) and Minkowski's Inequality, is uniformly bounded, so that . Thus, by the Triangle Inequality, it remains to show that . Since , , we can write

Thus, by Cauchy–Schwartz, it suffices to show that . It will be useful to note that

so that, now, it is sufficient to demonstrate that, , m = 1, 2, 3.

By Cauchy–Schwartz, we have

and

Both and are O p (1), by Markov's Inequality, Minkowski's Inequality, and Assumption A3(i). Thus, it suffices to show that and are both o p (1). The former is identical to , by the proof of 2(i), above, and the latter is o p (1) by Assumptions A2(ii) and A4(ii) and the consistency of . This completes the proof of part 3.

2. As in previous proofs, by Assumption A3(i) and the Triangle Inequality it suffices to show that

Again, since , , we can write

where , and it suffices to show that S Nm  = o p (1), m = 1, 2. Now,

Thus, since , it suffices to show that , or that since . But this is true because

The first term on the right-hand side is o p (1) as are the latter two terms by an application of Cauchy–Schwartz.

Second,

by the preceding result, and this completes the proof.

Proof of Proposition 3

We can write , where and

By Proposition 1, it is sufficient to show that

and
and the result follows.

Establishing the former follows exactly the same argument as in Orme and Yamagata (Citation2006, Proof of Proposition 2), and , was established above. This completes the proof.