Research Article

Exact distribution of the F-statistic under heteroskedasticity of unknown form for improved inference

Pages 1782-1801 | Received 05 May 2020, Accepted 29 Dec 2020, Published online: 14 Jan 2021
 

Abstract

In this paper, we derive the exact finite-sample distribution of the $F$ ($= t^2$) statistic for a single linear restriction on the regression parameters. We show that the $F$ statistic can be expressed as a ratio of quadratic forms, so its exact cumulative distribution under the null hypothesis can be derived from the result of Imhof [Computing the distribution of quadratic forms in normal variables. Biometrika. 1961;48(3/4):419–426]. We compute the exact distribution of the $F$ statistic numerically for various HC covariance matrix estimators and examine the rejection probability under the null hypothesis (size) based on the exact distribution. The results show that the exact finite-sample distribution is remarkably reliable, whereas the use of the F-table leads to serious over-rejection when the sample is not large or the design is leveraged/unbalanced.
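To illustrate the Imhof step mentioned in the abstract, the following minimal Python sketch evaluates the upper-tail probability $P(Q > x)$ of a central quadratic form $Q = \sum_j \lambda_j \chi^2_1$ by numerically integrating Imhof's inversion formula. It is not the authors' code: the function name `imhof_upper_tail`, the midpoint rule, and the default truncation point and step size are assumptions chosen for simplicity, and the truncation is crude when the integrand decays slowly. A ratio of quadratic forms can be reduced to this case because $P(y'Ay / y'By \le c) = P(y'(A - cB)y \le 0)$ when $y'By > 0$, so the eigenvalues may have mixed signs and $x$ may be zero.

```python
import math


def imhof_upper_tail(eigenvalues, x, upper=1000.0, step=0.01):
    """Approximate P(Q > x) for Q = sum_j lambda_j * chi2_1 (central, independent).

    Uses Imhof's (1961) formula
        P(Q > x) = 1/2 + (1/pi) * integral_0^inf sin(theta(u)) / (u * rho(u)) du
    with theta(u) = 0.5 * sum_j atan(lambda_j * u) - 0.5 * x * u
    and  rho(u)   = prod_j (1 + lambda_j^2 * u^2)^(1/4).
    The integral is truncated at `upper` and evaluated with the midpoint rule;
    both choices are assumptions made for this sketch.
    """
    def integrand(u):
        theta = -0.5 * x * u
        log_rho = 0.0
        for lam in eigenvalues:
            theta += 0.5 * math.atan(lam * u)
            log_rho += 0.25 * math.log1p((lam * u) ** 2)
        return math.sin(theta) / (u * math.exp(log_rho))

    n_steps = int(upper / step)
    # Midpoints avoid evaluating the integrand at u = 0.
    total = sum(integrand((k + 0.5) * step) for k in range(n_steps))
    return 0.5 + total * step / math.pi


if __name__ == "__main__":
    # A single unit eigenvalue gives the chi-square(1) tail probability.
    print(imhof_upper_tail([1.0], 3.841458820694124))
```

With a single eigenvalue equal to one, the function reproduces the chi-square tail with one degree of freedom, which is the benchmark case described in the notes.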

JEL Classifications:

Acknowledgements

The authors are grateful for the useful comments from the editor and three anonymous referees.

Disclosure statement

No potential conflict of interest was reported by the author(s).

Notes

Notes: As noted in Remark 3.1, the Infeasible Imhof distribution of the $F$ statistic is $\mathrm{Imhof}(\hat{\Omega}, N\Omega, c)$, obtained from the cumulative distribution function in Equation (15). The Benchmark distribution is computed from the cumulative distribution function in Equation (21). In Tables , the Feasible Imhof distribution is $\mathrm{Imhof}(\hat{\Omega}, N\hat{\Omega}, c)$, obtained with $N\hat{\Omega}$ estimated using the HC estimators in Section 2.

Notes: See Table .

Notes: See Table .

Notes: See Table .

Notes: See Table .

Notes: Reported are the $F$ statistics for testing each of six different single restrictions. Four p-values are computed for each $F$ statistic: one from the Imhof distribution, one from the F-table, and two from different wild bootstrap methods.

1 A literature related to heteroskedasticity-consistent (HC) covariance matrix estimation of the regression coefficients concerns the estimation of the heteroskedastic error variances themselves. Rao [23] proposes the MInimum Norm Quadratic Unbiased Estimation (MINQUE) of the heteroskedastic error variances. Horn et al. [16] propose an ‘almost unbiased’ estimator of the heteroskedastic error variance, which MacKinnon and White [5] use to construct HC2, as appears below in Equation (7). The variance estimator of Horn et al. [16] is almost unbiased in that its asymptotic bias is of order $n^{-1}$ when errors are heteroskedastic, and it is unbiased when errors are homoskedastic.
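As a small illustration of the 'almost unbiased' variance estimator, the Python sketch below computes HC0 and HC2 sandwich variances for the slope of a simple regression with an intercept. It assumes the usual forms, with squared residuals $e_i^2$ for HC0 and $e_i^2/(1-h_{ii})$ for HC2, where $h_{ii}$ is the leverage; Equation (7) is not reproduced on this page, so this form is an assumption. The function name `slope_variance_hc` and the toy data are hypothetical.

```python
def slope_variance_hc(x, y, kind="HC2"):
    """Sandwich variance estimate of the OLS slope in y = a + b*x + e.

    kind="HC0" weights each observation by e_i^2;
    kind="HC2" weights by e_i^2 / (1 - h_ii), with h_ii the leverage.
    """
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    dx = [xi - x_bar for xi in x]
    sxx = sum(d * d for d in dx)
    slope = sum(d * (yi - y_bar) for d, yi in zip(dx, y)) / sxx
    intercept = y_bar - slope * x_bar
    resid = [yi - intercept - slope * xi for xi, yi in zip(x, y)]
    # Leverage of observation i in a regression on a constant and x.
    lev = [1.0 / n + d * d / sxx for d in dx]
    if kind == "HC0":
        weights = [e * e for e in resid]
    elif kind == "HC2":
        weights = [e * e / (1.0 - h) for e, h in zip(resid, lev)]
    else:
        raise ValueError("kind must be 'HC0' or 'HC2'")
    # Slope element of (X'X)^{-1} X' diag(weights) X (X'X)^{-1}.
    return sum(d * d * w for d, w in zip(dx, weights)) / (sxx * sxx)


if __name__ == "__main__":
    # Toy data with one high-leverage point (x = 10).
    x = [0.0, 1.0, 2.0, 3.0, 10.0]
    y = [0.3, 0.9, 2.4, 2.8, 11.0]
    print(slope_variance_hc(x, y, "HC0"), slope_variance_hc(x, y, "HC2"))
```

Because $1/(1-h_{ii}) \ge 1$, the HC2 estimate is never smaller than HC0, and the gap widens for high-leverage observations.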

2 For applications of Imhof [15] in the separate context of studying distributional properties of estimators, see Bao et al. [24], Ullah [25], Nakamura and Nakamura [26], and Farebrother [27], among others.

3 The accurate test size obtained with the exact Imhof distribution would also yield more accurate confidence intervals, which can be computed by inverting the test statistic using the Imhof distribution.

4 The original HC3 in [5, Equation 12] is slightly different. See also [8, p. 233].

5 The notation $h_r$, $r = 1, \ldots, m$, used in Equations (1.1) and (3.2) of [15] is $\nu_j$, $j = 1, \ldots, J$, in this paper. The values of $\delta_r$ and $x$ in Imhof's equations are both zero in this paper.

6 Note that $L$ and $H \equiv \Omega^{1/2} L \Omega^{-1/2} = \Omega^{1/2}(N)\Omega^{1/2}$ have the same characteristic equation and the same eigenvalues, because $|L - \lambda I| = 0$ and $|\Omega^{1/2}|\,|L - \lambda I|\,|\Omega^{-1/2}| = 0$. As $b$ is a scalar, $H = \Omega^{1/2}X(X'X)^{-1}rr'(X'X)^{-1}X'\Omega^{1/2}/b$ is symmetric and idempotent. Hence, the eigenvalues of $H$ and $L$ are either 1's or 0's. Further, $\operatorname{rank}(H) = 1$ because $H = \xi\xi'/b$ with $\xi \equiv \Omega^{1/2}X(X'X)^{-1}r$ being an $n \times 1$ vector. Therefore, only one of the eigenvalues of $H$, or of $L$, is 1 and all the other eigenvalues are 0. Then, (19) and (20) simplify to $\theta(u) = \frac{1}{2}\tan^{-1}(u) - \frac{1}{2}cu$ and $\rho(u) = (1+u^2)^{1/4}$.
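The idempotency and rank claims in Note 6 can be checked in a few lines. The LaTeX sketch below uses only the definitions of $\xi$ and $b$ given in Notes 6 and 7. The last line, which places the simplified $\theta(u)$ and $\rho(u)$ in Imhof's inversion formula, is an assumption about how the benchmark probability is written, since Equations (19) to (21) are not reproduced on this page.

```latex
\begin{align*}
\xi'\xi &= r'(X'X)^{-1}X'\Omega^{1/2}\,\Omega^{1/2}X(X'X)^{-1}r
         = r'(X'X)^{-1}X'\Omega X(X'X)^{-1}r = b, \\
H^2 &= \frac{\xi(\xi'\xi)\xi'}{b^2} = \frac{\xi\xi'}{b} = H,
\qquad \operatorname{tr}(H) = \frac{\xi'\xi}{b} = 1, \\
\theta(u) &= \tfrac{1}{2}\tan^{-1}(u) - \tfrac{1}{2}cu,
\qquad \rho(u) = (1+u^2)^{1/4}, \\
P(\chi^2_1 > c) &= \frac{1}{2} + \frac{1}{\pi}\int_0^\infty
  \frac{\sin\theta(u)}{u\,\rho(u)}\,du .
\end{align*}
```

Since an idempotent matrix has eigenvalues in $\{0, 1\}$ and its trace equals its rank, $\operatorname{tr}(H) = 1$ confirms that exactly one eigenvalue is 1.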

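7 For $y \sim N(X\beta, \Omega)$, $r'(X'X)^{-1}X'y - r'\beta \sim N(0, b)$, where $b = r'(X'X)^{-1}X'\Omega X(X'X)^{-1}r$. Therefore, $\left(\dfrac{r'(X'X)^{-1}X'y - r'\beta}{\sqrt{b}}\right)^2 \sim \chi^2_1$.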

8 Unlike in simulation, where the Benchmark in (21) can be used to evaluate the performance of the Imhof distribution, no such benchmark is available in empirical applications. Thus we compare the feasible Imhof p-values with the wild bootstrap p-values. Many simulation studies suggest that the wild bootstrap performs well (e.g. [28]), while some recent studies, e.g. [29], suggest that alternative variants of the wild bootstrap may perform quite differently. Here, we use the wild bootstrap only to verify that the Imhof procedure works.
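For readers unfamiliar with the wild bootstrap, the Python sketch below computes a bootstrap p-value for the $F$ ($= t^2$) statistic testing a zero slope in a simple regression. The page does not say which two wild bootstrap variants the authors use, so everything here is an assumption: Rademacher weights, residuals from the restricted (null-imposed) fit, an HC0 variance, and the hypothetical function names `slope_f_stat` and `wild_bootstrap_pvalue`.

```python
import random


def slope_f_stat(x, y):
    """F (= t^2) statistic for H0: slope = 0, using an HC0 variance estimate."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    dx = [xi - x_bar for xi in x]
    sxx = sum(d * d for d in dx)
    slope = sum(d * (yi - y_bar) for d, yi in zip(dx, y)) / sxx
    resid = [(yi - y_bar) - slope * d for d, yi in zip(dx, y)]
    var = sum((d * e) ** 2 for d, e in zip(dx, resid)) / (sxx * sxx)
    return slope * slope / var


def wild_bootstrap_pvalue(x, y, n_boot=999, seed=0):
    """Wild bootstrap p-value with Rademacher weights and the null imposed."""
    rng = random.Random(seed)
    f_obs = slope_f_stat(x, y)
    # Under H0 (slope = 0) the restricted fit is the sample mean of y.
    y_bar = sum(y) / len(y)
    resid0 = [yi - y_bar for yi in y]
    exceed = 0
    for _ in range(n_boot):
        # Flip the sign of each restricted residual independently.
        y_star = [y_bar + e * rng.choice((-1.0, 1.0)) for e in resid0]
        if slope_f_stat(x, y_star) >= f_obs:
            exceed += 1
    return (exceed + 1) / (n_boot + 1)


if __name__ == "__main__":
    x = [float(i) for i in range(30)]
    # Strong linear signal plus a small deterministic perturbation.
    y = [2.0 * xi + ((7 * i) % 5 - 2) for i, xi in enumerate(x)]
    print(wild_bootstrap_pvalue(x, y))
```

The p-value is the share of bootstrap statistics at least as large as the observed one, with the usual plus-one correction so that it is never exactly zero.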

Additional information

Funding

This work was supported by the National Natural Science Foundation of China [grant number 71801184], the Natural Science Foundation of Fujian Province [grant number 2018J01114], and research funds from the UCR Academic Senate.
