Exact distribution of the F-statistic under heteroskedasticity of unknown form for improved inference: Journal of Statistical Computation and Simulation: Vol 91, No 9


Abstract

In this paper, we derive the exact finite-sample distribution of the F ($= t^{2}$) statistic for a single linear restriction on the regression parameters. We show that the F statistic can be expressed as a ratio of quadratic forms, and therefore its exact cumulative distribution under the null hypothesis can be derived from the result of Imhof [Computing the distribution of quadratic forms in normal variables. Biometrika. 1961;48(3/4):419–426]. A numerical calculation is carried out for the exact distribution of the F statistic using various HC covariance matrix estimators, and the rejection probability under the null hypothesis (size) based on the exact distribution is examined. The results show that the exact finite-sample distribution is remarkably reliable, while, in comparison, the use of the F-table leads to serious over-rejection when the sample is not large or is leveraged/unbalanced.

Keywords:

Heteroskedasticity
finite-sample theory
Imhof distribution

JEL Classifications:

Acknowledgements

The authors are grateful for the useful comments from the editor and three anonymous referees.

Disclosure statement

No potential conflict of interest was reported by the author(s).

Notes

Notes: As noted in Remark 3.1, the Infeasible Imhof distribution of the F statistic, $\mathrm{Imhof} (\hat{Ω}, N Ω, c)$, is obtained from the cumulative distribution function in Equation (15), $\Pr (F \leq c) = \frac{1}{2} - \frac{1}{π} \int_{0}^{\infty} \frac{\sin (\frac{1}{2} \sum_{j = 1}^{J} [\tan^{-1} (λ_{j} u)])}{u \prod_{j = 1}^{J} (1 + λ_{j}^{2} u^{2})^{\frac{1}{4}}} \, du \equiv \mathrm{Imhof} (\hat{Ω}, N Ω, c)$. The Benchmark distribution is computed from the cumulative distribution function in Equation (21), $\Pr (F^{*} \leq c) = \frac{1}{2} - \frac{1}{π} \int_{0}^{\infty} \frac{\sin (\frac{1}{2} \sum_{j = 1}^{J} [\tan^{-1} (λ_{j}^{*} u)] - \frac{1}{2} c u)}{u \prod_{j = 1}^{J} (1 + λ_{j}^{*2} u^{2})^{\frac{1}{4}}} \, du \equiv \mathrm{Benchmark} (Ω, L, c)$. In Tables –, the Feasible Imhof distribution, $\mathrm{Imhof} (\hat{Ω}, N \hat{Ω}, c)$, is obtained with $N \hat{Ω}$ estimated using the HC estimators in Section 2.

Notes: See Table .

Notes: See Table

Notes: Reported are the F statistics for testing each of the six different single restrictions. Four p-values are computed for each F statistic: one from the Imhof distribution, one from the F-table, and two from different variants of the wild bootstrap.

1 A literature related to heteroskedasticity-consistent (HC) covariance matrix estimation of the regression coefficients concerns the estimation of the heteroskedastic error variances themselves. Rao [23] proposes the Minimum Norm Quadratic Unbiased Estimation (MINQUE) of the heteroskedastic error variances. Horn et al. [16] propose an ‘almost unbiased’ estimator of the heteroskedastic error variances, which MacKinnon and White [5] use to construct HC2, which appears below in Equation (7), $\hat{Ω} = \mathrm{diag} \{ \frac{e_{i}^{2}}{1 - h_{ii}} \}$. The variance estimator of Horn et al. [16] is almost unbiased in that its asymptotic bias is of order $n^{-1}$ when the errors are heteroskedastic, and it is unbiased when the errors are homoskedastic.
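As a minimal sketch of the HC2 weights in Equation (7), the following computes $e_{i}^{2} / (1 - h_{ii})$ for a simple regression with an intercept and one regressor, and then forms $F = t^{2}$ for the null hypothesis of a zero slope. The data and the function names (`ols_simple`, `hc2_weights`, `slope_variance`) are hypothetical and chosen only for illustration; the leverage formula used here is specific to the one-regressor case.

```python
# Sketch: HC2 variance weights e_i^2 / (1 - h_ii) for y = a + b*x + error.
# The data are made up for illustration and are not from the paper.

def ols_simple(x, y):
    """Return intercept, slope, residuals and leverages h_ii."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    slope = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / sxx
    intercept = y_bar - slope * x_bar
    residuals = [yi - intercept - slope * xi for xi, yi in zip(x, y)]
    # Diagonal of the hat matrix for a regression with an intercept.
    leverages = [1.0 / n + (xi - x_bar) ** 2 / sxx for xi in x]
    return intercept, slope, residuals, leverages


def hc2_weights(residuals, leverages):
    """Diagonal of the HC2 estimate of the error covariance matrix."""
    return [e * e / (1.0 - h) for e, h in zip(residuals, leverages)]


def slope_variance(x, omega_diag):
    """Sandwich variance of the slope for a diagonal error covariance."""
    n = len(x)
    x_bar = sum(x) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    return sum((xi - x_bar) ** 2 * w for xi, w in zip(x, omega_diag)) / sxx ** 2


x = [1.0, 2.0, 3.0, 4.0, 10.0]  # the last observation has high leverage
y = [1.1, 1.9, 3.2, 3.9, 12.0]
intercept, slope, e, h = ols_simple(x, y)
omega_hc2 = hc2_weights(e, h)
f_stat = slope ** 2 / slope_variance(x, omega_hc2)  # F = t^2 for H0: slope = 0
print(h, omega_hc2, f_stat)
```

The leverages sum to the number of regression parameters (two here), which gives a quick sanity check on the hat-matrix diagonal.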

2 For applications of Imhof [15] in a separate context of studying distributional properties of estimators, see Bao et al. [24], Ullah [25], Nakamura and Nakamura [26], and Farebrother [27], among others.

3 The accurate test size obtained with the exact Imhof distribution would also yield more accurate confidence intervals, which can be computed by inverting the test statistic using the Imhof distribution.
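The test inversion mentioned in this note can be sketched as follows, under stated assumptions: a critical value is found by bisection on a monotone cumulative distribution function, and the interval collects all hypothesized values whose $F = t^{2}$ statistic does not exceed it. The $χ_{1}^{2}$ distribution function is used purely as a stand-in for the Imhof distribution function, and the names `critical_value` and `confidence_interval` are hypothetical.

```python
from math import erf, sqrt


def chi2_1_cdf(c):
    """CDF of a chi-square variable with one degree of freedom (stand-in CDF)."""
    return erf(sqrt(c / 2.0))


def critical_value(cdf, level=0.95, lo=0.0, hi=100.0, tol=1e-10):
    """Solve cdf(c) = level by bisection, assuming cdf is increasing on [lo, hi]."""
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if cdf(mid) < level:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def confidence_interval(estimate, variance, crit):
    """All values b0 with (estimate - b0)^2 / variance <= crit."""
    half_width = sqrt(crit * variance)
    return estimate - half_width, estimate + half_width


crit = critical_value(chi2_1_cdf)
lower, upper = confidence_interval(1.0, 0.25, crit)  # illustrative numbers
print(crit, lower, upper)
```

Replacing `chi2_1_cdf` with a routine that evaluates the Imhof distribution function would give the interval described in the note; that substitution is not shown here.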

4 The original HC3 in [5, Equation 12] is slightly different. See also [8, p. 233].

5 The notation $h_{r}, r = 1, \dots, m$, used in Equations (1.1) and (3.2) of [15] corresponds to $ν_{j}, j = 1, \dots, J$ in this paper. The values of $δ_{r}$ and $x$ in Imhof's equations are both zero in this paper.

6 Note that $L$ and $H \equiv Ω^{1/2} L Ω^{-1/2} = Ω^{1/2} N^{*} Ω^{1/2}$ have the same characteristic equation and hence the same eigenvalues, because $| H - λ I | = | Ω^{1/2} | \, | L - λ I | \, | Ω^{-1/2} | = | L - λ I |$. As $b^{*}$ is a scalar, $H = Ω^{1/2} X (X^{'} X)^{-1} r r^{'} (X^{'} X)^{-1} X^{'} Ω^{1/2} / b^{*}$ is symmetric and idempotent. Hence, the eigenvalues of $H$ and $L$ are each either 1 or 0. Further, rank $(H) = 1$ because $H = ξ ξ^{'} / b^{*}$ with $ξ \equiv Ω^{1/2} X (X^{'} X)^{-1} r$ being an $n \times 1$ vector. Therefore, exactly one eigenvalue of $H$ (and of $L$) is 1 and all the other eigenvalues are 0. Then Equation (19), $θ (u) = \frac{1}{2} \sum_{j = 1}^{J} [ν_{j} \tan^{-1} (λ_{j}^{*} u)] - \frac{1}{2} c u$, and Equation (20), $ρ (u) = \prod_{j = 1}^{J} (1 + λ_{j}^{*2} u^{2})^{\frac{1}{4} ν_{j}}$, simplify to $θ (u) = \frac{1}{2} \tan^{-1} (u) - \frac{1}{2} c u$ and $ρ (u) = (1 + u^{2})^{\frac{1}{4}}$.
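The simplification in this note can be checked numerically. The sketch below evaluates $\frac{1}{2} - \frac{1}{π} \int_{0}^{\infty} \sin θ (u) / (u ρ (u)) \, du$ for a given list of eigenvalues and compares the result, for one eigenvalue equal to 1 and the rest equal to 0, with the $χ_{1}^{2}$ distribution function. The truncation point `upper`, the use of composite Simpson's rule, and the function name `imhof_cdf` are assumptions made for illustration, not the numerical method used in the paper.

```python
from math import atan, erf, pi, prod, sin, sqrt


def imhof_cdf(eigenvalues, c, upper=200.0, steps=100_000):
    """Approximate Pr(Q <= c) for Q = sum_j lambda_j * chi2_1 via Imhof's integral.

    The infinite integral is truncated at `upper` and evaluated with
    composite Simpson's rule; both choices are illustrative assumptions.
    """
    def integrand(u):
        if u == 0.0:
            # Limit of sin(theta(u)) / (u * rho(u)) as u -> 0.
            return 0.5 * (sum(eigenvalues) - c)
        theta = 0.5 * sum(atan(lam * u) for lam in eigenvalues) - 0.5 * c * u
        rho = prod((1.0 + lam * lam * u * u) ** 0.25 for lam in eigenvalues)
        return sin(theta) / (u * rho)

    h = upper / steps  # steps must be even for Simpson's rule
    total = integrand(0.0) + integrand(upper)
    for k in range(1, steps):
        total += (4 if k % 2 else 2) * integrand(k * h)
    return 0.5 - (total * h / 3.0) / pi


c = 3.841458820694124  # 95% point of chi-square with one degree of freedom
approx = imhof_cdf([1.0, 0.0, 0.0, 0.0, 0.0], c)
exact = erf(sqrt(c / 2.0))  # chi-square(1) CDF
print(approx, exact)
```

Zero eigenvalues contribute nothing to $θ (u)$ and a factor of one to $ρ (u)$, so the integrand depends only on the single unit eigenvalue. The two printed values should agree up to the truncation error of the integral.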

7 For $y \sim N (X β, Ω)$ , $r^{'} (X^{'} X)^{- 1} X^{'} y - r^{'} β \sim N (0, b^{*})$ , where $b^{*} = r^{'} (X^{'} X)^{- 1} X^{'} Ω X (X^{'} X)^{- 1} r$ . Therefore, ${(\frac{r^{'} (X^{'} X)^{- 1} X^{'} y - r^{'} β}{\sqrt{b^{*}}})}^{2} \sim χ_{1}^{2} .$

8 Unlike in the simulations, where the Benchmark in Equation (21) can be used to evaluate the performance of the Imhof distribution, we do not have such a benchmark criterion in empirical applications. Thus we compare the feasible Imhof p-values with the wild bootstrap p-values. Many simulation studies have suggested that the wild bootstrap performs well (e.g. [28]), while some recent studies, e.g. [29], have suggested that alternative variants of the wild bootstrap may perform quite differently. Here, we use the wild bootstrap only to verify that the Imhof procedure works.
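For readers unfamiliar with the wild bootstrap, a minimal sketch of one common variant is given below for a regression with an intercept and one regressor. It is an assumption-laden illustration, not the procedure used in the paper: the null hypothesis of a zero slope is imposed when forming residuals, Rademacher ($\pm 1$) weights are used, the slope variance uses HC0, and the names `hc_f_stat` and `wild_bootstrap_pvalue` as well as the simulated data are hypothetical.

```python
import random


def hc_f_stat(x, y, slope_null=0.0):
    """F = t^2 for H0: slope = slope_null, using an HC0 variance for the slope."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    dx = [xi - x_bar for xi in x]
    sxx = sum(d * d for d in dx)
    slope = sum(d * (yi - y_bar) for d, yi in zip(dx, y)) / sxx
    intercept = y_bar - slope * x_bar
    resid = [yi - intercept - slope * xi for xi, yi in zip(x, y)]
    var_slope = sum(d * d * e * e for d, e in zip(dx, resid)) / sxx ** 2
    return (slope - slope_null) ** 2 / var_slope


def wild_bootstrap_pvalue(x, y, reps=999, seed=0):
    """Wild bootstrap p-value for H0: slope = 0 with Rademacher weights."""
    rng = random.Random(seed)
    f_obs = hc_f_stat(x, y)
    # Restricted fit under H0: the fitted value is the sample mean of y.
    y_bar = sum(y) / len(y)
    resid0 = [yi - y_bar for yi in y]
    exceed = 0
    for _ in range(reps):
        y_star = [y_bar + e * rng.choice((-1.0, 1.0)) for e in resid0]
        if hc_f_stat(x, y_star) >= f_obs:
            exceed += 1
    return (exceed + 1) / (reps + 1)


# Simulated heteroskedastic data: error standard deviation grows with x.
data_rng = random.Random(1)
x = [i / 10 for i in range(1, 31)]
y_null = [1.0 + xi * data_rng.gauss(0.0, 1.0) for xi in x]              # slope is 0
y_alt = [1.0 + 2.0 * xi + 0.1 * xi * data_rng.gauss(0.0, 1.0) for xi in x]
p_null = wild_bootstrap_pvalue(x, y_null)
p_alt = wild_bootstrap_pvalue(x, y_alt)
print(p_null, p_alt)
```

With 999 replications the smallest attainable p-value is 0.001, so a strongly violated null shows up as a p-value at or near that floor.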

References

[5] MacKinnon JG, White H. Some heteroskedasticity-consistent covariance matrix estimators with improved finite sample properties. J Econom. 1985;29(3):305–325.

[8] Hausman J, Palmer C. Heteroskedasticity-robust inference in finite samples. Econ Lett. 2012;116(2):232–235.

[15] Imhof JP. Computing the distribution of quadratic forms in normal variables. Biometrika. 1961;48(3–4):419–426.

[16] Horn SD, Horn RA, Duncan DB. Estimating heteroscedastic variances in linear models. J Am Stat Assoc. 1975;70(350):380–385.

[23] Rao CR. Estimation of heteroscedastic variances in linear models. J Am Stat Assoc. 1970;65(329):161–172.

[24] Bao Y, Ullah A, Wang Y. Distribution of the mean reversion estimator in the Ornstein–Uhlenbeck process. Econom Rev. 2017;36(6–9):1039–1056.

[25] Ullah A. Finite sample econometrics. New York (NY): Oxford University Press; 2004.

[26] Nakamura A, Nakamura M. Model specification and endogeneity. J Econom. 1998;83(1–2):213–237.

[27] Farebrother RW. Eigenvalue-free methods for computing the distribution of a quadratic form in normal variables. Statistische Hefte. 1985;26(1):287–302.

[28] Flachaire E. Bootstrapping heteroskedastic regression models: wild bootstrap vs. pairs bootstrap. Comput Stat Data Anal. 2005;49(2):361–376.

[29] Djogbenou AA, MacKinnon JG, Nielsen MO. Asymptotic theory and wild bootstrap inference with clustered errors. J Econom. 2019;212(2):393–412.

Additional information

Funding

This work was supported by the National Natural Science Foundation of China [grant number 71801184], the Natural Science Foundation of Fujian Province [grant number 2018J01114], and research funds from the UCR Academic Senate.
