HAR Inference: Recommendations for Practice

Pages 541-559 | Received 01 Dec 2017, Published online: 02 Nov 2018

ABSTRACT

The classic papers by Newey and West (1987) and Andrews (1991) spurred a large body of work on how to improve heteroscedasticity- and autocorrelation-robust (HAR) inference in time series regression. This literature finds that using a larger-than-usual truncation parameter to estimate the long-run variance, combined with Kiefer-Vogelsang (2002, 2005) fixed-b critical values, can substantially reduce size distortions, at only a modest cost in (size-adjusted) power. Empirical practice, however, has not kept up. This article therefore draws on the post-Newey-West/Andrews literature to make concrete recommendations for HAR inference. We derive truncation parameter rules that choose a point on the size-power tradeoff to minimize a loss function. If Newey-West tests are used, we recommend the truncation parameter rule S = 1.3T^(1/2) and (nonstandard) fixed-b critical values. For tests of a single restriction, we find advantages to using the equal-weighted cosine (EWC) test, in which the long-run variance is estimated by projections onto Type II cosines, using ν = 0.4T^(2/3) cosine terms; for this test, fixed-b critical values are, conveniently, t_ν or F. We assess these rules using, first, an ARMA/GARCH Monte Carlo design, then a dynamic factor model design estimated using 207 quarterly U.S. macroeconomic time series.

ACKNOWLEDGMENTS

We thank Ulrich Müller, Mikkel Plagborg-Møller, Benedikt Pötscher, Yixiao Sun, Tim Vogelsang, and Ken West for helpful comments and/or discussions. Replication files are available at http://www.princeton.edu/~mwatson/.

Notes

1 The Newey-West estimator is implemented in standard econometric software, including Stata and EViews. Among undergraduate textbooks, Stock and Watson (2015, eq. (15.17)) recommends using the Newey-West estimator with the Andrews rule S = 0.75T^(1/3). Wooldridge (2012, sec. 12.5) recommends using the Newey-West estimator with either a rule of thumb for the bandwidth (he suggests S = 4 or 8 for quarterly data, not indexed to the sample size) or either of the rules S = 4(T/100)^(2/9) or S = T^(1/4) (Wooldridge also discusses fixed-b asymptotics and the Kiefer-Vogelsang-Bunzel (Kiefer, Vogelsang, and Bunzel 2000) statistic). Both the Stock-Watson (2015) and Wooldridge (2012) rules yield S = 4 for T = 100, and S = 4 (second Wooldridge rule) or 5 (Stock-Watson and first Wooldridge rule) for T = 200 (rounded up). Dougherty (2011) and Westhoff (2013) recommend Newey-West standard errors but do not mention bandwidth choice. Hill, Griffiths, and Lim (2018) and Hillmer and Hilmer (2014) recommend Newey-West standard errors without discussing bandwidth choice, and their empirical examples, which use quarterly data, respectively set S = 4 and S = 1.

2 Criteria (i) and (ii) restrict our attention to kernel estimators and orthonormal series estimators of Ω, the class considered by LLS. Series estimators, also called orthogonal multitapers, are the sum of squared projections of X_t û_t onto orthogonal series; see Brillinger (1975), Grenander and Rosenblatt (1957), Müller (2004), Stoica and Moses (2005), and Phillips (2005) (Sun (2013) discusses this class and provides additional references). This class includes subsample (batch-mean) estimators, the family considered in Ibragimov and Müller (2010); see LLS for additional discussion. Our requirement of a known fixed-b asymptotic distribution rules out several families of LRV estimators: (a) parametric LRV estimators based on autoregressions (Berk 1974) or vector autoregressions (VARHAC; den Haan and Levin (2000), Sun and Kaplan (2014)); (b) kernel estimators in which parametric models are used to prewhiten the spectrum, followed by nonparametric estimation (e.g., Andrews and Monahan 1992); (c) methods that require conditional adjustments to be positive semidefinite (psd) with probability one, e.g., Politis (2011). Our focus on ease and speed of implementation in standard software leads us away from bootstrap methods (e.g., Gonçalves and Vogelsang 2011).
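To illustrate the series class described above, here is a minimal scalar sketch of an equal-weighted cosine (EWC) long-run variance estimate: the average of squared projections of the series onto the first ν Type II cosines. The basis cos(jπ(t − 1/2)/T) is assumed from the standard Type II cosine (DCT-II) definition, and the function name is ours:

```python
import math

def ewc_lrv(z, nu):
    """Sketch of a scalar EWC long-run variance estimate: average of the
    squared projections of z onto the first nu Type II cosines."""
    T = len(z)
    omega = 0.0
    for j in range(1, nu + 1):
        # Projection coefficient on the j-th Type II cosine, cos(j*pi*(t - 1/2)/T)
        lam = math.sqrt(2.0 / T) * sum(
            math.cos(j * math.pi * (t + 0.5) / T) * z[t] for t in range(T)
        )
        omega += lam * lam
    return omega / nu
```

Because each Type II cosine with j ≥ 1 sums to zero over the sample, this estimate is unchanged by adding a constant to the series, consistent with the location invariance noted in note 10.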

3 All tests considered in this article use integer values of the truncation parameter S and the degrees of freedom ν. Our rounding convention is to round up S for NW. For EWP/EWC, ν = T/S, so we round down ν.
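Combining the abstract's rules with these rounding conventions gives the following sketch (helper names are ours):

```python
import math

def nw_truncation(T):
    """Newey-West truncation parameter, S = 1.3*T^(1/2), rounded up."""
    return math.ceil(1.3 * math.sqrt(T))

def ewc_degrees_of_freedom(T):
    """EWC number of cosine terms, nu = 0.4*T^(2/3), rounded down."""
    return math.floor(0.4 * T ** (2.0 / 3.0))

# For T = 200 quarterly observations:
print(nw_truncation(200))          # 19
print(ewc_degrees_of_freedom(200)) # 13
```

Note how much larger S = 19 is at T = 200 than the S = 4 or 5 produced by the textbook rules in note 1.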

4 If x_t and u_t are independent scalar AR(1)s with coefficients ρ_x and ρ_u, then z_t has the autocovariances of an AR(1) with coefficient ρ_x ρ_u. Throughout, we calibrate dependence through the dependence structure of z_t, which is what enters Ω.
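This claim follows because, for independent zero-mean series, autocovariances of the product factor: γ_z(k) = γ_x(k)γ_u(k), so the autocorrelations of z_t are (ρ_x ρ_u)^k. A small numerical check, assuming zero-mean AR(1)s with unit innovation variances:

```python
rho_x, rho_u = 0.7, 0.5
# Stationary variances of the two AR(1)s (unit innovation variance)
var_x = 1.0 / (1 - rho_x ** 2)
var_u = 1.0 / (1 - rho_u ** 2)

def gamma_z(k):
    """Autocovariance of z_t = x_t * u_t at lag k: the product of the
    AR(1) autocovariances gamma_x(k) and gamma_u(k)."""
    return (var_x * rho_x ** k) * (var_u * rho_u ** k)

# Autocorrelations of z_t decay like an AR(1) with coefficient rho_x * rho_u
for k in range(5):
    assert abs(gamma_z(k) / gamma_z(0) - (rho_x * rho_u) ** k) < 1e-12
```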

5 The theory of optimal testing aims to find the best test among those with the same size. To compare rejection rates under the alternative for two tests with different finite-sample sizes, an apples-to-apples comparison requires adjusting one or both of the tests so that they have the same rejection rate under the null. We elaborate on size adjustment in Section 3.

6 Phillips (2005) showed that the Type II sine transform has the same mean and variance expansion as EWP. We expect that his calculations could be modified to cover Type II cosines.

7 Work on expansions in the GMM setting includes Inoue and Shintani (2006), Sun and Phillips (2009), and Sun (2014b).

8 In general, the higher-order terms in Edgeworth expansions can result in power curves that cross, so that the ranking of the tests depends on the alternative; see Rothenberg (1984). Here, however, the higher-order size-adjusted power depends on the kernel only through ν (see Equation (20)), so the power ranking of size-adjusted tests is the same against all alternatives. Thus using, say, an average power loss criterion would not change the ranking of tests, although it would change the constant in (20).

9 The sampling scheme is described here as drawing from an infinite-dimensional matrix. In practice it is implemented by randomly selecting a starting date τ, randomly selecting whether to start going forward or backward in time, and then obtaining the first T observations of the sequence η̂_τ, η̂_{τ±1}, η̂_{τ±2}, ..., with the direction of time reversed upon hitting t = 1 or t = T.
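The index sequence this scheme generates can be sketched as follows (1-indexed dates; the function name is ours):

```python
import random

def reflected_sample_indices(T, n, seed=0):
    """Generate n dates by picking a random start in 1..T and a random
    initial direction, reversing the direction of time upon hitting
    t = 1 or t = T, per the scheme in note 9."""
    rng = random.Random(seed)
    t = rng.randrange(1, T + 1)   # random starting date
    step = rng.choice([1, -1])    # random initial direction
    idx = []
    for _ in range(n):
        idx.append(t)
        if (t == T and step == 1) or (t == 1 and step == -1):
            step = -step          # reflect at the sample boundary
        t += step
    return idx
```

Every generated index stays in 1..T, and consecutive indices always differ by one, so the drawn sequence preserves local time-series dependence.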

10 As discussed by Hannan (1958), Ng and Perron (1996), Kiefer and Vogelsang (2005), and Sun (2014a), in the location model, estimating the mean produces downward bias in the NW LRV estimator. This bias is accounted for by fixed-b critical values in the case of the location model (and thus these terms do not appear in the higher-order expansions in LLS and this article), but it is not accounted for in the textbook NW case with standard critical values. This issue does not arise with EWP or EWC, or with QS if it is implemented in the frequency domain, because, as discussed above, these estimators are invariant to location shifts in z_t.
