Theory and Methods

Bootstrap Inference in the Presence of Bias

Received 03 Aug 2022, Accepted 10 Nov 2023, Published online: 09 Jan 2024

Abstract

We consider bootstrap inference for estimators which are (asymptotically) biased. We show that, even when the bias term cannot be consistently estimated, valid inference can be obtained by proper implementations of the bootstrap. Specifically, we show that the prepivoting approach of Beran, originally proposed to deliver higher-order refinements, restores bootstrap validity by transforming the original bootstrap p-value into an asymptotically uniform random variable. We propose two different implementations of prepivoting (plug-in and double bootstrap), and provide general high-level conditions that imply validity of bootstrap inference. To illustrate the practical relevance and implementation of our results, we discuss five examples: (i) inference on a target parameter based on model averaging; (ii) ridge-type regularized estimators; (iii) nonparametric regression; (iv) a location model for infinite variance data; and (v) dynamic panel data models. Supplementary materials for this article are available online.

1 Introduction

Suppose that θ is a scalar parameter of interest and let θ̂n denote an estimator for which
(1.1) Tn := g(n)(θ̂n − θ) →d B + ξ1,
where g(n) is the rate of convergence of θ̂n, ξ1 is a continuous random variable centered at zero, and B is an asymptotic bias (our theory in fact allows for a more general formulation of the bias). A typical example is g(n) = n^{1/2} and ξ1 ∼ N(0, σ²). Unless B can be consistently estimated, which is often difficult or impossible, classic (first-order) asymptotic inference on θ based on quantiles of ξ1 in (1.1) is not feasible. Furthermore, the bootstrap, which is well known to deliver asymptotic refinements over first-order asymptotic approximations as well as bias corrections (Hall 1992; Horowitz 2001; Cattaneo and Jansson 2018, 2022; Cattaneo, Jansson, and Ma 2019), cannot in general be applied to solve the asymptotic bias problem when a consistent estimator of B does not exist. Examples are given below.

Our goal is to justify bootstrap inference based on Tn in the context of asymptotically biased estimators and where a consistent estimator of B does not exist. Consider the bootstrap statistic Tn* := g(n)(θ̂n* − θ̂n), where θ̂n* is a bootstrap version of θ̂n, such that
(1.2) Tn* − B̂n →d* ξ1,
where B̂n is the implicit bootstrap bias and "→d*" denotes weak convergence in probability (defined below). When B̂n − B = op(1), the bootstrap is asymptotically valid in the usual sense that the bootstrap distribution of Tn* is consistent for the asymptotic distribution of Tn; that is, sup_{x∈R} |P*(Tn* ≤ x) − P(Tn ≤ x)| = op(1).

We consider situations where B̂n − Bn is not asymptotically negligible, so the bootstrap fails to replicate the asymptotic bias. For example, this happens when the asymptotic bias term in the bootstrap world includes a random (additive) component; that is,
(1.3) B̂n − B →d ξ2 (jointly with (1.1)),
where ξ2 is a random variable centered at zero. In this case, the bootstrap distribution is random in the limit and hence cannot mimic the asymptotic distribution given in (1.1). Moreover, the distribution of the bootstrap p-value, p̂n := P*(Tn* ≤ Tn), is not asymptotically uniform, and the bootstrap cannot in general deliver hypothesis tests (or confidence intervals) with the desired null rejection probability (or coverage probability).

In this article, we show that in this nonstandard case valid inference can successfully be restored by proper implementation of the bootstrap. This is done by focusing on properties of the bootstrap p-value rather than on the bootstrap as a means of estimating limiting distributions, which is infeasible due to the asymptotic bias. In particular, we show that such implementations lead to bootstrap inferences that are valid in the sense that they provide asymptotically uniformly distributed p-values.

Our inference strategy is based on the fact that, for some bootstrap schemes, the large-sample distribution of the bootstrap p-value, say H(u), u ∈ [0,1], although not uniform, does not depend on B. That is, we can search for bootstrap algorithms which generate bootstrap p-values that, in large samples, are not affected by unknown bias terms. When this is possible, we can make use of the prepivoting approach of Beran (1987, 1988), which, as we will show in this article, allows us to restore bootstrap validity. Specifically, our proposed modified p-value is defined as p̃n := Ĥn(p̂n), where Ĥn(u) is any consistent estimator of H(u), uniformly over u ∈ [0,1]. The (asymptotic) probability integral transform p̂n ↦ H(p̂n), continuity of H(u), and consistency of Ĥn(u) then guarantee that p̃n is asymptotically uniformly distributed. Interestingly, Beran (1987, 1988) proposed this approach to obtain asymptotic refinements for the bootstrap, but did not consider asymptotically biased estimators as we do here.

We propose two approaches to estimating H. First, if H = Hγ, where γ is a finite-dimensional parameter vector, and a consistent estimator γ̂n of γ is available, then a "plug-in" approach setting Ĥn = Hγ̂n can deliver asymptotically uniform p-values. Second, if estimation of γ is difficult (e.g., when γ does not have a closed-form expression), we can use a "double bootstrap" scheme (Efron 1983; Hall 1986), where estimation of H is achieved by resampling from the bootstrap data generated in the first level.

For both methods, we provide general high-level conditions that imply validity of the proposed approach. Our conditions are not specific to a given bootstrap method; rather, they can in principle be applied to any bootstrap scheme satisfying the proposed sufficient conditions for asymptotic validity.

Our approach is related to recent work by Shao and Politis (2013) and Cavaliere and Georgiev (2020). In particular, a common feature is that the distribution function of the bootstrap statistic, conditional on the original data, is random in the limit. Cavaliere and Georgiev (2020) emphasize that randomness of the limiting bootstrap measure does not prevent the bootstrap from delivering an asymptotically uniform p-value (bootstrap "unconditional" validity), and provide results to assess such asymptotic uniformity. Our context is different, since the presence of an asymptotic bias term renders the distribution of the bootstrap p-value nonuniform, even asymptotically. In this respect, our work is related to Shao and Politis (2013), who show that t-statistics based on subsampling or block bootstrap methods with bandwidth proportional to sample size may deliver non-uniformly distributed p-values that, however, can be estimated.

To illustrate the practical relevance of our results and to show how to implement them in applied problems, we consider three examples involving estimators that feature an asymptotic bias term. In the first two examples (model averaging and ridge regression), B is not consistently estimable due to the presence of local-to-zero parameters and the standard bootstrap fails. In the third example (nonparametric regression), the bootstrap fails because B depends on the second-order derivative of the conditional mean function, whose estimation requires the use of a different (suboptimal) bandwidth. In these examples, ξ1 is normal, but g(n) and B are example-specific. Two additional examples are presented in the supplement. The fourth example is a simple location model without the assumption of finite variance, where ξ1 is not normal and estimators converge at an unknown rate. The fifth example considers inference for dynamic panel data models, where B is the incidental parameter bias.

The remainder of the article is organized as follows. In Section 2 we introduce our three leading examples. Section 3 contains our general results, which we apply to the three examples in Section 4. Section 5 concludes. The supplemental material contains two appendices. Appendix A specializes the general theory to the case of asymptotically Gaussian statistics, and Appendix B contains details and proofs for the three leading examples, as well as two additional examples.

Notation

Throughout this article, the notation ∼ indicates equality in distribution. For instance, Z ∼ N(0,1) means that Z is distributed as a standard normal random variable. We write "x := y" and "y =: x" to mean that x is defined by y. The standard Gaussian cumulative distribution function (cdf) is denoted by Φ; U[0,1] is the uniform distribution on [0,1], and I{·} is the indicator function. If F is a cdf, F⁻¹ denotes the generalized inverse, that is, the quantile function, F⁻¹(u) := inf{v ∈ R : F(v) ≥ u}, u ∈ R. Unless specified otherwise, all limits are for n → ∞. For matrices a, b, c with n rows, we let Sab := a′b/n and Sab.c := Sab − Sac Scc⁻¹ Scb, assuming that Scc has full rank.
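The two sample moments Sab and Sab.c recur throughout the examples. For readers who wish to experiment numerically, here is a minimal Python sketch of both quantities; the helper names S_ab and S_ab_c are ours, and the snippet assumes NumPy arrays with n rows (later sketches in this article reuse these helpers).

```python
import numpy as np

def S_ab(a, b):
    # S_ab := a'b / n, for arrays a and b with n rows each
    return a.T @ b / a.shape[0]

def S_ab_c(a, b, c):
    # S_ab.c := S_ab - S_ac S_cc^{-1} S_cb, assuming S_cc has full rank
    return S_ab(a, b) - S_ab(a, c) @ np.linalg.solve(S_ab(c, c), S_ab(c, b))
```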

For a (single-level or first-level) bootstrap sequence, say Yn*, we use Yn* →p* 0, or equivalently Yn* →p 0, in probability, to mean that, for any ε > 0, P*(|Yn*| > ε) →p 0, where P* denotes the probability measure conditional on the original data Dn. An equivalent notation is Yn* = op*(1) (where we omit the qualification "in probability" for brevity). Similarly, for a double (or second-level) bootstrap sequence, say Yn**, we write Yn** = op**(1) to mean that, for all ε > 0, P**(|Yn**| > ε) →p* 0, where P** is the probability measure conditional on the first-level bootstrap data Dn* and on Dn.

We use Yn* →d* ξ, or equivalently Yn* →d ξ, in probability, to mean that, for all continuity points u ∈ R of the cdf of ξ, say G(u) := P(ξ ≤ u), it holds that P*(Yn* ≤ u) − G(u) →p 0. Similarly, for a double bootstrap sequence Yn**, we use Yn** →d* ξ, in probability, to mean that P**(Yn** ≤ u) − G(u) →p* 0 for all continuity points u of G.

2 Examples

In this section we introduce our three leading examples. Example-specific regularity conditions, formally stated results, and additional definitions are given in Appendix B. For each of these examples, we argue that (1.1), (1.2), and (1.3) hold, so that the bootstrap p-values p̂n are not uniformly distributed, rendering standard bootstrap inference invalid. We then return to each example in Section 4, where we discuss how to implement our proposed method and prove its validity.

2.1 Inference after Model Averaging

Setup. We consider inference based on a model averaging estimator obtained as a weighted average of least squares estimators (Hansen 2007). Assume that data are generated according to the linear model
(2.1) y = xβ + Zδ + ε,
where β is the (scalar) parameter of interest and ε is an n-vector of independently and identically distributed random variables with mean zero and variance σ² (henceforth iid (0, σ²)), conditional on W := (x, Z).

The researcher fits a set of M models, each of them based on different exclusion restrictions on the q-dimensional vector δ. This setup allows for model averaging both explicitly and implicitly. The former follows, for example, Hansen (2007). The latter includes the common practice of robustness checks in applied research, where the significance of a target coefficient is evaluated through an (often informal) assessment of its significance across a set of regressions based on different sets of controls; see Oster (2019) and the references therein. Specifically, letting Rm denote a q × qm selection matrix, the mth model includes x and Zm := ZRm as regressors, and the corresponding OLS estimator of β is β̃m,n := Sxx.Zm⁻¹ Sxy.Zm. Given a set of fixed weights ω := (ω1,…,ωM)′ such that ωm ∈ [0,1] and Σ_{m=1}^{M} ωm = 1, the model averaging estimator is β̃n := Σ_{m=1}^{M} ωm β̃m,n. Then Tn := n^{1/2}(β̃n − β) satisfies Tn − Bn →d ξ1 ∼ N(0, v²), where v² > 0 and Bn := Qn n^{1/2} δ, with Qn := Σ_{m=1}^{M} ωm Sxx.Zm⁻¹ SxZ.Zm.
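To fix ideas, the following sketch computes β̃n for given weights. Representing the selection matrices Rm by lists of column indices, the (n,1)-array convention, and the function name are our illustrative choices; the snippet reuses the S_ab helpers sketched in the Notation section.

```python
import numpy as np

def model_avg_beta(y, x, Z, selections, weights):
    # beta_tilde_n = sum_m w_m * beta_tilde_{m,n}, where model m controls for
    # the columns of Z listed in selections[m] (an empty list = no controls)
    estimates = []
    for cols in selections:
        if len(cols) == 0:
            b_m = S_ab(x, y) / S_ab(x, x)
        else:
            Zm = Z[:, cols]                    # Z_m = Z R_m via column selection
            b_m = S_ab_c(x, y, Zm) / S_ab_c(x, x, Zm)
        estimates.append(b_m.item())
    return float(np.dot(weights, estimates))
```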

Thus, the magnitude of the asymptotic bias Bn depends on n^{1/2}δ. If δ is local to zero in the sense that δ = cn^{−1/2} for some vector c ∈ R^q (as in, e.g., Hjort and Claeskens 2003; Liu 2015; Hounyo and Lahiri 2023), then Bn →p B := Qc with Q := plim Qn, so that (1.1) is satisfied with nonzero B in general. Because B depends on c, which is not consistently estimable, we cannot obtain valid inference from a Gaussian distribution based on sample analogues of B and v².

Fixed regressor bootstrap. We generate the bootstrap sample as y* = xβ̂n + Zδ̂n + ε*, where ε*|Dn ∼ N(0, σ̂n² In), (β̂n, δ̂n, σ̂n²) is the OLS estimator from the full model, and Dn = {y, W}. Similar results can be established for the nonparametric bootstrap, where ε* is resampled from the full-model residuals. The bootstrap model averaging estimator is given by β̃n* := Σ_{m=1}^{M} ωm β̃*m,n, where β̃*m,n := Sxx.Zm⁻¹ Sxy*.Zm. Letting Tn* := n^{1/2}(β̃n* − β̂n), we can show that (1.2) holds with B̂n := Qn n^{1/2} δ̂n such that, as in (1.3),
B̂n − Bn = Qn n^{1/2}(δ̂n − δ) →d ξ2 ∼ N(0, v2²), v2² > 0,
given in particular the asymptotic normality of n^{1/2}(δ̂n − δ). Because the bias term in the bootstrap world is random in the limit, the conditional distribution of Tn* is also random in the limit, and in particular does not mimic the asymptotic distribution of the original statistic Tn.
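The FRB is straightforward to simulate. A hedged sketch (names are ours; it reuses model_avg_beta above and assumes y and x are (n,1) arrays) that returns Monte Carlo draws of Tn*:

```python
import numpy as np

def frb_draws(y, x, Z, selections, weights, B=999, rng=None):
    # Fixed regressor bootstrap: regressors held fixed, errors drawn as
    # N(0, sigma2_hat); each draw recomputes T_n* = sqrt(n)(beta_tilde* - beta_hat)
    rng = np.random.default_rng() if rng is None else rng
    n = len(y)
    W = np.hstack([x, Z])
    coef, *_ = np.linalg.lstsq(W, y, rcond=None)   # full-model OLS (beta_hat, delta_hat)
    beta_hat = coef[0].item()
    fitted = W @ coef
    resid = y - fitted
    sigma2_hat = (resid ** 2).sum() / (n - W.shape[1])
    draws = np.empty(B)
    for b in range(B):
        y_star = fitted + rng.normal(0.0, np.sqrt(sigma2_hat), size=(n, 1))
        beta_star = model_avg_beta(y_star, x, Z, selections, weights)
        draws[b] = np.sqrt(n) * (beta_star - beta_hat)
    return draws
```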

Pairs bootstrap. Consider now a pairs (random design) bootstrap sample {yt*, xt*, zt*; t = 1,…,n}, obtained by resampling with replacement from the tuples {yt, xt, zt; t = 1,…,n}. As is standard, it is useful to recall that the bootstrap data have the representation y* = x*β̂n + Z*δ̂n + ε*, where ε* = (ε1*,…,εn*)′ and εt* is an iid draw from ε̂t := yt − xtβ̂n − zt′δ̂n. The pairs bootstrap model averaging estimator is β̃n* := Σ_{m=1}^{M} ωm β̃*m,n with β̃*m,n := Sx*x*.Zm*⁻¹ Sx*y*.Zm* and Zm* = Z*Rm. The pairs bootstrap statistic is then Tn* := n^{1/2}(β̃n* − β̂n) = Bn* + n^{1/2} Σ_{m=1}^{M} ωm Sx*x*.Zm*⁻¹ Sx*ε*.Zm*, where Bn* := Σ_{m=1}^{M} ωm Sx*x*.Zm*⁻¹ Sx*Z*.Zm* n^{1/2} δ̂n.

Therefore, and in contrast with the fixed regressor bootstrap (FRB), the term Bn* is stochastic under the bootstrap probability measure and replaces the bias term B̂n. This difference is not innocuous: it implies that Tn* − B̂n no longer replicates the asymptotic distribution of Tn − Bn, and (1.2) does not hold. This does not prevent our method from working, but it requires a different set of conditions, which we give in Section 3.5.

2.2 Ridge Regression

Setup. We consider estimation of a vector of regression parameters through regularization; in particular, by using a ridge estimator. The model is yt = θ′xt + εt, t = 1,…,n, where xt is a p×1 non-stochastic vector and εt ∼ iid (0, σ²). Interest is in testing H0: g′θ = r, based on ridge estimation of θ. Specifically, the ridge estimator has the closed-form expression θ̃n = S̃xx⁻¹ Sxy, where S̃xx := Sxx + n⁻¹ cn Ip and cn is a tuning parameter that controls the degree of shrinkage toward zero. Clearly, cn = 0 corresponds to the OLS estimator, θ̂n. We are interested in the case where the regressors have limited explanatory power, that is, where θ = δn^{−1/2} is local to zero, which can in fact be taken as a motivation for shrinkage toward zero and hence for ridge estimation. To test H0, we consider the test statistic Tn := n^{1/2}(g′θ̃n − r). If n⁻¹cn → c0 ≥ 0 (as in, e.g., Fu and Knight 2000) then, under the null, it holds that Tn − Bn →d ξ1 ∼ N(0, v²), where
Bn := −cn n^{−1/2} g′S̃xx⁻¹θ = −cn n⁻¹ g′S̃xx⁻¹δ → B := −c0 g′Σ̃xx⁻¹δ,
with Σ̃xx := Σxx + c0 Ip and Σxx := lim Sxx. Hence, for c0 > 0, θ̃n is asymptotically biased and the bias term cannot be consistently estimated. Consequently, (1.1) is satisfied, and inference based on the quantiles of the N(0, v²) distribution is invalid unless c0 = 0.
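For reference, the closed-form expression translates directly into code. A minimal sketch (function name ours; it reuses the S_ab helper from the Notation section):

```python
import numpy as np

def ridge_estimator(y, X, c_n):
    # theta_tilde_n = (S_xx + (c_n / n) I_p)^{-1} S_xy; c_n = 0 recovers OLS
    n, p = X.shape
    S_xx_tilde = S_ab(X, X) + (c_n / n) * np.eye(p)
    return np.linalg.solve(S_xx_tilde, S_ab(X, y))
```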

Bootstrap. Consider a pairs (random design) bootstrap sample {yt*, xt*; t = 1,…,n} built by iid resampling from the tuples {yt, xt; t = 1,…,n}. The bootstrap analogue of the ridge estimator is θ̃n* := S̃x*x*⁻¹ Sx*y*, where S̃x*x* := Sx*x* + n⁻¹ cn Ip. The bootstrap statistic is Tn* := n^{1/2} g′(θ̃n* − θ̂n), which is centered using θ̂n to guarantee that εt* and xt* are uncorrelated in the bootstrap world. Because we have used a pairs bootstrap, we now have Tn* − Bn* →d* ξ1 for Bn* := −cn n^{−1/2} g′S̃x*x*⁻¹θ̂n. However, Bn* − B̂n = op*(1) with B̂n := −cn n^{−1/2} g′S̃xx⁻¹θ̂n, such that Tn* − B̂n still satisfies (1.2). Then (1.3) holds with
B̂n − Bn = −cn n⁻¹ g′S̃xx⁻¹ n^{1/2}(θ̂n − θ) →d ξ2 ∼ N(0, v2²), v2² > 0,
so the bootstrap fails to approximate the asymptotic distribution of Tn (see also Chatterjee and Lahiri 2010, 2011).

2.3 Nonparametric Regression

Setup. Consider the model
(2.2) yt = β(xt) + εt, t = 1,…,n,
where β(·) is a smooth function and εt ∼ iid (0, σ²). For simplicity, we consider a fixed-design model; that is, xt = t/n. The goal is inference on β(x) for a fixed x ∈ (0,1). We apply the standard Nadaraya–Watson (fixed-design) estimator β̂h(x) := (nh)⁻¹ Σ_{t=1}^{n} K((xt − x)/h) yt, where h = cn^{−1/5} for some c > 0 is the MSE-optimal bandwidth and K is the kernel function. We do not consider the more general local polynomial regression case, although we conjecture that very similar results will hold; we leave that case for future research. The statistic Tn := (nh)^{1/2}(β̂h(x) − β(x)) satisfies Tn − Bn →d ξ1 ∼ N(0, v²), where v² := σ²∫K(u)²du > 0 and
(2.3) Bn := (nh)^{1/2}((nh)⁻¹ Σ_{t=1}^{n} kt β(xt) − β(x))
with kt := K((xt − x)/h). The bias Bn satisfies
(2.4) Bn = (nh)^{1/2}(h²β″(x)κ2/2 + o(h²)) → B := c^{5/2} β″(x) κ2/2,
where κ2 := ∫u²K(u)du and β″(x) denotes the second-order derivative of β(x). Thus, (1.1) is satisfied. Estimating B or Bn is challenging because it involves estimating β″(x), and although theoretically valid estimators exist, they perform poorly in finite samples. This issue is pointed out by Calonico, Cattaneo, and Titiunik (2014) and Calonico, Cattaneo, and Farrell (2018), who propose more accurate bias correction techniques specifically for regression discontinuity designs and nonparametric curve estimation.
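The estimator itself is a one-line computation. A small sketch under the fixed design xt = t/n; the function name and the Gaussian kernel used in the example call are our choices:

```python
import numpy as np

def nw_fixed_design(y, x0, h, K):
    # beta_hat_h(x0) = (nh)^{-1} sum_t K((x_t - x0)/h) y_t, with x_t = t/n
    n = len(y)
    xt = np.arange(1, n + 1) / n
    return float(np.sum(K((xt - x0) / h) * y) / (n * h))

gauss = lambda u: np.exp(-u ** 2 / 2) / np.sqrt(2 * np.pi)
# example: nw_fixed_design(y, 0.5, c * len(y) ** (-0.2), gauss) for some c > 0
```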

Bootstrap. The (parametric) bootstrap sample is generated as yt* = β̂h(xt) + εt*, t = 1,…,n, where εt*|Dn ∼ iid N(0, σ̂n²), with Dn = {yt; t = 1,…,n} and σ̂n² a consistent estimator of σ²; for example, the residual variance. Let β̂h*(x) := (nh)⁻¹ Σ_{t=1}^{n} kt yt* and Tn* := (nh)^{1/2}(β̂h*(x) − β̂h(x)). Then (1.2) is satisfied with B̂n := (nh)^{1/2}((nh)⁻¹ Σ_{t=1}^{n} kt β̂h(xt) − β̂h(x)).

Because h = cn^{−1/5}, (1.3) holds with
B̂n − Bn = (nh)^{1/2}((nh)⁻¹ Σ_{t=1}^{n} kt(β̂h(xt) − β(xt)) − (β̂h(x) − β(x))) →d ξ2 ∼ N(0, v2²),
where v2² > 0, so the bootstrap is invalid. Two possible solutions to this problem are to generate the bootstrap sample as yt* = β̂g(xt) + εt*, where g is an oversmoothing bandwidth satisfying ng⁵ → ∞ (e.g., Härdle and Marron 1991), or to center the bootstrap statistic at its expected value and add a consistent estimator of B (e.g., Härdle and Bowman 1988; Eubank and Speckman 1993). Both approaches require selecting two bandwidths, which is not straightforward. An alternative approach, suggested by Hall and Horowitz (2013), focuses on an asymptotic theory-based confidence interval and applies the bootstrap to calibrate its coverage probability. However, this requires an additional averaging step across a grid of x (their step 6) to asymptotically eliminate ξ2, and it results in an asymptotically conservative interval. Finally, a non-bootstrap-based solution is undersmoothing using a bandwidth h satisfying nh⁵ → 0, although of course that is not MSE-optimal and may result in trivial power against certain local alternatives; see Section 4.3.

3 General Results

3.1 Framework and Invalidity of the Standard Bootstrap

The general framework is as follows. We have a statistic Tn, defined as a general function of a sample Dn, for which we would like to compute a valid bootstrap p-value. Usually, Tn is a test statistic or a (possibly normalized) parameter estimator; for example, Tn = g(n)(θ̂n − θ0). Let Dn* denote the bootstrap sample, which depends on the original data and on some auxiliary bootstrap variates (which we assume are defined jointly with Dn on a possibly extended probability space). Let Tn* denote the bootstrap version of Tn computed on Dn*; for example, Tn* = g(n)(θ̂n* − θ̂n). Let L̂n(u) := P*(Tn* ≤ u), u ∈ R, denote its distribution function, conditional on the original data. The bootstrap p-value is defined as p̂n := P*(Tn* ≤ Tn) = L̂n(Tn).
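In practice the conditional probability P*(Tn* ≤ Tn) is approximated by Monte Carlo over B bootstrap draws of Tn*; a minimal sketch (names ours):

```python
import numpy as np

def bootstrap_pvalue(T_n, T_star_draws):
    # p_hat_n = P*(T_n* <= T_n), approximated by the fraction of draws <= T_n
    return float(np.mean(np.asarray(T_star_draws) <= T_n))

# e.g., with the model averaging example: bootstrap_pvalue(T_n, frb_draws(...))
```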

First-order asymptotic validity of p̂n requires that p̂n converges in distribution to a standard uniform distribution; that is, that p̂n →d U[0,1]. In this section we focus on a class of statistics Tn and Tn* for which this condition is not necessarily satisfied. The main reason is the presence of an additive "bias" term Bn that contaminates the distribution of Tn and cannot be replicated by the bootstrap distribution of Tn*.

Assumption 1.

Tn − Bn →d ξ1, where ξ1 is centered at zero and the cdf G(u) = P(ξ1 ≤ u) is continuous and strictly increasing over its support.

When Bn converges to a nonzero constant B, Assumption 1 can be written as Tn →d B + ξ1, as in (1.1). If Tn is a normalized version of a (scalar) parameter estimator, that is, Tn = g(n)(θ̂n − θ0), then we can think of B as the asymptotic bias of θ̂n because ξ1 is centered at zero. Although we allow for the possibility that Bn does not have a limit (and it may even diverge), we will still refer to Bn as a "bias term." More generally, in Assumption 1 we cover any statistic Tn that is not necessarily Gaussian (even asymptotically) and whose limiting distribution is G only after we subtract the sequence Bn. The limiting distribution G may depend on a parameter, such that Tn − Bn is not an asymptotic pivot.

Inference based on the asymptotic distribution of Tn requires estimating Bn and any parameter in G. Alternatively, we can use the bootstrap to bypass parameter estimation and directly compute a bootstrap p-value that relies on Tn and Tn* alone; that is, we consider p̂n := P*(Tn* ≤ Tn). A set of high-level conditions on Tn and Tn* that allow us to derive the asymptotic properties of this p-value is described next.

Assumption 2.

For some Dn-measurable random variable B̂n, it holds that: (i) Tn* − B̂n →d* ξ1, where ξ1 is described in Assumption 1; (ii) (Tn − Bn, B̂n − Bn)′ →d (ξ1, ξ2)′, where ξ2 is centered at zero and F(u) = P(ξ1 − ξ2 ≤ u) is a continuous cdf.

Assumption 2(i) states that Tn* − B̂n converges in distribution to a random variable ξ1 having the same distribution function G as Tn − Bn (see Note 1). Thus, B̂n can be thought of as an implicit bootstrap bias that affects the statistic Tn*, in the same way that Bn affects the original statistic Tn. Assumption 2(ii) complements Assumption 1 by requiring the joint convergence of Tn − Bn and B̂n − Bn towards ξ1 and ξ2, respectively; see also (1.1)–(1.3).

Given Assumption 2(i), we could use the bootstrap distribution of Tn* − B̂n to approximate the distribution of Tn − Bn. Since Bn is typically unknown, this result is not very useful for inference unless B̂n is consistent for Bn. In this case, Assumption 2 together with Assumption 1 imply that p̂n is asymptotically distributed as U[0,1]. This follows by noting that if B̂n − Bn = op(1), then ξ2 = 0 a.s., implying that F(u) = G(u). Consequently,
p̂n := P*(Tn* ≤ Tn) = P*(Tn* − B̂n ≤ Tn − B̂n) = G(Tn − B̂n) + op(1) (by Assumption 2(i))
→d G(ξ1 − ξ2) (by Assumption 2(ii) and continuity of G)
∼ U[0,1],
where the last distributional equality holds by F = G and the probability integral transform. However, this result does not hold if B̂n − Bn does not converge to zero in probability. Specifically, if B̂n − Bn →d ξ2 (jointly with Tn − Bn →d ξ1), then Tn − B̂n = (Tn − Bn) − (B̂n − Bn) →d ξ1 − ξ2 ∼ F⁻¹(U[0,1]) under Assumptions 1 and 2(ii). When ξ2 is nondegenerate, F ≠ G, implying that p̂n = G(Tn − B̂n) + op(1) is not asymptotically distributed as a standard uniform random variable. This result is summarized in the following theorem.

Theorem 3.1.

Suppose Assumptions 1 and 2 hold. Then p̂n →d G(F⁻¹(U[0,1])).

Proof.

First notice that p̂n and G(Tn − B̂n) have the same asymptotic distribution because Assumption 2(i) and continuity of G imply that, by Polya's theorem, |p̂n − G(Tn − B̂n)| ≤ sup_{u∈R} |P*(Tn* − B̂n ≤ u) − G(u)| →p 0.

Next, by Assumption 2(ii), Tn − B̂n →d ξ1 − ξ2, such that G(Tn − B̂n) →d G(ξ1 − ξ2) by continuity of G and the continuous mapping theorem. Since ξ1 − ξ2 has continuous cdf F, it holds that ξ1 − ξ2 ∼ F⁻¹(U[0,1]), which completes the proof. □

Remark 3.1.

The value of B̂n in Assumption 2(i) depends on the chosen bootstrap algorithm. It is possible that B̂n →p 0 for some bootstrap algorithms; examples are given in Remark B.2 and Appendix B.5. If this is the case, then ξ2 = −B a.s., which implies that F(u) := P(ξ1 − ξ2 ≤ u) = P(ξ1 ≤ u − B) = G(u − B),

and hence Assumption 2(ii) is not satisfied. In this case the bootstrap p-value satisfies p̂n →d G(G⁻¹(U[0,1]) + B).

Note that this distribution is uniform only if B = 0. Hence, the p-value depends on B, even in the limit.

Remark 3.2.

Under Assumptions 1 and 2, standard bootstrap (percentile) confidence sets are also in general invalid. Consider, for example, the case where Tn = g(n)(θ̂n − θ0) and Tn* is its bootstrap analogue with (conditional) distribution function L̂n(u). A right-sided confidence set for θ0 at nominal confidence level 1 − α ∈ (0,1) can be obtained as (e.g., Horowitz 2001, p. 3171) CIn^{1−α} := [θ̂n − g(n)⁻¹ q̂n(1−α), +∞), where q̂n(1−α) := L̂n⁻¹(1−α). Then
P(θ0 ∈ CIn^{1−α}) = P(θ̂n − g(n)⁻¹ q̂n(1−α) ≤ θ0) = P(Tn ≤ q̂n(1−α)) = P(L̂n(Tn) ≤ 1−α) = P(p̂n ≤ 1−α),
which does not converge to 1 − α because, by Theorem 3.1, p̂n is not asymptotically uniformly distributed.

Remark 3.3.

It is worth noting that, under Assumptions 1 and 2, the bootstrap (conditional) distribution is random in the limit whenever ξ2 is nondegenerate. Specifically, assume for simplicity that Bn →p B. Recall that L̂n(u) := P*(Tn* ≤ u), u ∈ R, and let Ĝn(u) := P*(Tn* − B̂n ≤ u). It then holds that L̂n(u) = Ĝn(u − B̂n) = G(u − B − (B̂n − B)) + ân(u), where |ân(u)| ≤ sup_{u∈R} |Ĝn(u) − G(u)| = op(1) by Assumption 2(i), continuity of G, and Polya's theorem. Because B̂n − B →d ξ2, it follows that when ξ2 is nondegenerate, L̂n(u) →w G(u − B − ξ2), where →w denotes weak convergence of cdf's as (random) elements of a function space (see Cavaliere and Georgiev 2020). The presence of ξ2 in G(u − B − ξ2) makes this a random cdf (see Note 2). Therefore, the bootstrap is unable to mimic the asymptotic distribution of Tn, which is G(u − B) by Assumption 1.

Next, we describe two possible solutions to the invalidity of the standard bootstrap p-value p̂n. One relies on the prepivoting approach of Beran (1987, 1988); see Section 3.2. The basic idea is to modify p̂n by applying the mapping p̂n ↦ H(p̂n), where H(u) is the asymptotic cdf of p̂n; this makes the modified p-value H(p̂n) asymptotically standard uniform. Contrary to Beran (1987, 1988), who proposed prepivoting as a way of providing asymptotic refinements for the bootstrap, here we show how to use prepivoting to solve the invalidity of the standard bootstrap p-value p̂n. This result is new in the bootstrap literature. The second approach relies on computing a standard bootstrap p-value based on the modified statistic given by Tn* − B̂n; see Section 3.4. Thus, we modify the test statistic rather than the way we compute the bootstrap p-value.

3.2 Prepivoting

Theorem 3.1 implies that
P(p̂n ≤ u) → P(G(F⁻¹(U[0,1])) ≤ u) = P(U[0,1] ≤ F(G⁻¹(u))) = F(G⁻¹(u)) =: H(u)
uniformly over u ∈ [0,1] by Polya's theorem, given the continuity of G and F. Although H is not the uniform distribution unless G = F, it is continuous because G is strictly increasing. Thus, the following corollary to Theorem 3.1 holds by the probability integral transform.

Corollary 3.1.

Under the conditions of Theorem 3.1, H(p̂n) →d U[0,1].

Therefore, the mapping of p̂n into H(p̂n) transforms p̂n into a new p-value, H(p̂n), whose asymptotic distribution is the standard uniform distribution on [0,1]. Inference based on H(p̂n) is generally infeasible, because we do not observe H(u). However, if we can replace H(u) with a uniformly consistent estimator Ĥn(u), then this approach will deliver a feasible modified p-value p̃n := Ĥn(p̂n). Since the limit distribution of p̃n is the standard uniform distribution, p̃n is an asymptotically valid p-value. The mapping of p̂n into p̃n = Ĥn(p̂n) by the estimated distribution of the former corresponds to what Beran (1987) calls "prepivoting." In the following sections, we describe two methods of obtaining a consistent estimator of H(u).

Remark 3.4.

The prepivoting approach can also be used to solve the invalidity of confidence sets based on the standard bootstrap; see Remark 3.2. In particular, replace the nominal level 1 − α by Ĥn⁻¹(1 − α) and consider CĨn^{1−α} := [θ̂n − g(n)⁻¹ q̂n(Ĥn⁻¹(1−α)), +∞). Then
P(θ0 ∈ CĨn^{1−α}) = P(p̂n ≤ Ĥn⁻¹(1−α)) = P(Ĥn(p̂n) ≤ 1−α) → 1−α,
where the last convergence is implied by Corollary 3.1 and consistency of Ĥn.

Remark 3.5.

Corollary 3.1 can also be applied to right-tailed or two-tailed tests. The right-tailed p-value, say p̂n,r := P*(Tn* > Tn) = 1 − L̂n(Tn) = 1 − p̂n, has cdf P(p̂n,r ≤ u) = P(p̂n ≥ 1 − u) = 1 − P(p̂n < 1 − u) = 1 − H(1 − u) + o(1) uniformly in u. Note that, because the conditional cdf of Tn* is continuous in the limit, the p-value p̂n,r is asymptotically equivalent to P*(Tn* ≥ Tn). Thus, by Corollary 3.1, the modified right-tailed p-value, p̃n,r := 1 − Ĥn(1 − p̂n,r), satisfies p̃n,r = 1 − H(1 − p̂n,r) + op(1) = 1 − H(p̂n) + op(1) →d U[0,1].

Similarly, for two-tailed tests, the equal-tailed bootstrap p-value, p̃n,et := 2 min{p̃n, p̃n,r} = 2 min{p̃n, 1 − p̃n}, satisfies p̃n,et →d U[0,1] by Corollary 3.1 and the continuous mapping theorem.

3.2.1 Plug-in Approach

Suppose H(u) = Hγ(u) depends on a finite-dimensional parameter, γ. In view of Theorem 3.1, a simple approach to estimating H(u) is to use Ĥn(u) = Hγ̂n(u), where γ̂n denotes a consistent estimator of γ. This leads to a plug-in modified p-value defined as p̃n = Hγ̂n(p̂n).

By consistency of γ̂n and under the assumption that Hγ is continuous in γ, it follows immediately that p̃n = H(p̂n) + op(1) →d F(G⁻¹(G(F⁻¹(U[0,1])))) = U[0,1].

This result is summarized next.

Corollary 3.2.

Let Assumptions 1 and 2 hold, and suppose Hγ(u) is continuous in (γ, u). If γ̂n →p γ, then p̃n = Hγ̂n(p̂n) →d U[0,1].
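For the asymptotically Gaussian examples of Section 4, H takes the form Hγ(u) = Φ(m⁻¹Φ⁻¹(u)) with γ = m, so the plug-in transform is a one-liner. A sketch assuming SciPy (function name ours):

```python
from scipy.stats import norm

def plugin_pvalue(p_hat, m_hat):
    # p_tilde_n = Phi(m_hat^{-1} Phi^{-1}(p_hat_n)), i.e., H_gamma_hat(p_hat_n)
    return norm.cdf(norm.ppf(p_hat) / m_hat)
```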

The plug-in approach relies on a consistent estimator of the asymptotic distribution H, but does not require estimating the “bias term” Bn. When estimating γ is simple, this approach is attractive since it does not require any double resampling. Examples are given in Section 4. However, computation of γ is case-specific and may be cumbersome in practice. An automatic approach is to use the bootstrap to estimate H(u), as we describe next.

3.2.2 Double Bootstrap

Following Beran (1987, 1988), we can estimate H(u) with the bootstrap. That is, we let Ĥn(u) = P*(p̂n* ≤ u), where p̂n* is the bootstrap analogue of p̂n. Since p̂n* is itself a bootstrap p-value, computing it requires a double bootstrap. In particular, let Dn** denote a further bootstrap sample of size n based on Dn* and some additional bootstrap variates (defined jointly with Dn and Dn* on a possibly extended probability space), and let Tn** denote the bootstrap version of Tn computed on Dn**. With this notation, the second-level bootstrap p-value is defined as p̂n* := P**(Tn** ≤ Tn*), where P** denotes the bootstrap probability measure conditional on Dn and Dn* (making p̂n* a function of Dn and Dn*). This leads to a double bootstrap modified p-value, given by p̃n := Ĥn(p̂n) = P*(p̂n* ≤ p̂n).
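Operationally, p̃n is approximated with B first-level and C second-level resamples. The following generic skeleton is a sketch only: the callables statistic and resample (and all names) are our abstractions of whatever bootstrap scheme is in use.

```python
import numpy as np

def double_bootstrap_pvalue(data, statistic, resample, B=499, C=249, rng=None):
    # statistic(data) returns T computed on a dataset;
    # resample(data, rng) returns one bootstrap dataset drawn from `data`
    rng = np.random.default_rng() if rng is None else rng
    T_n = statistic(data)
    T_star = np.empty(B)
    p_star = np.empty(B)
    for b in range(B):
        data_star = resample(data, rng)
        T_star[b] = statistic(data_star)
        # second level: resample from the first-level bootstrap data D_n*
        T_2star = np.array([statistic(resample(data_star, rng)) for _ in range(C)])
        p_star[b] = np.mean(T_2star <= T_star[b])   # p_hat_n* = P**(T_n** <= T_n*)
    p_hat = np.mean(T_star <= T_n)                  # first-level p-value p_hat_n
    return float(np.mean(p_star <= p_hat))          # p_tilde_n = H_hat_n(p_hat_n)
```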

In order to show that p̃n = Ĥn(p̂n) →d U[0,1], we add the following assumption.

Assumption 3.

Let ξ1 and ξ2 be as defined in Assumptions 1 and 2. For some (Dn, Dn*)-measurable random variable B̂n*, it holds that: (i) Tn** − B̂n* →d* ξ1, in probability, and (ii) Tn* − B̂n* →d* ξ1 − ξ2.

Assumption 3 complements Assumptions 1 and 2 by imposing high-level conditions on the second-level bootstrap statistics. Specifically, Assumption 3(i) assumes that Tn** has asymptotic distribution G only after we subtract B̂n*. This term is the second-level bootstrap analogue of B̂n. It depends only on the first-level bootstrap data Dn* and is not random under P**. The second part of Assumption 3 follows from Assumption 2 in the special case that B̂n* − B̂n = op*(1), in probability; that is, when ξ2 = 0 a.s., implying F = G. When F ≠ G, B̂n* is not a consistent estimator of B̂n. However, under Assumption 3,
Tn* − B̂n* = (Tn* − B̂n) − (B̂n* − B̂n) →d* ξ1 − ξ2 ∼ F⁻¹(U[0,1]),
implying that Tn* − B̂n* mimics the distribution of Tn − B̂n. This suffices for proving the asymptotic validity of the double bootstrap modified p-value, p̃n = Ĥn(p̂n), as proved next.

Theorem 3.2.

Under Assumptions 1, 2, and 3, it holds that p̃n = Ĥn(p̂n) →d U[0,1].

Proof.

To prove this result, recall that Ĥn(u) = P*(p̂n* ≤ u) and that P(p̂n ≤ u) → H(u) = F(G⁻¹(u)) uniformly in u ∈ R, since H is a continuous distribution function by Assumptions 1 and 2. Thus, we have that
p̂n* = P**(Tn** ≤ Tn*) = P**(Tn** − B̂n* ≤ Tn* − B̂n*) = G(Tn* − B̂n*) + op*(1), by Assumption 3(i),
= G(F⁻¹(U[0,1])) + op*(1), by Assumption 3(ii),
where G(F⁻¹(U[0,1])) is a random variable whose distribution function is H. Hence, sup_{u∈R} |Ĥn(u) − H(u)| = op(1).

Since H(p̂n) →d U[0,1], we can conclude that p̃n = Ĥn(p̂n) →d U[0,1]. □

Theorem 3.2 shows that prepivoting the standard bootstrap p-value p̂n by applying the mapping Ĥn transforms it into an asymptotically uniformly distributed random variable. This result holds under Assumptions 1, 2, and 3, independently of whether G = F or not. When G = F, then p̂n →d U[0,1] (as implied by Theorem 3.1). In this case, the prepivoting approach is not necessary to obtain a first-order asymptotically valid test, although it might help further reduce the size distortion of the test. This corresponds to the setting of Beran (1987, 1988), where prepivoting was proposed as a way of reducing the level distortions of confidence intervals. When G ≠ F, then p̂n is not asymptotically uniform and a standard bootstrap test based on p̂n is asymptotically invalid, as shown in Theorem 3.1. In this case, prepivoting transforms an asymptotically invalid bootstrap p-value into one that is asymptotically valid. This setting was not considered by Beran (1987, 1988) and is new to our article.

3.3 Power of Tests

In this section we explicitly consider a testing situation. Suppose we are interested in testing H0: θ = θ̄ against H1: θ < θ̄. Specifically, defining Tn(θ) := g(n)(θ̂n − θ), we consider the test statistic Tn(θ̄). The corresponding bootstrap p-value is p̂n(θ̄), with p̂n(θ) := P*(Tn* ≤ Tn(θ)). When the null hypothesis is true, that is, when θ̄ = θ0 with θ0 denoting the true value, we find Tn(θ̄) = Tn(θ0) = Tn and p̂n(θ̄) = p̂n(θ0) = p̂n, where Tn and p̂n are as defined previously. If Assumptions 1 and 2 hold under the null, Theorem 3.1 and Corollary 3.1 imply that tests based on H(p̂n(θ̄)) have correct asymptotic size, where H continues to denote the asymptotic cdf of p̂n.

To analyze power, we consider θ0 = θ̄ + an for some deterministic sequence an. Then an = 0 under the null hypothesis, whereas an = a < 0 corresponds to a fixed alternative and an = a/g(n) for a < 0 corresponds to a local alternative. Thus, we define πn := g(n)(θ0 − θ̄) = g(n)an, so that Tn(θ̄) = Tn + πn.

Theorem 3.3.

Suppose Assumptions 1 and 2 hold. (i) If πn → π, then H(p̂n(θ̄)) →d F(F⁻¹(U[0,1]) + π). (ii) If πn → −∞, then P(H(p̂n(θ̄)) ≤ α) → 1 for any nominal level α > 0.

Proof.

As in the proof of Theorem 3.1 we have, by Assumption 2(i), p̂n(θ̄) = P*(Tn* ≤ Tn(θ̄)) = P*(Tn* − B̂n ≤ Tn − B̂n + πn) = G(Tn − B̂n + πn) + op(1).

If πn → π, then p̂n(θ̄) →d G(F⁻¹(U[0,1]) + π) by Assumption 2(ii), so that H(p̂n(θ̄)) →d H(G(F⁻¹(U[0,1]) + π)) = F(F⁻¹(U[0,1]) + π) by definition of H(u). If πn → −∞, then p̂n(θ̄) →p 0 because Tn − B̂n = Op(1) by Assumption 2(ii), so that H(p̂n(θ̄)) →p H(0) = 0 and P(H(p̂n(θ̄)) ≤ α) → 1 for any α > 0. □

It follows from Theorem 3.3(ii) that a left-tailed test that rejects for small values of H(p̂n(θ̄)) is consistent. Furthermore, it follows from Theorem 3.3(i) that such a test has nontrivial asymptotic local power against π < 0. Specifically, the asymptotic local power against π is given by P(H(p̂n(θ̄)) ≤ α) → F(F⁻¹(α) − π). Interestingly, this depends only on F and not on G. As above, to implement the modified p-value, H(p̂n(θ̄)), in practice, we would need a (uniformly) consistent estimator of H, that is, of the asymptotic distribution of the bootstrap p-value when the null hypothesis is true. This could be either the plug-in or the double bootstrap estimator, as discussed in Sections 3.2.1 and 3.2.2.

Note that Assumption 2 is still assumed to hold in Theorem 3.3. That is, the bootstrap statistic Tn* is assumed to have the same asymptotic behavior under the null and under the alternative. This is commonly the case when the bootstrap algorithm does not impose the null hypothesis when generating the bootstrap data.

3.4 Bootstrap p-value based on Tn − B̂n

The double bootstrap modified p-value p̃n depends only on the statistic Tn and its bootstrap analogues Tn* and Tn**. It does not involve explicitly computing B̂n or B̂n*, but in some applications it can be computationally costly, as it requires two levels of resampling. As it turns out, p̃n is asymptotically equivalent to a single-level bootstrap p-value based on bootstrapping the statistic Tn − B̂n, as we show next.

By definition, the double bootstrap modified p-value is given by p̃n := P*(p̂n* ≤ p̂n), where p̂n* := P**(Tn** ≤ Tn*) = P**(Tn** − B̂n* ≤ Tn* − B̂n*) = G(Tn* − B̂n*) + op*(1), in probability, given Assumption 3. Similarly, under Assumptions 1 and 2, p̂n := P*(Tn* ≤ Tn) = P*(Tn* − B̂n ≤ Tn − B̂n) = G(Tn − B̂n) + op(1).

It follows that p̃n := P*(p̂n* ≤ p̂n) = P*(G(Tn* − B̂n*) ≤ G(Tn − B̂n)) + op(1) = P*(Tn* − B̂n* ≤ Tn − B̂n) + op(1), because G is continuous and strictly increasing. We summarize this result in the following corollary.

Corollary 3.3.

Under Assumptions 1, 2, and 3, p̃n = P*(Tn* − B̂n* ≤ Tn − B̂n) + op(1).

Theorem 3.2 shows that p̃n →d U[0,1] and hence is asymptotically valid. In view of this, Corollary 3.3 shows that removing B̂n from Tn and computing a bootstrap p-value based on the new statistic, Tn − B̂n, also solves the invalidity problem of the standard bootstrap p-value, p̂n = P*(Tn* ≤ Tn). Note that we do not require ξ2 = 0; that is, B̂n − Bn and B̂n* − B̂n need not converge to zero.

When B̂n and B̂n* are easy to compute, for example, when they are available analytically as functions of Dn and Dn*, respectively, Corollary 3.3 is useful as it avoids implementing a double bootstrap. When this is not the case, that is, when deriving B̂n and B̂n* explicitly is cumbersome or impossible, we may be able to estimate B̂n from the bootstrap and B̂n* from a double bootstrap. Corollary 3.3 then shows that the double bootstrap modified p-value p̃n is a convenient alternative, since it depends only on Tn, Tn*, and Tn**. It is important to note that none of these approaches requires the consistency of B̂n and B̂n*.
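When the bias terms are available analytically, the single-level p-value of Corollary 3.3 is immediate to approximate by Monte Carlo; a minimal sketch (names ours):

```python
import numpy as np

def debiased_pvalue(T_n, B_hat, T_star_draws, B_hat_star_draws):
    # p_tilde_n ~ P*(T_n* - B_hat_n* <= T_n - B_hat_n), per Corollary 3.3
    T_star = np.asarray(T_star_draws)
    B_star = np.asarray(B_hat_star_draws)
    return float(np.mean(T_star - B_star <= T_n - B_hat))
```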

3.5 A More General Set of High-Level Conditions

We conclude this section by providing an alternative set of high-level conditions that cover bootstrap methods for which Tn* − B̂n has a different limiting distribution than Tn − Bn. This may happen, for example, for the pairs bootstrap; see Section 2.1 and Remark 3.6.

Assumption 4.

Assumption 2 holds with part (i) replaced by: (i) Tn* − B̂n →d* ζ1, where ζ1 is centered at zero and the cdf J(u) = P(ζ1 ≤ u) is continuous and strictly increasing over its support.

Under Assumption 4, Tn* − B̂n does not replicate the distribution of Tn − Bn. This is to be understood in the sense that there does not exist a Dn-measurable term B̂n such that Tn* − B̂n has the same asymptotic distribution as Tn − Bn.

An important generalization provided by Assumption 4 compared with Assumption 2 is to allow for bootstrap methods where the "centering term," say Bn*, depends on the bootstrap data; that is, to allow cases where there is a random (with respect to P*, that is, depending on the bootstrap data) term Bn* such that Tn* − Bn* →d* ξ1, and hence has the same asymptotic distribution as Tn − Bn. Clearly, this violates Assumption 2 unless Bn* − B̂n →p* 0 (as in the ridge regression in Section 2.2). However, letting ζ1 be such that Bn* − B̂n →d* ζ1 − ξ1, Assumption 4 covers the former case.

Remark 3.6.

A leading example where Tn* − Bn* →d* ξ1, and hence has the same asymptotic distribution as Tn − Bn, is the pairs bootstrap of Section 2.1 for the model averaging example. We study this case in more detail in Section 4.1.

The asymptotic distribution of the bootstrap p-value under Assumption 4 is given in the following theorem. The proof is identical to that of Theorem 3.1, with G replaced by J, and hence omitted.

Theorem 3.4.

If Assumptions 1 and 4 hold, then p̂n →d J(F⁻¹(U[0,1])).

Theorem 3.4 implies that now P(p̂n ≤ u) → P(J(F⁻¹(U[0,1])) ≤ u) = F(J⁻¹(u)) =: H(u). Clearly, a plug-in approach to estimating this H(u) based on G, as described in Section 3.2.1, would be invalid because G ≠ J in general. However, it follows straightforwardly, by the same arguments as applied in Section 3.2.1, that a plug-in approach based on J will deliver an asymptotically valid plug-in modified p-value.

To implement an asymptotically valid double bootstrap modified p-value we consider the following high-level condition.

Assumption 5.

Assumption 3 holds with part (i) replaced by: (i) Tn** − B̂n* →d* ζ1, in probability, where ζ1 is defined in Assumption 4.

Under Assumption 5, the second-level bootstrap statistic, Tn** − B̂n*, replicates the distribution of the first-level statistic, Tn* − B̂n. Thus, the second-level bootstrap p-value is p̂n* := P**(Tn** ≤ Tn*) = P**(Tn** − B̂n* ≤ Tn* − B̂n*) = J(Tn* − B̂n*) + op*(1) →d* J(ξ1 − ξ2) = J(F⁻¹(U[0,1])) under Assumption 5. Hence, the second-level bootstrap p-value has the same asymptotic distribution as the original bootstrap p-value. It follows that the double bootstrap modified p-value, p̃n := Ĥn(p̂n) = P*(p̂n* ≤ p̂n), is asymptotically valid, which is stated next. The proof is essentially identical to that of Theorem 3.2 and hence omitted.

Theorem 3.5.

Under Assumptions 1, 4, and 5, it holds that p̃n = Ĥn(p̂n) →d U[0,1].

Remark 3.7.

Consider again the case with a random bootstrap centering term from Remark 3.6, where Bn* − B̂n →d* ζ1 − ξ1, such that Tn* − Bn* →d* ξ1. Within this setup, we can consider double bootstrap methods such that, for a random (with respect to P**) term Bn**, we have Tn** − Bn** →d* ξ1, in probability. Thus, the asymptotic distribution of the second-level bootstrap statistic mimics that of the first-level statistic. When Bn** and ζ1 are such that Bn** − B̂n* →d* ζ1 − ξ1, in probability, then Assumption 5 is satisfied. As in Remark 3.6, this setup allows us to cover the pairs bootstrap.

4 Examples Continued

In this section we revisit our three leading examples from Section 2, where we argued that standard bootstrap inference is invalid due to the presence of bias. We now show how to apply our general theory in each example. Again, we refer to Appendix B for detailed derivations.

4.1 Inference after Model Averaging

Fixed regressor bootstrap. Extending the arguments in Section 2.1, we obtain the following result.

Lemma 4.1.

Under regularity conditions stated in Appendix B.1, Assumptions 1 and 2 are satisfied with (ξ1, ξ2)′ ∼ N(0, V), where V := (vij), i, j = 1, 2, is positive definite and continuous in ω, σ², and ΣWW := plim SWW.

By Lemma 4.1, the conditions of Theorem 3.1 hold with G(u) = Φ(u/v11^{1/2}) and F(u) = Φ(u/vd), where vd² := v11 + v22 − 2v12 > 0. Then Theorem 3.1 implies that the standard bootstrap p-value satisfies p̂n →d Φ(mΦ⁻¹(U[0,1])) with m² := vd²/v². Because ω is known and σ² and ΣWW are easily estimated, a consistent estimator m̂n →p m is available, and the plug-in approach in Corollary 3.2 can be implemented by considering the modified p-value p̃n = Φ(m̂n⁻¹Φ⁻¹(p̂n)). Inspection of the proofs shows that our modified bootstrap approach is asymptotically valid whether δ is fixed or local to zero. In the former case, Bn is Op(n^{1/2}) rather than Op(1), implying that Bn diverges in probability and β̃n is not even consistent for β. Despite this, the modified bootstrap p-value is asymptotically valid.

Alternatively, we can implement the double bootstrap as in Section 3.2.2. Specifically, let y** = xβ̂n* + Zδ̂n* + ε**, where ε**|{Dn, Dn*} ∼ N(0, σ̂n*² In), (β̂n*, δ̂n*, σ̂n*²) is the OLS estimator obtained from the full model estimated on the first-level bootstrap data, and Dn* = {y*, W}. The double bootstrap statistic is Tn** := n^{1/2}(β̃n** − β̂n*), where β̃n** := Σ_{m=1}^{M} ωm β̃**m,n with β̃**m,n := Sxx.Zm⁻¹ Sxy**.Zm defined as the double bootstrap OLS estimator from the mth model. The double bootstrap modified p-value is then p̃n = P*(p̂n* ≤ p̂n) with p̂n* = P**(Tn** ≤ Tn*).

Lemma 4.2.

Under the conditions of Lemma 4.1, Assumption 3 holds with B̂n* := Qn n^{1/2} δ̂n*.

Lemma 4.2 shows that Assumption 3 is verified in this example. The asymptotic validity of the double bootstrap modified p-value now follows from Lemmas 4.1 and 4.2 and Theorem 3.2.

Pairs bootstrap. For the pairs bootstrap we verify the high-level conditions of Section 3.5. To simplify the discussion, we consider the case with scalar zt in (2.1) and where we "average" over only one model (M = 1), namely the simplest model, in which zt is omitted from the regression. That is, we estimate β by the regression of y on x, so that β̃n := Sxx⁻¹Sxy. In this special case, Tn − Bn →d N(0, v²) with v² := σ²Σxx⁻¹ and Bn := Sxx⁻¹Sxz n^{1/2}δ.

Lemma 4.3.

Under regularity conditions stated in Appendix B.1, it holds that Tn* − B̂n →d* N(0, v² + κ²), where B̂n := Sxx⁻¹Sxz n^{1/2}δ̂n and κ² := dr(δ)′Σr dr(δ) with dr(δ) := δ(Σxx⁻¹, −Σxx⁻²Σxz)′.

Notice that, in contrast to the FRB, the asymptotic variance of Tn* fails to replicate that of Tn because of the term κ² > 0. This implies that the methodology developed in Theorem 3.1 and its corollaries no longer applies. Instead, we can apply the theory of Section 3.5. In particular, Lemma 4.3 shows that Assumption 4(i) holds in this case with ζ1 ∼ N(0, v² + κ²). Lemma 4.3 also shows that B̂n is the same for the pairs bootstrap and the FRB, such that Lemma 4.1 implies that Assumptions 1 and 2(ii) are verified. Hence, Theorem 3.4 holds for this example. Using similar arguments, it can be shown that Assumption 5 also holds for this example, which implies that the double bootstrap p-values are asymptotically uniformly distributed.

Under local alternatives of the form β0 = β̄ + an^{−1/2}, where β̄ is the value under the null (Section 3.3), the asymptotic local power function for the modified p-value is given by Φ(Φ⁻¹(α) − a/vd); see Theorem 3.3. It is not difficult to verify that this is the same power function as that obtained from a test based directly on β̂n from the full model (2.1).

4.2 Ridge Regression

To complete the example in Section 2.2, we can proceed as in the previous example.

Lemma 4.4.

Under the null hypothesis and the regularity conditions stated in Appendix B.2, Assumptions 1 and 2 are satisfied with (ξ1, ξ2)′ ∼ N(0, V), where V := (vij), i, j = 1, 2, is positive definite and continuous in c0, σ², and Σxx.

As in Section 4.1, Lemma 4.4 and Theorem 3.1 imply that the standard bootstrap p-value satisfies p̂n →d Φ(mΦ⁻¹(U[0,1])), where we now have m² = (g′Σ̃xx⁻¹ΣxxΣ̃xx⁻¹g)⁻¹ g′Σxx⁻¹g. Note that this result holds irrespective of θ being fixed or local to zero. Thus, the bootstrap is invalid unless c0 = 0, which implies m = 1. For the plug-in method, a simple consistent estimator of m is given by m̂n² := (g′S̃xx⁻¹SxxS̃xx⁻¹g)⁻¹ g′Sxx⁻¹g, and inference based on the plug-in modified p-value p̃n = Φ(m̂n⁻¹Φ⁻¹(p̂n)) is then asymptotically valid by Corollary 3.2.
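A sketch of the plug-in estimate m̂n for this example (function name ours; g is a p-vector). The modified p-value then follows by the plugin_pvalue transform sketched in Section 3.2.1.

```python
import numpy as np

def ridge_m_hat(X, g, c_n):
    # m_hat^2 = (g' S~xx^{-1} S_xx S~xx^{-1} g)^{-1} * (g' S_xx^{-1} g)
    n, p = X.shape
    Sxx = X.T @ X / n
    Sxx_tilde = Sxx + (c_n / n) * np.eye(p)
    a = np.linalg.solve(Sxx_tilde, g)       # S~xx^{-1} g
    num = g @ np.linalg.solve(Sxx, g)       # g' S_xx^{-1} g
    return float(np.sqrt(num / (a @ Sxx @ a)))
```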

To implement the double bootstrap method, we can draw the double bootstrap sample {yt**, xt**; t = 1,…,n} as iid from {yt*, xt*; t = 1,…,n}. Accordingly, the second-level bootstrap ridge estimator is θ̃n** := S̃x**x**⁻¹ Sx**y** with associated test statistic Tn** := n^{1/2} g′(θ̃n** − θ̂n*), which is centered at the first-level bootstrap OLS estimator, θ̂n*. It is straightforward to show that, without additional conditions, Assumption 3 holds.

Lemma 4.5.

Under the conditions of Lemma 4.4, Assumption 3 holds with B̂n* := −cn n^{−1/2} g′S̃x*x*⁻¹θ̂n*.

Validity of the double bootstrap modified p-value p̃n = P*(p̂n* ≤ p̂n) now follows by application of Theorem 3.2.

4.3 Nonparametric Regression

Again, we complete the example in Section 2.3 by proceeding as in the previous examples.

Lemma 4.6.

Under regularity conditions stated in Appendix B.3, Assumptions 1 and 2 are satisfied with (ξ1, ξ2)′ ∼ N(0, V), where V := (vij), i, j = 1, 2, is positive definite and continuous in σ² and the kernel function.

As before, Lemma 4.6 and Theorem 3.1 imply that the standard bootstrap p-value satisfies p̂n →d Φ(mΦ⁻¹(U[0,1])), where we now have
m² := 4 + (∫K(u)²du)⁻¹ (∫(∫K(s−u)K(s)ds)² du − 4∫∫K(u)K(u−s)K(s) ds du).
Thus, in this example, m need not be estimated because it is known once K is chosen. Therefore, valid inference is feasible with the modified p-value p̃n = H(p̂n) = Φ(m⁻¹Φ⁻¹(p̂n)); see Corollary 3.1.
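Since m is a known functional of K, it can be computed once by numerical integration. A sketch assuming SciPy's quad and dblquad; the function name and the truncation limit lim (suitable for kernels with fast-decaying tails) are our choices:

```python
import numpy as np
from scipy.integrate import quad, dblquad

def m_for_kernel(K, lim=8.0):
    # m^2 = 4 + (int K^2)^{-1} [ int (int K(s-u)K(s) ds)^2 du
    #                            - 4 int int K(u)K(u-s)K(s) ds du ]
    K2 = quad(lambda u: K(u) ** 2, -lim, lim)[0]
    conv = lambda u: quad(lambda s: K(s - u) * K(s), -lim, lim)[0]
    A = quad(lambda u: conv(u) ** 2, -lim, lim)[0]
    B = dblquad(lambda s, u: K(u) * K(u - s) * K(s),
                -lim, lim, lambda u: -lim, lambda u: lim)[0]
    return float(np.sqrt(4.0 + (A - 4.0 * B) / K2))

gauss = lambda u: np.exp(-u ** 2 / 2) / np.sqrt(2 * np.pi)
# m_for_kernel(gauss) evaluates to roughly 1.2 for the Gaussian kernel
```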

We can also apply a double bootstrap modification. Let yt** = β̂h*(xt) + εt**, t = 1,…,n, where εt**|{Dn, Dn*} ∼ iid N(0, σ̂n*²), with Dn* := {yt*; t = 1,…,n} and σ̂n*² denoting the residual variance from the first-level bootstrap data. The double bootstrap analogue of Tn is Tn** := (nh)^{1/2}(β̂h**(x) − β̂h*(x)), where β̂h**(x) := (nh)⁻¹ Σ_{t=1}^{n} kt yt**. This can be decomposed as Tn** = ξ1,n** + B̂n*, where B̂n* := (nh)^{1/2}((nh)⁻¹ Σ_{t=1}^{n} kt β̂h*(xt) − β̂h*(x)). Unfortunately, although ξ1,n** satisfies Assumption 3(i), B̂n* does not satisfy Assumption 3(ii). The reason is that B̂n* − B̂n = ξ2,n* + B̂2,n − B̂n, where ξ2,n* satisfies Assumption 3(ii), but B̂2,n := (nh)⁻¹ Σ_{t=1}^{n} kt B̂n(xt) is a smoothed version of B̂n (evaluated at xt), and although B̂2,n − B̂n is mean zero, it is not op(1). However, B̂2,n − B̂n is observed, so this is easily corrected by defining T̄n** := Tn** − (B̂2,n − B̂n). Then we have the following result.

Lemma 4.7.

Under the conditions of Lemma 4.6, Assumption 3 holds with Tn** and B̂n* replaced by T̄n** and B̄n* := B̂n* − (B̂2,n − B̂n), respectively.

The validity of the double bootstrap modified p-value p̃n := P*(p̂n* ≤ p̂n), where p̂n* := P**(T̄n** ≤ Tn*), follows from Lemma 4.7 and Theorem 3.2. This in turn implies that confidence intervals based on the double bootstrap are asymptotically valid; see also Remark 3.4. We note that Hall and Horowitz (2013) also proposed, without theory, a version of their calibration method based on the double bootstrap. Our double bootstrap-based method for confidence intervals corresponds to their steps 1–5; where we need a correction, they instead have a step 6 in which they average over a grid of x.

Finally, under local alternatives of the form β0(x) = β̄ + an^{−2/5}, where β̄ is the value under the null (Section 3.3), the asymptotic local power function for the modified p-value is given by Φ(Φ⁻¹(α) − a/vd); see Theorem 3.3. Alternatively, we could consider a "bias-free" test based on undersmoothing; that is, using a bandwidth h satisfying nh⁵ → 0 such that Bn → 0, so that inference can be based on quantiles of ξ1 ∼ N(0, v11). In contrast to our procedure, however, such a test has only trivial power against β̄ + an^{−2/5}, because (nh)^{1/2} a n^{−2/5} → 0.

5 Concluding Remarks

In this article, we have shown that in statistical problems involving bias terms that cannot be estimated, the bootstrap can be modified to provide asymptotically valid inference. Intuitively, the main idea is the following: in some important cases, the bootstrap can be used to “debias” a statistic whose bias is non-negligible, but when doing so additional “noise” is injected. This additional noise does not vanish because the bias cannot be consistently estimated, but it can be handled either by a “plug-in” method or by an additional (i.e., double) bootstrap layer. Specifically, our solution is simple and involves (i) focusing on the bootstrap p-value; (ii) estimating its asymptotic distribution; (iii) mapping the original (invalid) p-value into a new (valid) p-value using the prepivoting approach. These steps are easy to implement in practice and we provide sufficient conditions for asymptotic validity of the associated tests and confidence intervals.

Our results can be generalized in several directions. For instance, there is a growing literature where inference on a parameter of interest is combined with some auxiliary information in the form of a bound on the bias of the estimator in question. Such bounds appear, for example, in Oster (2019) and Li and Müller (2021). It is of interest to investigate how our analysis can be extended to incorporate such bounds. Other possible extensions include non-ergodic problems, large-dimensional models, and multivariate estimators or statistics. All these extensions are left for future research.

Supplementary Materials

The supplemental material contains two appendices. Appendix A describes in detail the conditions and results of the article under the special case of asymptotically Gaussian statistics. Appendix B contains details and proofs for the three examples in the article, as well as two additional examples. Additional references are included at the end of the supplement.


Acknowledgments

We thank Federico Bandi, Matias Cattaneo, Christian Gourieroux, Philip Heiler, Michael Jansson, Anders Kock, Damian Kozbur, Marcelo Moreira, David Preinerstorfer, Mikkel Sølvsten, Luke Taylor, Michael Wolf, and participants at the AiE Conference in Honor of Joon Y. Park, 2022 Conference on Econometrics and Business Analytics (CEBA), 2023 Conference on Robust Econometric Methods in Financial Econometrics, 2022 EC2 conference, 2nd “High Voltage Econometrics” workshop, 2023 IAAE Conference, 3rd Italian Congress of Econometrics and Empirical Economics, 3rd Italian Meeting on Probability and Mathematical Statistics, 19th School of Time Series and Econometrics, Brazilian Statistical Association, 2023 Société Canadienne de Sciences Économiques, 2022 Virtual Time Series Seminars, as well as seminar participants at Aarhus University, CREST, FGV - Rio, FGV - São Paulo, Ludwig Maximilian University of Munich, Queen Mary University, Singapore Management University, UFRGS, University of the Balearic Islands, University of Oxford, University of Pittsburgh, University of Victoria, York University, for useful comments and feedback.

Disclosure Statement

The authors report there are no competing interests to declare.

Additional information

Funding

Cavaliere thanks the Italian Ministry of University and Research (PRIN 2017 Grant 2017TA7TYC) for financial support. Gonçalves thanks the Natural Sciences and Engineering Research Council of Canada for financial support (NSERC grant number RGPIN-2021-02663). Nielsen thanks the Danish National Research Foundation for financial support (DNRF Chair grant number DNRF154).

Notes

1. Note that we write Tn* − B̂n →d* ξ1 to mean that Tn* − B̂n has (conditionally on Dn) the same asymptotic distribution function as the random variable ξ1. We could alternatively write that Tn* − B̂n →d* ξ1* and Tn − Bn →d ξ1, where ξ1* and ξ1 are two independent copies of the same distribution, that is, P(ξ1* ≤ u) = P(ξ1 ≤ u). We do not make this distinction because we care only about distributional results, but it should be kept in mind.

2. The same result follows in terms of weak convergence in distribution of Tn*|Dn. Specifically, because Tn* = (Tn* − B̂n) + (B̂n − Bn) + Bn, where Tn* − B̂n →d* ξ1* and (jointly) B̂n − Bn →d ξ2, with ξ1* ∼ ξ1 independent of ξ2, we have that Tn*|Dn →w (B + ξ1* + ξ2)|ξ2.

References

  • Beran, R. (1987), “Prepivoting to Reduce Level Error in Confidence Sets,” Biometrika, 74, 457–468. DOI: 10.1093/biomet/74.3.457.
  • ——- (1988), “Prepivoting Test Statistics: A Bootstrap View of Asymptotic Refinements,” Journal of the American Statistical Association, 83, 687–697.
  • Calonico, S., Cattaneo, M. D., and Farrell, M. H. (2018), “On the Effect of Bias Estimation on Coverage Accuracy in Nonparametric Inference,” Journal of the American Statistical Association, 113, 767–779. DOI: 10.1080/01621459.2017.1285776.
  • Calonico, S., Cattaneo, M. D., and Titiunik, R. (2014), “Robust Nonparametric Confidence Intervals for Regression-Discontinuity Designs,” Econometrica, 82, 2295–2326. DOI: 10.3982/ECTA11757.
  • Cattaneo, M. D., and Jansson, M. (2018), “Kernel-based Semiparametric Estimators: Small Bandwidth Asymptotics and Bootstrap Consistency,” Econometrica, 86, 955–995. DOI: 10.3982/ECTA12701.
  • ——- (2022), “Average Density Estimators: Efficiency and Bootstrap Consistency,” Econometric Theory, 38, 1140–1174.
  • Cattaneo, M. D., Jansson, M., and Ma, X. (2019), “Two-Step Estimation and Inference with Possibly Many Included Covariates,” Review of Economic Studies, 86, 1095–1122. DOI: 10.1093/restud/rdy053.
  • Cavaliere, G., and Georgiev, I. (2020), “Inference Under Random Limit Bootstrap Measures,” Econometrica, 88, 2547–2574. DOI: 10.3982/ECTA16557.
  • Chatterjee, A., and Lahiri, S. N. (2010), “Asymptotic Properties of the Residual Bootstrap for Lasso Estimators,” Proceedings of the American Mathematical Society, 138, 4497–4509. DOI: 10.1090/S0002-9939-2010-10474-4.
  • Chatterjee, A., and Lahiri, S. N. (2011), “Bootstrapping Lasso Estimators,” Journal of the American Statistical Association, 106, 608–625. DOI: 10.1198/jasa.2011.tm10159.
  • Efron, B. (1983), “Estimating the Error Rate of a Prediction Rule: Improvement on Cross-Validation,” Journal of the American Statistical Association, 78, 316–331. DOI: 10.1080/01621459.1983.10477973.
  • Eubank, R. L., and Speckman, P. L. (1993), “Confidence Bands in Nonparametric Regression,” Journal of the American Statistical Association, 88, 1287–1301. DOI: 10.1080/01621459.1993.10476410.
  • Fu, W., and Knight, K. (2000), “Asymptotics for Lasso-Type Estimators,” Annals of Statistics, 28, 1356–1378.
  • Hall, P. (1986), “On the Bootstrap and Confidence Intervals,” Annals of Statistics, 14, 1431–1452.
  • ——- (1992), The Bootstrap and Edgeworth Expansion, Berlin: Springer-Verlag.
  • Hall, P., and Horowitz, J. (2013), “A Simple Bootstrap Method for Constructing Nonparametric Confidence Bands for Functions,” Annals of Statistics, 41, 1892–1921.
  • Hansen, B. E. (2007), “Least Squares Model Averaging,” Econometrica, 75, 1175–1189. DOI: 10.1111/j.1468-0262.2007.00785.x.
  • Härdle, W., and Bowman, A. W. (1988), “Bootstrapping in Nonparametric Regression: Local Adaptive Smoothing and Confidence Bands,” Journal of the American Statistical Association, 83, 102–110. DOI: 10.1080/01621459.1988.10478572.
  • Härdle, W., and Marron, J. S. (1991), “Bootstrap Simultaneous Error Bars for Nonparametric Regression,” Annals of Statistics, 19, 778–796.
  • Hjort, N., and Claeskens, G. (2003), “Frequentist Model Average Estimators,” Journal of the American Statistical Association, 98, 879–899. DOI: 10.1198/016214503000000828.
  • Horowitz, J. L. (2001), “The Bootstrap,” in Handbook of Econometrics (Vol. 5), eds. J. J. Heckman, and E. Leamer, pp. 3159–3228, Elsevier: Amsterdam.
  • Hounyo, U., and Lahiri, K. (2023), “Estimating the Variance of a Combined Forecast: Bootstrap-based Approach,” Journal of Econometrics, 232, 445–468. DOI: 10.1016/j.jeconom.2021.09.011.
  • Li, C., and Müller, U. (2021), “Linear Regression with Many Controls of Limited Explanatory Power,” Quantitative Economics, 12, 405–442. DOI: 10.3982/QE1577.
  • Liu, C.-A. (2015), “Distribution Theory of the Least Squares Averaging Estimator,” Journal of Econometrics, 186, 142–159. DOI: 10.1016/j.jeconom.2014.07.002.
  • Oster, E. (2019), “Unobservable Selection and Coefficient Stability: Theory and Evidence,” Journal of Business & Economic Statistics, 37, 187–204. DOI: 10.1080/07350015.2016.1227711.
  • Shao, X., and Politis, D. N. (2013), “Fixed b Subsampling and the Block Bootstrap: Improved Confidence Sets based on p-value Calibration,” Journal of the Royal Statistical Society, Series B, 75, 161–184. DOI: 10.1111/j.1467-9868.2012.01037.x.