
A resampling approach to estimation of the linking variance in the Fay–Herriot model

Pages 170-177 | Received 27 Dec 2018, Accepted 30 Sep 2019, Published online: 14 Oct 2019

ABSTRACT

In the Fay–Herriot model, we consider estimators of the linking variance obtained using different types of resampling schemes. The usefulness of this approach is that even when the estimator from the original data falls below zero or any other specified threshold, several of the resamples can potentially yield values above the threshold. We establish asymptotic consistency of the resampling-based estimator of the linking variance for a wide variety of resampling schemes and show the efficacy of using the proposed approach in numeric examples.

1. Introduction

A typical survey on any aspect of human behaviour or natural phenomena is limited in scope by feasibility, ethics and cost constraints. Consequently, data are not always obtained at as fine a scale as stakeholders require. In such cases, small area models are often useful for pooling information across various domains and variables to obtain estimators and inferences with high precision. Detailed descriptions of small area methods, principles and procedures may be found in the book (Rao & Molina, 2015), and in the papers (Chatterjee, Lahiri, & Li, 2008; Das, Jiang, & Rao, 2004; Datta, Hall, & Mandal, 2011; Datta, Rao, & Smith, 2005; Jiang & Lahiri, 2006; Jiang, Lahiri, & Wan, 2002; Li & Lahiri, 2010; Molina, Rao, & Datta, 2015; Pfeffermann, 2013; Pfeffermann & Glickman, 2004; Rao, 2015; Yoshimori & Lahiri, 2014a, 2014b).

In small area studies, arguably the most popular choice of a model for the observed data is the Fay–Herriot model (Fay & Herriot, 1979). This model is described by the following two-layered framework:

  1. Sampling model: Conditional on the unknown and unobserved area-level effects $\theta = (\theta_1, \ldots, \theta_n)^T$, the sampled and observed data $Y_n = (Y_1, \ldots, Y_n)^T$ follows an $n$-variate Normal distribution with mean $\theta$ and covariance matrix $D$ with known diagonal entries $D_i > 0$ and off-diagonal entries $0$. This layer models the sampling-level variability and distribution in the observations, conditional on the inherent characteristics $\theta$ of the different small areas.

  2. Linking model: The unobserved area-level effects $\theta$ follow an $n$-variate Normal distribution with mean $X\beta$ for a known and non-random $n \times p$ matrix $X$ and unknown but fixed vector $\beta \in \mathbb{R}^p$. The covariance matrix is $\psi I_n$, where $I_n$ is the $n$-dimensional identity matrix and $\psi > 0$ is an unknown positive-valued constant. This second layer of the Fay–Herriot model links the various small areas together by requiring that all the small areas share a common set of regression parameters $\beta \in \mathbb{R}^p$ and a common variance component. We assume henceforth that $\psi > \epsilon$ for some small known number $\epsilon$.

Suppose $E \sim N_n(0, D)$ and $U \sim N_n(0, \psi I_n)$ are mutually independent $n$-dimensional Gaussian random variables, respectively denoting the errors and the random effects. Then we may express the Fay–Herriot model as
\[ Y_n = X\beta + U + E \sim N_n(X\beta, D + \psi I_n). \tag{1} \]
In the above Fay–Herriot model, the unknown parameters are the regression parameters $\beta \in \mathbb{R}^p$ and the common linking variance component $\psi > 0$. The distribution of the area-level effects $\theta$, conditional on the data $Y_n$, is the primary quantity of interest in the Fay–Herriot model, and this distribution depends on $\beta$ and $\psi$. One traditional approach is to estimate these parameters from data, and then use the estimators in subsequent prediction of $\theta$, conditional on the data $Y_n$.
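For concreteness, the two-layer model (1) can be simulated in a few lines. The following is a minimal NumPy sketch, not code from the paper; the function name and interface are ours, and later snippets in this article build on this setup.

```python
import numpy as np

rng = np.random.default_rng(2019)

def simulate_fay_herriot(X, beta, D, psi, rng):
    """Draw (Y_n, theta) once from model (1):
    theta ~ N_n(X beta, psi I_n) and Y_n | theta ~ N_n(theta, D)."""
    n = X.shape[0]
    U = rng.normal(0.0, np.sqrt(psi), size=n)  # random effects, U ~ N_n(0, psi I_n)
    E = rng.normal(0.0, np.sqrt(D), size=n)    # sampling errors, E ~ N_n(0, D)
    theta = X @ beta + U                       # unobserved area-level effects
    return theta + E, theta                    # observed Y_n and true theta
```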

To assess the quality of prediction in a small area context, we may compute the mean squared prediction error or obtain a prediction interval. The typically Monte Carlo computation-driven technique of resampling is a popular method for obtaining such intervals, error bounds and a variety of other quantities; see, for example, Chatterjee et al. (2008), Datta et al. (2011), Kubokawa and Nagashima (2012). While most of these papers use either the parametric bootstrap or Efron's original bootstrap (Efron, 1979), other approaches to resampling are also viable, as discussed later in this paper.

Among the intermediate products in all such resampling computations are multiple copies of the estimators of $\beta$ and $\psi$ obtained from the resamples. The objective of this paper is to assess the quality of such resampling-driven estimators of the original parameters of the data; we concentrate only on the linking variance parameter $\psi$ here, since it is the more critical in subsequent analysis. The immediate purpose of this study is to explore the possibility of using such resampling-based estimators of $\psi$ in case the estimator from the original data takes too low a value.

We present new estimators of $\psi$ based on resampled data in the Fay–Herriot model in Section 2, and related theoretical results in Section 3. Simulation studies are presented in Section 4, and concluding remarks in Section 5.

In the above description of the Fay–Herriot model and in the rest of this paper, we consider all vectors to be column vectors, and for any vector or matrix $A$, the notation $A^T$ denotes its transpose. For any vector $a$, the notation $|a|$ stands for its Euclidean norm. The notation $a_i$ will denote the $i$th element of vector $a$; similarly $A_{ij}$ will denote the $(i,j)$th element of matrix $A$. The notation $\mathrm{tr}(A)$ denotes the trace of a matrix $A$, $\mathbb{I}_A$ is the indicator function of a measurable set $A$, and $P, E, V$ are symbols denoting probability, expectation and variance generically. Other notations will be introduced and described as they arise.

We assume that $p$ is fixed, $p < n$ holds, and $X$ has full column rank. We use the notation $x_i$ for the $i$th row of the matrix $X$, and $P_x$ for the projection matrix onto the columns of $X$. In fact, all our results are valid even when $p$ increases with $n$ as long as $p^2/n \to 0$, but the proofs require some additional technicalities. We assume that $n^{-1}X^TX$ tends towards a positive definite matrix. We also assume that $\max_{1 \le i \le n} P_{x,ii} = O(p/n)$, that is, the maximum of the diagonal entries of the projection matrix $P_x$ is of the order of $n^{-1}p$. These conditions are standard. We use the notation $D_x$ for the diagonal matrix whose $i$th diagonal element is $P_{x,ii}$, i.e., $D_x$ and $P_x$ share the same diagonal elements but $D_x$ has all off-diagonal elements equal to zero. Additionally, we assume that the sampling variances $D_i$ are bounded above, and away from zero.
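Since $P_x$ and $D_x$ appear in every estimator below, a small helper computing them is convenient. This is an illustrative sketch continuing the NumPy setup above; the function name is ours.

```python
def projection_quantities(X):
    """Return P_x, the projection matrix onto the columns of X, and D_x,
    the diagonal matrix holding the diagonal entries (leverages) of P_x."""
    # P_x = X (X^T X)^{-1} X^T; np.linalg.solve avoids an explicit inverse
    Px = X @ np.linalg.solve(X.T @ X, X.T)
    Dx = np.diag(np.diag(Px))
    return Px, Dx
```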

2. The resampling-based framework

In the Fay–Herriot model, we implement resampling by associating a non-negative random weight, generically referred to as a resampling weight $W_i$, with the $i$th observed data point $(Y_i, x_i, D_i)$, for $i = 1, \ldots, n$. We assume that these weights are exchangeable and independent of the data, with $EW_1 = \mu > 0$ and $VW_1 = \gamma^2\mu^2 > 0$, where $\gamma > 0$. Henceforth we use the notation $E_r$ to denote the resampling expectation, that is, expectation with respect to the distribution of $W$ conditional on the observed data.

Define $W^*_i = \gamma^{-1}\mu^{-1}(W_i - \mu)$, the centred and scaled version of these resampling weights. In a minor abuse of notation, we also define $W$ as a diagonal matrix with $i$th diagonal element $W_i$, and similarly $W^*$ as a diagonal matrix whose $i$th diagonal element is $W^*_i$. Thus we may write $W = \mu I_n + \gamma\mu W^*$.
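As a one-line sketch (helper name ours), the centring and scaling is elementwise:

```python
def centred_scaled(W, mu, gamma):
    """Elementwise W*_i = (W_i - mu) / (gamma mu), so that, viewed as
    diagonal matrices, W = mu I_n + gamma mu W*."""
    return (W - mu) / (gamma * mu)
```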

Let $\delta > 0$ be a pre-specified constant, and let $\mathcal{A}_n$ be the set on which the smallest eigenvalue of $X^TWX$ is higher than $\delta$. Under standard regularity conditions the probability of the set $\mathcal{A}_n^C$ is exponentially small; see, for example, Chatterjee and Bose (2000). Nevertheless, it is useful to consider the cases $\mathcal{A}_n$ and $\mathcal{A}_n^C$ separately, since differences arise occasionally with specific choices of $W$ in numeric computations with finite samples.

Using the resampling weights along with the observed data, the estimator for the regression slope parameters is
\[ \hat{\beta}_{nr} = \begin{cases} (X^TWX)^{-1}X^TWY_n & \text{on the set } \mathcal{A}_n, \\ \hat{\beta}_n & \text{otherwise.} \end{cases} \]
In the above, the second subscript denotes that this estimator is obtained in the resampling scenario. In the following we describe the case of the set $\mathcal{A}_n$ only, for brevity; the asymptotically negligible case $\mathcal{A}_n^C$ can be addressed using routine algebra. The above leads to the resampling-based fitted values $\hat{Y}_{nr} = X\hat{\beta}_{nr}$, and the resulting residuals
\[ R_{nr} = Y_n - \hat{Y}_{nr} = \bigl(I_n - X(X^TWX)^{-1}X^TW\bigr)Y_n = \bigl(I_n - X(X^TWX)^{-1}X^TW\bigr)(E + U). \]
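A sketch of this weighted fit follows, with the weight vector standing in for the diagonal matrix $W$; the eigenvalue check and the ordinary least squares fallback mirror the definition above, and the default threshold value is purely illustrative.

```python
def resampled_fit(Y, X, W, delta=1e-8):
    """Compute beta_hat_nr = (X^T W X)^{-1} X^T W Y_n on the set A_n (the
    smallest eigenvalue of X^T W X exceeds delta), and the OLS estimate
    beta_hat_n otherwise; return it with the resampled residuals R_nr."""
    XtWX = X.T @ (W[:, None] * X)
    if np.linalg.eigvalsh(XtWX).min() > delta:  # on the set A_n
        beta = np.linalg.solve(XtWX, X.T @ (W * Y))
    else:                                       # on A_n^C, fall back to OLS
        beta = np.linalg.solve(X.T @ X, X.T @ Y)
    return beta, Y - X @ beta                   # (beta_hat_nr, R_nr)
```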

In the rest of the paper, we study two different resampling-driven estimators of $\psi$. Let
\[ T_{nr} = \frac{E_r|R_{nr}|^2 - \mathrm{tr}\,(I_n + \gamma^2D_x)(I_n - P_x)D(I_n - P_x)}{\mathrm{tr}\,(I_n + \gamma^2D_x)(I_n - P_x)}. \]

The resampling-based Prasad–Rao estimator for $\psi$ is defined as
\[ \hat{\psi}_{nr} = T_{nr}\,\mathbb{I}\{T_{nr} > \epsilon\}. \tag{2} \]
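In code, $E_r$ can be approximated by a Monte Carlo average over draws of the resampling weights. The following sketch of (2) makes that approximation and reuses the helpers introduced earlier; names and defaults are ours.

```python
def psi_hat_nr(Y, X, D, weight_draws, gamma, eps=1e-4):
    """Resampling-based Prasad-Rao estimator (2): T_nr I{T_nr > eps}, with
    E_r |R_nr|^2 approximated by averaging over the rows of weight_draws."""
    n = X.shape[0]
    Px, Dx = projection_quantities(X)
    ImPx = np.eye(n) - Px
    A = (np.eye(n) + gamma**2 * Dx) @ ImPx
    centre = np.trace(A @ np.diag(D) @ ImPx)  # tr (I + g^2 D_x)(I - P_x) D (I - P_x)
    scale = np.trace(A)                       # tr (I + g^2 D_x)(I - P_x)
    ssr = [np.sum(resampled_fit(Y, X, W)[1] ** 2) for W in weight_draws]
    T_nr = (np.mean(ssr) - centre) / scale
    return T_nr if T_nr > eps else 0.0
```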

The above estimator is based on Prasad and Rao (1990), where we fit an ordinary least squares regression of the original data $Y_n$ on the covariates $X$, thus obtaining $\hat{\beta}_n = (X^TX)^{-1}X^TY_n$. Based on this estimator of $\beta$, define the vector of residuals $R = (I_n - P_x)Y_n$, and write $R_i$ for the $i$th element of $R$. The PR estimator for $\psi$ is given by
\[ \hat{\psi}_n = (n - p)^{-1}\Bigl\{|R|^2 - \sum_{i=1}^n D_i + \sum_{i=1}^n (P_xD)_{ii}\Bigr\}. \tag{3} \]
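A sketch of the original-data estimator (3) in the same vein (function name ours):

```python
def psi_hat_pr(Y, X, D):
    """Prasad-Rao estimator (3) of psi from the original data."""
    n, p = X.shape
    Px, _ = projection_quantities(X)
    R = (np.eye(n) - Px) @ Y                  # OLS residuals (I - P_x) Y_n
    return (np.sum(R**2) - np.sum(D) + np.sum(np.diag(Px) * D)) / (n - p)
```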

The second resampling-based estimator of $\psi$ that we study is a variant of the above. Fix a small $\epsilon_0 > 0$ that is smaller than the lower bound on $\psi$. Define the set
\[ \mathcal{A}_{nr} = \bigl\{ n^{-1}|R_{nr}|^2 \ge n^{-1}\,\mathrm{tr}\,(I_n + \gamma^2D_x)(I_n - P_x)D(I_n - P_x) + \epsilon_0/2 \bigr\}. \]
The alternative resampling-based estimator that we propose is
\[ \tilde{\psi}_{nr} = \frac{E_r|R_{nr}|^2\mathbb{I}_{\mathcal{A}_{nr}} - \mathrm{tr}\,(I_n + \gamma^2D_x)(I_n - P_x)D(I_n - P_x)}{\mathrm{tr}\,(I_n + \gamma^2D_x)(I_n - P_x)}. \]
Note that on the set $\mathcal{A}_{nr}$, $|R_{nr}|^2$ is greater than $\mathrm{tr}\,(I_n + \gamma^2D_x)(I_n - P_x)D(I_n - P_x)$; consequently $\tilde{\psi}_{nr} > 0$ almost surely. Essentially, we discard the cases where the $r$th resample has $n^{-1}|R_{nr}|^2 \le n^{-1}\,\mathrm{tr}\,(I_n + \gamma^2D_x)(I_n - P_x)D(I_n - P_x) + \epsilon_0/2$ and take an average over the rest of the cases.
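A sketch of $\tilde{\psi}_{nr}$ follows, interpreting the truncated resampling expectation as an average over the retained resamples only, per the description above; the guard for an empty retained set is an illustrative choice of ours, not from the paper.

```python
def psi_tilde_nr(Y, X, D, weight_draws, gamma, eps0=1e-2):
    """Truncated variant: average |R_nr|^2 over the resamples in A_nr only,
    i.e. those clearing the threshold nu + eps0/2, then centre and scale."""
    n = X.shape[0]
    Px, Dx = projection_quantities(X)
    ImPx = np.eye(n) - Px
    A = (np.eye(n) + gamma**2 * Dx) @ ImPx
    centre = np.trace(A @ np.diag(D) @ ImPx)
    scale = np.trace(A)
    ssr = np.array([np.sum(resampled_fit(Y, X, W)[1] ** 2) for W in weight_draws])
    keep = ssr / n >= centre / n + eps0 / 2   # the sets A_nr, one per resample
    if not keep.any():                         # illustrative guard only
        return 0.0
    return (np.mean(ssr[keep]) - centre) / scale
```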

This PR estimator is positive almost surely when the intercept is the only covariate, but may return a negative value under some circumstances. In fact, an often-encountered challenge in small area studies is to obtain an estimate of $\psi$ that is positive, since standard estimation techniques like the Prasad–Rao method discussed above may return a negative value as an estimate of a variance. To address this issue, several techniques have been proposed; see, for example, Li and Lahiri (2010), Yoshimori and Lahiri (2014a), Hirose and Lahiri (2017), Chatterjee (2018).

Our proposed resampling-based estimators achieve such positive values in many cases even when $\hat{\psi}_n < \epsilon$. We present some numeric examples in Section 4.

Multiple traditional resampling schemes may be realised as special cases of the above framework. Consider a probability experiment where we have $n$ boxes and $n$ balls, and we allocate balls to boxes randomly and independently, with each ball having the equal probability $1/n$ of being allocated to any of the $n$ boxes. The $n$-dimensional random vector obtained by counting the number of balls in the $i$th box, for $i = 1, \ldots, n$, may be considered as an example of $(W_1, \ldots, W_n)$, with $\mu = 1$ and $\gamma = (1 - n^{-1})^{1/2}$. This example is mathematically identical to the famous paired bootstrap of Efron.

If we modify the above probability experiment slightly and consider the case where we have only $m$ balls, the above framework yields $(W_1, \ldots, W_n)$ corresponding to the $m$-out-of-$n$ bootstrap (referred to as the moon-bootstrap in the rest of this paper). If we consider the case where the $W_i$ are independent and identically distributed exponential random variables, we obtain variations of the Bayesian bootstrap of Rubin. Other interesting special cases and additional discussion on using resampling weights may be found in Præstgaard and Wellner (1993), Barbe and Bertail (1995), Chatterjee and Bose (2005) and multiple references therein, and in the related chapter of Shao and Tu (1995). The latter classic text, along with the other classic (Efron & Tibshirani, 1993), may be consulted for details about resampling in general.
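These three schemes differ only in how the raw weights are drawn. The following sketch (scheme labels and interface ours) returns the draws together with the corresponding $(\mu, \gamma)$; for the moon-bootstrap, $W_1 \sim \mathrm{Binomial}(m, 1/n)$ marginally, giving $\mu = m/n$ and $\gamma = \{(n-1)/m\}^{1/2}$.

```python
def draw_weights(scheme, n, n_resamples, rng):
    """Draw an (n_resamples, n) array of resampling weights, returned with
    (mu, gamma) such that E W_1 = mu and V W_1 = gamma^2 mu^2."""
    if scheme == "paired":      # Efron's paired bootstrap: n balls, n boxes
        W = rng.multinomial(n, np.full(n, 1.0 / n), size=n_resamples)
        mu, gamma = 1.0, np.sqrt(1.0 - 1.0 / n)
    elif scheme == "moon":      # m-out-of-n bootstrap with m = floor(n^0.8)
        m = int(n ** 0.8)
        W = rng.multinomial(m, np.full(n, 1.0 / n), size=n_resamples)
        mu, gamma = m / n, np.sqrt((n - 1.0) / m)
    elif scheme == "bayes":     # Bayesian bootstrap: i.i.d. Exponential(1)
        W = rng.exponential(1.0, size=(n_resamples, n))
        mu, gamma = 1.0, 1.0
    else:
        raise ValueError(scheme)
    return W.astype(float), mu, gamma
```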

3. Asymptotic consistency of proposed estimators

We assume that the cross moments of the centred and scaled resampling weights, of the form $E_r\prod_{j=1}^J (W^*_{i_j})^{k_j}$ for small values of $J$ and $\sum_{j=1}^J k_j$, do not grow very fast with $n$. Precise rates at which these may grow are available in Chatterjee (1998), Chatterjee and Bose (2000), Chatterjee and Bose (2005) and in other places, and it can be shown easily that such conditions hold for the standard choices of resampling weights. Naturally, when the resampling weights are independent, the above cross moments are bounded and often zero, and such requirements are easily satisfied. We omit detailed discussion of these conditions. We assume that $\gamma$ is bounded, which is also satisfied by several resampling schemes, but not by others like the moon-bootstrap. Our theoretical results remain valid under a controlled rate of growth of $\gamma$ with $n$ that covers the moon-bootstrap, and our numeric studies validate this later in this paper. However, we discuss only the case of bounded $\gamma$ to simplify some of the proofs. We now present our main results below.

Theorem 3.1

Under the conditions stated above, the resampling-based Prasad–Rao estimator is asymptotically consistent, that is, $\hat{\psi}_{nr} \to \psi$ in probability as $n \to \infty$.

The second resampling-based estimator, $\tilde{\psi}_{nr}$, constructed by considering only those resamples where $|R_{nr}|^2$ exceeds a threshold, is consistent as well, apart from being always positive. We present this result in the following theorem:

Theorem 3.2

Under the conditions stated above, $\tilde{\psi}_{nr} \to \psi$ in probability as $n \to \infty$.

Proof of Theorem 3.1

We omit several algebraic details below, especially those relating to proving that asymptotically negligible terms are indeed negligible. Thus, the proof below is a sketch of the main arguments. Below, we use $R_{nk}$ for $k = 1, 2, 3$ as generic notation for negligible remainder terms. Also, we use $c$ and $K$, with or without subscripts, as generic notation for constants.

Define $\mathcal{D} = (X^TX)^{-1/2}X^TW^*X(X^TX)^{-1/2}$. Then
\begin{align*}
X^TWX &= \mu\bigl(X^TX + \gamma X^TW^*X\bigr) \\
&= \mu(X^TX)^{1/2}\bigl(I_p + \gamma(X^TX)^{-1/2}X^TW^*X(X^TX)^{-1/2}\bigr)(X^TX)^{1/2} \\
&= \mu(X^TX)^{1/2}(I_p + \gamma\mathcal{D})(X^TX)^{1/2}.
\end{align*}
Thus $(X^TWX)^{-1} = \mu^{-1}(X^TX)^{-1/2}(I_p + \gamma\mathcal{D})^{-1}(X^TX)^{-1/2}$. Now notice that $(I_p + \gamma\mathcal{D})^{-1} = I_p - \gamma\mathcal{D} + \gamma^2\mathcal{D}(I_p + \gamma\mathcal{D})^{-1}\mathcal{D}$. Thus
\begin{align*}
(X^TWX)^{-1} &= \mu^{-1}(X^TX)^{-1/2}(I_p + \gamma\mathcal{D})^{-1}(X^TX)^{-1/2} \\
&= \mu^{-1}(X^TX)^{-1/2}\bigl\{I_p - \gamma\mathcal{D} + \gamma^2\mathcal{D}(I_p + \gamma\mathcal{D})^{-1}\mathcal{D}\bigr\}(X^TX)^{-1/2} \\
&= \mu^{-1}\bigl[(X^TX)^{-1} - \gamma(X^TX)^{-1}X^TW^*X(X^TX)^{-1} \\
&\qquad + \gamma^2(X^TX)^{-1}(X^TW^*X)(X^TX)^{-1/2}(I_p + \gamma\mathcal{D})^{-1}(X^TX)^{-1/2}(X^TW^*X)(X^TX)^{-1}\bigr].
\end{align*}
Thus we have
\begin{align*}
X(X^TWX)^{-1}X^T &= \mu^{-1}\bigl[P_x - \gamma P_xW^*P_x + \gamma^2P_xW^*X(X^TX)^{-1/2}(I_p + \gamma\mathcal{D})^{-1}(X^TX)^{-1/2}X^TW^*P_x\bigr] \\
&= \mu^{-1}\bigl[P_x - \gamma P_xW^*P_x + \gamma^2R_{n1}\bigr].
\end{align*}

Here, we have $R_{n1} = P_xW^*X(X^TX)^{-1/2}(I_p + \gamma\mathcal{D})^{-1}(X^TX)^{-1/2}X^TW^*P_x$. Note that for any non-zero $c \in \mathbb{R}^n$,
\[ \frac{E_r\,c^TR_{n1}c}{c^Tc} \le K\,\frac{c^TP_xD_xP_xc}{c^Tc} = O(n^{-2}). \]
Thus all eigenvalues of the expectation of $R_{n1}$ are $O(n^{-2})$.

This leads us to
\begin{align*}
I_n - X(X^TWX)^{-1}X^TW &= I_n - \mu^{-1}\bigl[P_x - \gamma P_xW^*P_x + \gamma^2R_{n1}\bigr]\bigl\{\mu(I_n + \gamma W^*)\bigr\} \\
&= I_n - \bigl[P_x - \gamma P_xW^*P_x + \gamma^2R_{n1}\bigr]\bigl\{I_n + \gamma W^*\bigr\} \\
&= I_n - P_x + \gamma P_xW^*P_x - \gamma^2R_{n1} - \gamma P_xW^* + \gamma^2P_xW^*P_xW^* - \gamma^3R_{n1}W^* \\
&= (I_n - P_x) - \gamma P_xW^*(I_n - P_x) - \gamma^2\bigl(R_{n1} + \gamma R_{n1}W^* - P_xW^*P_xW^*\bigr) \\
&= (I_n - \gamma P_xW^*)(I_n - P_x) - \gamma^2R_{n2},
\end{align*}
where $R_{n2} = R_{n1} + \gamma R_{n1}W^* - P_xW^*P_xW^*$. We can show that for any non-zero $c \in \mathbb{R}^n$ with $|c| = 1$, we have $E\,c^TR_{n2}^TR_{n2}c = O(n^{-4})$. This is algebraic, and we omit the details.

So that eventually we get $R_{nr} = (I_n - \gamma P_xW^*)(I_n - P_x)(U + E) - \gamma^2R_{n2}(U + E)$.

Hence we can write
\begin{align*}
|R_{nr}|^2 &= \mathrm{tr}\bigl\{(I_n - \gamma P_xW^*)^T(I_n - \gamma P_xW^*)(I_n - P_x)(U + E)(U + E)^T(I_n - P_x)\bigr\} \\
&\qquad - 2\gamma^2(U + E)^TR_{n2}^T(I_n - \gamma P_xW^*)(I_n - P_x)(U + E) + \gamma^4(U + E)^TR_{n2}^TR_{n2}(U + E) \\
&= \mathrm{tr}\bigl\{(I_n - \gamma P_xW^*)^T(I_n - \gamma P_xW^*)(I_n - P_x)(U + E)(U + E)^T(I_n - P_x)\bigr\} + \gamma^2R_{n3} \\
&= \mathrm{tr}\bigl[\bigl\{I_n - \gamma P_xW^* - \gamma W^*P_x + \gamma^2W^*P_xW^*\bigr\}\bigl\{(I_n - P_x)(U + E)(U + E)^T(I_n - P_x)\bigr\}\bigr] + \gamma^2R_{n3}.
\end{align*}
The exact expression for $R_{n3}$ is available from the above expansion; we do not re-write it to avoid repetition.

The resampling expectation of the above is
\begin{align*}
E_r|R_{nr}|^2 &= \mathrm{tr}\,E_r\bigl[\bigl\{I_n - \gamma P_xW^* - \gamma W^*P_x + \gamma^2W^*P_xW^*\bigr\}\bigl\{(I_n - P_x)(U + E)(U + E)^T(I_n - P_x)\bigr\}\bigr] + \gamma^2E_rR_{n3} \\
&= \mathrm{tr}\bigl[\bigl\{I_n + \gamma^2D_x\bigr\}\bigl\{(I_n - P_x)(U + E)(U + E)^T(I_n - P_x)\bigr\}\bigr] + \gamma^2E_rR_{n3} \\
&= \mathrm{tr}\bigl[\bigl\{I_n + \gamma^2D_x\bigr\}(I_n - P_x)\bigl\{D + \psi I_n + (UU^T - \psi I_n) + (EE^T - D) + UE^T + EU^T\bigr\}(I_n - P_x)\bigr] + \gamma^2E_rR_{n3}.
\end{align*}

Also notice that
\[ \mathrm{tr}\,(I_n + \gamma^2D_x)(I_n - P_x) = \mathrm{tr}\,I_n - \mathrm{tr}\,P_x + \gamma^2\,\mathrm{tr}\,D_x - \gamma^2\sum_{i=1}^nP_{x,ii}^2 = O(n), \]
with the first term contributing $O(n)$, the second and third terms contributing $O(p) = O(1)$, and the last term contributing $O(n^{-1}p^2) = O(n^{-1})$.

We can show, using the computations on $R_{n1}$ and $R_{n2}$ partially presented above, that $E(E_rR_{n3})^2 = O(n)$. This computation is algebraic; we present a few instances of the main arguments used in this step.

Consider the term $T_{n1} = \mathrm{tr}\,\{I_n + \gamma^2D_x\}(I_n - P_x)UE^T(I_n - P_x)$. We have
\begin{align*}
T_{n1} &= E^T(I_n - P_x)(I_n + \gamma^2D_x)(I_n - P_x)U \\
&= \sum_{a,b,c=1}^n(I_n - P_x)_{ab}(I_n + \gamma^2D_x)_{bb}(I_n - P_x)_{bc}E_aU_c \\
&= \sum_{a,i,c=1}^n(I_n - P_x)_{ai}(1 + \gamma^2P_{x,ii})(I_n - P_x)_{ic}E_aU_c.
\end{align*}
Thus we have
\begin{align*}
ET_{n1}^2 &= E\Bigl[\sum_{i=1}^n(1 + \gamma^2P_{x,ii})\sum_{a=1}^n(I_n - P_x)_{ia}E_a\sum_{c=1}^n(I_n - P_x)_{ic}U_c\Bigr]^2 \\
&= E\sum_{i,j=1}^n(1 + \gamma^2P_{x,ii})(1 + \gamma^2P_{x,jj})\sum_{a_1,a_2=1}^n(I_n - P_x)_{ia_1}(I_n - P_x)_{ja_2}E_{a_1}E_{a_2} \\
&\qquad\times\sum_{c_1,c_2=1}^n(I_n - P_x)_{ic_1}(I_n - P_x)_{jc_2}U_{c_1}U_{c_2} \\
&= \psi\sum_{i,j=1}^n(1 + \gamma^2P_{x,ii})(1 + \gamma^2P_{x,jj})\sum_{a=1}^n(I_n - P_x)_{ia}(I_n - P_x)_{ja}D_a\sum_{c=1}^n(I_n - P_x)_{ic}(I_n - P_x)_{jc}.
\end{align*}

As an example, consider the second component in the last term above. First, let us deal with the case $i = j$. Then this term becomes
\[ \sum_{c=1}^n(I_n - P_x)_{ic}^2 = (1 - P_{x,ii})^2 + \sum_{c \ne i}P_{x,ic}^2 = (1 - P_{x,ii})^2 - P_{x,ii}^2 + \sum_{c=1}^nP_{x,ic}^2 = 1 - P_{x,ii} = O(1). \]

For the $i \ne j$ case, we have
\begin{align*}
\sum_{c=1}^n(I_n - P_x)_{ic}(I_n - P_x)_{jc} &= -(1 - P_{x,ii})P_{x,ij} - (1 - P_{x,jj})P_{x,ij} + \sum_{c \ne i,j}P_{x,ic}P_{x,jc} \\
&= -P_{x,ij} - P_{x,ij} + \sum_{c=1}^nP_{x,ic}P_{x,jc} = -P_{x,ij} - P_{x,ij} + P_{x,ij} = -P_{x,ij}.
\end{align*}
Thus, we essentially leverage the properties of a projection matrix repeatedly for the algebraic computations. We omit further algebraic details; they are typically routine.

If we now write
\[ R_{n4} = \frac{E_r|R_{nr}|^2 - \mathrm{tr}\,(I_n + \gamma^2D_x)(I_n - P_x)D(I_n - P_x)}{\mathrm{tr}\,(I_n + \gamma^2D_x)(I_n - P_x)} - \psi, \]
the above computations establish that $ER_{n4}^2 \to 0$ as $n \to \infty$, thus establishing the result.

Proof of Theorem 3.2

This proof contains several calculations in common with the proof of Theorem 3.1, consequently we only discuss the main ideas, and omit most algebraic details.

Recall from the proof of Theorem 3.1 that
\begin{align*}
n^{-1}|R_{nr}|^2 &= n^{-1}\,\mathrm{tr}\bigl[\bigl\{I_n + \gamma^2D_x\bigr\}(I_n - P_x)\bigl\{D + \psi I_n + (UU^T - \psi I_n) + (EE^T - D) + UE^T + EU^T\bigr\}(I_n - P_x)\bigr] \\
&\qquad + n^{-1}\gamma^2E_rR_{n3} = X_1 + Y_1, \quad \text{say.}
\end{align*}

For convenience, let us use the notation $\nu = n^{-1}\,\mathrm{tr}\,(I_n + \gamma^2D_x)(I_n - P_x)D(I_n - P_x)$. We thus have
\begin{align*}
P[\mathcal{A}_{nr}^C] &= P\bigl[n^{-1}|R_{nr}|^2 \le \nu + \epsilon_0/2\bigr] \\
&\le P\bigl[n^{-1}|R_{nr}|^2 \le \nu + \epsilon_0/2,\ |Y_1| > \epsilon_0/2\bigr] + P\bigl[n^{-1}|R_{nr}|^2 \le \nu + \epsilon_0/2,\ |Y_1| \le \epsilon_0/2\bigr] \\
&\le P\bigl[|Y_1| > \epsilon_0/2\bigr] + P\bigl[X_1 \le \nu + \epsilon_0\bigr].
\end{align*}

Our previous computations from the proof of Theorem 3.1 now ensure that $E(Y_1^2) = O(n^{-1})$ as well as $V(X_1) = O(n^{-1})$, and hence both of the above probabilities tend to zero; we omit some of the algebraic details here. Some additional algebraic details are then required to establish that the expectation of the squared difference between $\hat{\psi}_{nr}$ and $\tilde{\psi}_{nr}$ is also $O(n^{-1})$, thus concluding the proof.

4. Some simulation studies

To understand the finite sample performance of the proposed estimators of $\psi$, we conducted a simulation study. In this study, we consider a framework based closely on the studies of Datta et al. (2005). We use $n = 15$ observations and $p = 3$ covariates, one of which is the intercept term; the other two covariates are randomly generated and then held fixed in each replication of the experiment. We fix $\psi = 1$ and $\beta = (2, 3, 4)^T$. We conduct the experiments with two different choices of the sampling variances. In Experiment 1, we fix the $D_i$'s to be $2.0, 0.6, 0.5, 0.4, 0.2$, each repeated 3 times. In Experiment 2, we fix the $D_i$'s to be $20.0, 0.6, 0.5, 0.4, 0.2$, each repeated 3 times.

We consider three different resampling schemes in this study. First, we consider $W$ to be the multinomial random vector corresponding to randomly assigning $n$ balls to $n$ boxes, each with equal probability; this corresponds to the paired bootstrap of Efron. Next, we consider $W$ to be the multinomial random vector corresponding to randomly assigning $m = \lfloor n^{0.8} \rfloor$ balls to $n$ boxes, each with equal probability, which corresponds to the moon-bootstrap. Third, we consider each element of $W$ to have an exponential distribution with mean 1, thus realising the Bayesian bootstrap. The resampling Monte Carlo size is fixed at 1000 in all cases. The simulation experiment was replicated 200 times, and the results presented below are based on these 200 replications.
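Putting the earlier sketches together, one replication of Experiment 1 might look as follows; this is again illustrative code of ours, not the code used for the study.

```python
n, p = 15, 3
beta = np.array([2.0, 3.0, 4.0])
psi = 1.0
D = np.repeat([2.0, 0.6, 0.5, 0.4, 0.2], 3)           # Experiment 1 variances

X = np.column_stack([np.ones(n), rng.normal(size=(n, p - 1))])
Y, theta = simulate_fay_herriot(X, beta, D, psi, rng)

W, mu, gamma = draw_weights("paired", n, 1000, rng)   # resampling MC size 1000
print("PR estimate (3):          ", psi_hat_pr(Y, X, D))
print("resampled PR estimate (2):", psi_hat_nr(Y, X, D, W, gamma))
print("truncated variant:        ", psi_tilde_nr(Y, X, D, W, gamma))
```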

First, we enumerate the percentage of times the estimators of $\psi$ took values less than $\epsilon = 10^{-4}$. This number varies with the various parameters used in the simulation experiment, and can reach 80–90% when we change only a single $D_i$ value or a parameter value within a reasonable range.

In Experiment 1, it can be seen from Table 1 that in approximately 5% of the simulated datasets, the Prasad–Rao estimator based on the original data took a value less than $\epsilon$. In these simulated datasets where the Prasad–Rao estimator fell below $\epsilon$, the different resampling schemes yielded between 19% and 25% of cases where the Prasad–Rao estimator based on the resampled data was above $\epsilon$. In the other approximately 95% of cases where the Prasad–Rao estimator from the original data was greater than $\epsilon$, its resampling-based versions were greater than $\epsilon$ as well, typically in 93–99% of the resamples. This suggests that using the Prasad–Rao estimator based on the resampled data may significantly improve the numeric performance of estimates of the conditional distribution of $\theta$ given $Y_n$, or related prediction intervals, conditional expectations and so on. In Experiment 2, it can be seen from Table 2 that in approximately 52% of the simulated datasets, the Prasad–Rao estimator based on the original data took a value less than $\epsilon$. Other details are consistent with the findings from Experiment 1.

Table 1. Percentages of cases where different estimators of $\psi$ were below or above $\epsilon = 10^{-4}$ in Experiment 1.

Table 2. Percentages of cases where different estimators of $\psi$ were below or above $\epsilon = 10^{-4}$ in Experiment 2.

We obtain several plots from Experiment 2. In the top panel of Figure 1, we present in plot (1) the boxplot of $\hat{\psi}_n$ from the 200 replications of the simulation experiment. We also select two such replications at random, one corresponding to a dataset where $\hat{\psi}_n \le \epsilon$ and the other where $\hat{\psi}_n > \epsilon$. In the top panel of Figure 1, plots (2) and (3) show the resampling estimator $\hat{\psi}_{nr}$ obtained using multinomial weights corresponding to the paired bootstrap in the randomly selected dataset for which $\hat{\psi}_n \le \epsilon$. Plot (2) is from all resamples, and plot (3) is based only on the resamples where $\hat{\psi}_{nr} > \epsilon$. Plots (4) and (5) show $\hat{\psi}_{nr}$ obtained using multinomial weights corresponding to the paired bootstrap in the randomly selected dataset for which $\hat{\psi}_n > \epsilon$, respectively based on all resamples and on only the resamples where $\hat{\psi}_{nr} > \epsilon$. Plot (6) is from the 'Mix' estimator proposed by Rubin-Bleuer and You (2016), which is guaranteed to be positive. While the Mix estimator is always positive, notice that it also seems quite biased.

Figure 1. Boxplots of the Prasad–Rao (PR) estimators from Experiment 2. In the top panel, plot (1) is the boxplot of $\hat{\psi}_n$ from the 200 replications of the simulation experiment. The rest of the boxplots are from two randomly selected instances of these replications. Plots (2) and (3) are $\hat{\psi}_{nr}$ obtained using multinomial weights corresponding to the paired bootstrap in a case where $\hat{\psi}_n \le \epsilon$. Plots (4) and (5) are $\hat{\psi}_{nr}$ obtained using multinomial weights corresponding to the paired bootstrap in a case where $\hat{\psi}_n > \epsilon$. Plot (6) is the Mix estimator. In the bottom panel, we consider a randomly selected case where $\hat{\psi}_n > \epsilon$. The first two boxplots are from the multinomial bootstrap, corresponding respectively to cases where we use all the resamples and only the ones where $\hat{\psi}_{nr} > \epsilon$. The third and fourth are from the moon bootstrap, and the fifth and sixth are from the Bayesian bootstrap, for all resamples and for resamples with $\hat{\psi}_{nr} > \epsilon$, respectively. Plot (7) is the Mix estimator.

In the bottom panel of Figure 1, we consider a randomly selected case where $\hat{\psi}_n > \epsilon$. The first two boxplots are from the multinomial bootstrap, corresponding respectively to cases where we use all the resamples and only the ones where $\hat{\psi}_{nr} > \epsilon$. The third and fourth are from the moon bootstrap, and the fifth and sixth are from the Bayesian bootstrap, for all resamples and for resamples with $\hat{\psi}_{nr} > \epsilon$, respectively. The last plot is from the Mix estimator, which is again quite biased. The red horizontal line is at the value of $\hat{\psi}_n$ for that dataset. It can be seen that all the resampling estimators perform well, especially in cases where $\hat{\psi}_n > \epsilon$. For the cases where $\hat{\psi}_n \le \epsilon$, the resampling-based estimators offer the alternative of obtaining a higher value of the estimator, while retaining their theoretical consistency properties. Plots from Experiment 1, which we do not present for brevity, are consistent with the above findings.

5. Conclusions

In this paper, we have only studied the resampling-based estimator for the Prasad–Rao formulation; further studies are needed for other choices of estimators, as are considerable additional numeric studies. Studies are also needed, from both theoretical and computational viewpoints, on the effect of the choice of a resampling-based $\psi$ on the prediction of $\theta$. One interesting special case of using resampling weights that needs additional exploration is when $\gamma$ tends to zero with $n$, as in the case discussed in Chatterjee and Bose (2002).

Acknowledgments

The author thanks the reviewers and editors for their comments, which helped improve the paper.

Disclosure statement

No potential conflict of interest was reported by the author.

ORCID

Snigdhansu Chatterjee http://orcid.org/0000-0002-7986-0470

Additional information

Funding

This research is partially supported by the National Science Foundation (NSF) [grant numbers DMS-1622483 and DMS-1737918].

Notes on contributors

Snigdhansu Chatterjee

Dr. Snigdhansu Chatterjee is Professor in the School of Statistics, and the Director of the Institute for Research in Statistics and its Applications (IRSA, http://irsa.stat.umn.edu/) at the University of Minnesota.

References

  • Barbe, P., & Bertail, P. (1995). The weighted bootstrap. Lecture Notes in Statistics, Vol. 98. New York, NY: Springer-Verlag.
  • Chatterjee, S. (1998). Another look at the jackknife: Further examples of generalized bootstrap. Statistics and Probability Letters, 40(4), 307–319.
  • Chatterjee, S. (2018). On modifications to linking variance estimators in the Fay–Herriot model that induce robustness. Statistics and Applications, 16(1), 289–303.
  • Chatterjee, S., & Bose, A. (2000). Variance estimation in high dimensional regression models. Statistica Sinica, 10(2), 497–516.
  • Chatterjee, S., & Bose, A. (2002). Dimension asymptotics for generalised bootstrap in linear regression. Annals of the Institute of Statistical Mathematics, 54(2), 367–381.
  • Chatterjee, S., & Bose, A. (2005). Generalized bootstrap for estimating equations. The Annals of Statistics, 33(1), 414–436.
  • Chatterjee, S., Lahiri, P., & Li, H. (2008). Parametric bootstrap approximation to the distribution of EBLUP and related prediction intervals in linear mixed models. The Annals of Statistics, 36(3), 1221–1245.
  • Das, K., Jiang, J., & Rao, J. N. K. (2004). Mean squared error of empirical predictor. The Annals of Statistics, 32(2), 818–840.
  • Datta, G. S., Hall, P., & Mandal, A. (2011). Model selection by testing for the presence of small-area effects, and application to area-level data. Journal of the American Statistical Association, 106(493), 362–374.
  • Datta, G. S., Rao, J. N. K., & Smith, D. D. (2005). On measuring the variability of small area estimators under a basic area level model. Biometrika, 92(1), 183–196.
  • Efron, B. (1979). Bootstrap methods: Another look at the jackknife. The Annals of Statistics, 7(1), 1–26.
  • Efron, B., & Tibshirani, R. J. (1993). An introduction to the bootstrap. Boca Raton, FL: Chapman & Hall/CRC press.
  • Fay III, R. E., & Herriot, R. A. (1979). Estimates of income for small places: An application of James–Stein procedures to census data. Journal of the American Statistical Association, 74(366), 269–277.
  • Hirose, M. Y., & Lahiri, P. (2017). A new model variance estimator for an area level small area model to solve multiple problems simultaneously. arXiv preprint arXiv:1701.04176.
  • Jiang, J., & Lahiri, P. (2006). Mixed model prediction and small area estimation (with discussion). Test, 15(1), 1–96.
  • Jiang, J., Lahiri, P., & Wan, S.-M. (2002). A unified jackknife theory for empirical best prediction with M-estimation. The Annals of Statistics, 30(6), 1782–1810.
  • Kubokawa, T., & Nagashima, B. (2012). Parametric bootstrap methods for bias correction in linear mixed models. Journal of Multivariate Analysis, 106, 1–16.
  • Li, H., & Lahiri, P. (2010). An adjusted maximum likelihood method for solving small area estimation problems. Journal of Multivariate Analysis, 101(4), 882–892.
  • Molina, I., Rao, J. N. K., & Datta, G. S. (2015). Small area estimation under a Fay–Herriot model with preliminary testing for the presence of random area effects. Survey Methodology, 41(1), 1–19.
  • Pfeffermann, D. (2013). New important developments in small area estimation. Statistical Science, 28(1), 40–68.
  • Pfeffermann, D., & Glickman, H. (2004). Mean square error approximation in small area estimation by use of parametric and nonparametric bootstrap. ASA Section on Survey Research Methods Proceedings, Alexandria, VA, pp. 4167–4178.
  • Prasad, N. G. N., & Rao, J. N. K. (1990). The estimation of the mean squared error of small-area estimators. Journal of the American Statistical Association, 85(409), 163–171.
  • Præstgaard, J., & Wellner, J. A. (1993). Exchangeably weighted bootstraps of the general empirical process. The Annals of Probability, 21(4), 2053–2086.
  • Rao, J. N. K. (2015). Inferential issues in model-based small area estimation: Some new developments. Statistics in Transition, 16(4), 491–510.
  • Rao, J. N. K., & Molina, I. (2015). Small area estimation. New York, NY: Wiley.
  • Rubin-Bleuer, S., & You, Y. (2016). Comparison of some positive variance estimators for the Fay–Herriot small area model. Survey Methodology, 42(1), 63–85.
  • Shao, J., & Tu, D. (1995). The jackknife and bootstrap. New York, NY: Springer-Verlag.
  • Yoshimori, M., & Lahiri, P. (2014a). A new adjusted maximum likelihood method for the Fay–Herriot small area model. Journal of Multivariate Analysis, 124, 281–294.
  • Yoshimori, M., & Lahiri, P. (2014b). A second-order efficient empirical Bayes confidence interval. The Annals of Statistics, 42(4), 1233–1261.
