
Meta-analysis of independent datasets using constrained generalised method of moments

Pages 109-116 | Received 03 Feb 2019, Accepted 07 Jun 2019, Published online: 18 Jun 2019

Abstract

We propose a constrained generalised method of moments (CGMM) for enhancing the efficiency of estimators in meta-analysis in which some studies do not measure all covariates associated with the response or outcome. Under some assumptions, we show that the proposed CGMM estimators have good asymptotic properties. We also demonstrate the effectiveness of the proposed method through simulation studies with fixed sample sizes.

1. Introduction

Because of the availability of multiple datasets, not just summary statistics, from different studies in modern applications, meta-analysis has become an important tool for gaining efficiency in estimating a common structural parameter vector of interest by appropriately combining multiple datasets (Hartung, Knapp, & Sinha, 2008; Higgins & Thompson, 2002; Higgins, Thompson, Deeks, & Altman, 2003; Schmidt & Hunter, 2014; DerSimonian & Laird, 1986). There exists a rich literature on how to form optimal calibration equations for improving the efficiency of parameter estimates within various classes of unbiased estimators (Chen & Chen, 2000; Deville & Sarndal, 1992; Lumley, Shaw, & Dai, 2011; Robins, Rotnitzky, & Zhao, 1994; Slud & DeMissie, 2011; Wu, 2003; Wu & Sitter, 2001). The methodology for 'model-based' maximum likelihood estimation has also been studied in some special cases of this problem (Chatterjee, Chen, Maas, & Carroll, 2016). A number of researchers have proposed semiparametric maximum likelihood methods for various types of regression models while accounting for complex sampling designs (Breslow & Holubkov, 1997; Lawless, Wild, & Kalbfleisch, 1999; Qin, Zhang, Li, Albanes, & Yu, 2015; Rao & Molina, 2015; Scott & Wild, 1997).

One issue that has to be addressed with multiple studies is that some studies may not measure all covariates, although all studies have the same response (Chatterjee et al., 2016). For example, a past study may have measured only q of the p+q covariates measured in the current study. Although the unobserved covariate values in the past study can be treated as missing covariate values, better statistical procedures may be derived because, within each study, a covariate is either observed entirely or missing entirely; this situation is referred to as systematically missing covariates.

To illustrate the idea, let us consider the special case of two studies. Let Y be a response or outcome of interest, let U and X be p- and q-dimensional vectors of associated covariates measured in study 1, and let X be the only covariate vector measured in study 2. We focus on the situation where whether U is observed does not affect the conditional means, i.e.,

(1) E(Y|U,X,δ) = E(Y|U,X) and E(U|X,δ) = E(U|X),

where δ=k for study k=1,2. In the missing data literature, the 'missingness' of U with property (1) is referred to as missing at random, but not missing completely at random.

Suppose that we are interested in the parameters in the conditional mean E(Y|U,X), which can be called structural parameters. From the first equation in (1), estimation can be done using data from study 1 alone. However, we want to make use of data from study 2, which is the purpose of meta-analysis. The second equation, which will be referred to as the bridge equation, may enable us to obtain estimators based on data from both studies that are more efficient than those using data from study 1 only.

In this article, we assume that the conditional means in (1) follow linear models for both the observation and bridge equations. Although more complicated models may be encountered in applications, the discussion with linear models is a good start on this problem. In Section 2, we propose a constrained generalised method of moments for estimation in the case of two studies. Asymptotic distributions of the proposed estimators are established, with which we illustrate when asymptotically more efficient estimators can be obtained. Simulation studies support our asymptotic results and illustrate the magnitude of the efficiency gain. Our method can be extended to the case of more than two studies. As an example of such an extension, we consider the situation of three studies in Section 3, supplemented with simulation results. The last section contains some technical details.

2. Results for two independent studies

In this section, we consider two studies, indicated by δ∈{1,2}, with independent datasets. Following Section 1, we use Y, X, and U to denote the response of interest, the covariate vector measured in both studies, and the covariate vector measured only in study 1, respectively.

2.1. Constrained generalised method of moments

For illustration, we first consider a univariate U (p=1). Assume (1) and linear models for the two independent studies as follows:

(2) study 1 (δ=1): Y = βu U + βx^T X + ε1,
(3) study 2 (δ=2): Y = ηx^T X + ε2,
(4) bridge: U = γx^T X + εb, with ηx = βu γx + βx,

where ε1, ε2, and εb are independent with mean 0 and variances σ1², σ2², and σb², respectively, βu, βx, ηx, and γx are parameter vectors of appropriate dimensions, and the superscript T denotes vector transpose. Models (2)–(4) assume that the structural parameters βu, βx, and γx are the same for all studies, while the distributions of the ε's can vary with studies, reflecting heteroscedasticity of the data among different studies.
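As a concrete illustration of this setup, the following sketch generates two independent datasets from the linear models above with q = 1; the parameter values and the standard normal errors are our illustrative assumptions (the paper's true values appear only in its tables).

```python
import numpy as np

def generate_two_studies(n1, n2, beta_u=1.0, beta_x=1.0, gamma_x=1.0, seed=None):
    """Generate data from models (2)-(4) with q = 1 and N(0, 1) errors.

    Study 1 (delta = 1) records (Y, U, X); study 2 (delta = 2) records only
    (Y, X), although U still drives Y there, so that the study-2 slope is
    eta_x = beta_u * gamma_x + beta_x.
    """
    rng = np.random.default_rng(seed)
    # Study 1: bridge U = gamma_x*X + eps_b, outcome Y = beta_u*U + beta_x*X + eps_1.
    x1 = rng.standard_normal(n1)
    u1 = gamma_x * x1 + rng.standard_normal(n1)
    y1 = beta_u * u1 + beta_x * x1 + rng.standard_normal(n1)
    # Study 2: same mechanism, but U is discarded (systematically missing).
    x2 = rng.standard_normal(n2)
    u2 = gamma_x * x2 + rng.standard_normal(n2)
    y2 = beta_u * u2 + beta_x * x2 + rng.standard_normal(n2)
    return (y1, u1, x1), (y2, x2)
```

Regressing y2 on x2 alone targets ηx = βuγx + βx, which is exactly the relation that ties the two studies together.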

We are mainly interested in estimating βu and βx in (2). Instead of using data from study 1 only, we try to make use of data from study 2 to gain estimation efficiency. Condition (4) bridges the data in the two studies so that the additional data from study 2 can be used to gain efficiency. It is not strictly necessary; see the discussion in Section 4.

Assume that we have two independent random samples with sizes n1 and n2 from studies 1 and 2, respectively, and denote the total sample size by n=n1+n2. From (2)–(4), we construct estimating equations E g(Z,θ)=0 and a constraint c(θ)=0, where 0 is the vector of zeros, θ = (βu, βx^T, γx^T, ηx^T)^T, c(θ) = ηx − βu γx − βx, Z = (Y, U, X^T, δ)^T, and g(Z,θ) is a column vector with elements

I(δ=1)(n/n1)(βu U + βx^T X − Y)U,
I(δ=1)(n/n1)(βu U + βx^T X − Y)X,
I(δ=1)(n/n1)(γx^T X − U)X,
I(δ=2)(n/n2)(ηx^T X − Y)X,

where I(A) is the indicator function of the event A.
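In code, the sample average of the moment vector can be sketched as follows for q = 1; the stacking order and the n/nk weighting follow the display above, while the variable names are ours:

```python
import numpy as np

def g_bar(theta, y, u, x, delta):
    """Sample average of g(Z, theta) for q = 1, theta = (beta_u, beta_x, gamma_x, eta_x).

    u may be arbitrary (e.g. NaN) on study-2 rows: those entries are never used.
    """
    beta_u, beta_x, gamma_x, eta_x = theta
    n = len(y)
    s1, s2 = delta == 1, delta == 2
    w1, w2 = n / s1.sum(), n / s2.sum()
    g = np.zeros((n, 4))
    g[s1, 0] = w1 * ((beta_u * u + beta_x * x - y) * u)[s1]  # study-1 score for beta_u
    g[s1, 1] = w1 * ((beta_u * u + beta_x * x - y) * x)[s1]  # study-1 score for beta_x
    g[s1, 2] = w1 * ((gamma_x * x - u) * x)[s1]              # bridge score for gamma_x
    g[s2, 3] = w2 * ((eta_x * x - y) * x)[s2]                # study-2 score for eta_x
    return g.mean(axis=0)
```

At the true parameter value, each component of g_bar is a mean-zero average, so it should be close to zero in large samples.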

Let zi = (yi, ui, xi^T, δi)^T, i=1,…,n, be the observed data, where (yi, ui, xi^T, δi)^T with δi=k is distributed as (Y, U, X^T, δ)^T with δ=k, k=1,2, and let g¯(θ) = n^{-1} ∑_{i=1}^n g(zi,θ). The two-step constrained generalised method of moments (CGMM) is applied as follows.

  1. Compute θ~c = argmin g¯(θ)^T g¯(θ) over θ subject to the constraint c(θ)=0.

  2. Compute the weight matrix Wˆ = n[∑_{i=1}^n g(zi,θ~c) g(zi,θ~c)^T]^{-1}.

  3. Compute the two-step CGMM estimator θˆc = argmin g¯(θ)^T Wˆ g¯(θ) over θ subject to the constraint c(θ)=0.
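The three steps above can be sketched end to end with an off-the-shelf constrained optimiser. This is our illustration, not the authors' code: the use of SLSQP, the zero starting point, and the q = 1 setting are all assumptions.

```python
import numpy as np
from scipy.optimize import minimize

def cgmm_two_step(y, u, x, delta):
    """Two-step CGMM sketch for models (2)-(4) with q = 1.

    theta = (beta_u, beta_x, gamma_x, eta_x) and the constraint is
    c(theta) = eta_x - beta_u*gamma_x - beta_x = 0.
    """
    n = len(y)
    s1, s2 = delta == 1, delta == 2
    w1, w2 = n / s1.sum(), n / s2.sum()

    def moments(theta):                      # n x 4 matrix of g(z_i, theta)
        bu, bx, gx, ex = theta
        g = np.zeros((n, 4))
        g[s1, 0] = w1 * ((bu * u + bx * x - y) * u)[s1]
        g[s1, 1] = w1 * ((bu * u + bx * x - y) * x)[s1]
        g[s1, 2] = w1 * ((gx * x - u) * x)[s1]
        g[s2, 3] = w2 * ((ex * x - y) * x)[s2]
        return g

    def objective(theta, W):
        gbar = moments(theta).mean(axis=0)
        return gbar @ W @ gbar

    cons = [{"type": "eq", "fun": lambda t: t[3] - t[0] * t[2] - t[1]}]
    start = np.zeros(4)                      # feasible start: c(0) = 0
    # Step 1: constrained GMM with identity weights.
    step1 = minimize(objective, start, args=(np.eye(4),),
                     method="SLSQP", constraints=cons)
    # Step 2: estimated optimal weights W_hat = n [sum_i g_i g_i^T]^{-1}.
    G = moments(step1.x)
    W = n * np.linalg.inv(G.T @ G)
    # Step 3: constrained GMM with the estimated weights.
    return minimize(objective, step1.x, args=(W,),
                    method="SLSQP", constraints=cons).x
```

The constraint is enforced exactly at both minimisations, so the returned estimate satisfies c(θˆc)=0 up to the solver's tolerance.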

We now extend our idea to a multivariate U that is observed in study 1 but not in study 2. Let U be p-dimensional with jth component Uj. Then the previous procedure can still be applied with the univariate U, βu, βu U, γx, εb, the moment component (n/n1)I(δ=1)(γx^T X − U)X, and the constraint c(θ) = ηx − βu γx − βx replaced by the p-dimensional U, βu = (βu1,…,βup)^T, βu^T U, (γx1,…,γxp), εb = (εb1,…,εbp)^T, (n/n1)I(δ=1)((γx1^T X − U1)X^T, …, (γxp^T X − Up)X^T)^T, and c(θ) = ηx − (γx1,…,γxp)βu − βx, respectively.

2.2. Asymptotic properties

The general theory for the generalised method of moments (GMM) is given in Hansen (1982). The CGMM proposed in Section 2.1 adds a constraint to the GMM. Engle and McFadden (1994) considered the CGMM for the purpose of testing hypotheses. We now establish an asymptotic result in a similar manner. For simplicity, we consider only a univariate U.

Let θ0 denote the true value of the parameter vector θ. For the CGMM estimator θˆc defined in Section 2.1 with the constraint c(θ)=0, we have the following result.

Theorem 2.1

Assume that models (2)–(4) hold; θ0 is the unique root of E g(Z,θ)=0; both n1 and n2 diverge to ∞ with n1/n→h for some 0<h<1; and the matrices

Ω = lim_{n→∞} E[g(Z,θ0) g(Z,θ0)^T], G = lim_{n→∞} E[∂g(Z,θ)/∂θ^T]|θ=θ0, Σx = E(XX^T), and A = [∂c(θ)/∂θ^T]|θ=θ0 = (−γx, −Iq, −βu Iq, Iq)

are all of full rank, where Iq is the identity matrix of order q. Then

(5) n^{1/2}(θˆc − θ0) →d N(0, B − BA^T(ABA^T)^{-1}AB),

where B = (G^T Ω^{-1} G)^{-1} and →d denotes convergence in distribution as n→∞.

If we do not use the constraint c(θ)=0, then the unconstrained GMM (UGMM) estimator in our problem described in Section 2.1 is the vector of the least squares estimators of βu, βx, and γx based on the data from study 1 only and the least squares estimator of ηx based on the data from study 2 only. Let θˆ0 be the UGMM estimator. Then

(6) n^{1/2}(θˆ0 − θ0) →d N(0, B),

which can be derived in the same manner as (5) but without the constraint c(θ)=0.
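Concretely, the UGMM estimator can be computed by three separate least-squares fits; the sketch below (q = 1, our variable names) makes this explicit:

```python
import numpy as np

def ugmm(y, u, x, delta):
    """UGMM for q = 1: OLS of Y on (U, X) and of U on X within study 1,
    plus OLS of Y on X within study 2; no constraint links the pieces."""
    s1, s2 = delta == 1, delta == 2
    beta_u, beta_x = np.linalg.lstsq(
        np.column_stack([u[s1], x[s1]]), y[s1], rcond=None)[0]
    gamma_x = float(x[s1] @ u[s1] / (x[s1] @ x[s1]))   # slope of U on X
    eta_x = float(x[s2] @ y[s2] / (x[s2] @ x[s2]))     # slope of Y on X
    return np.array([beta_u, beta_x, gamma_x, eta_x])
```

The fitted components generally violate c(θ)=0 in finite samples; CGMM enforces the constraint exactly, which is where its efficiency gain originates.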

Is the CGMM estimator θˆc asymptotically more efficient than the UGMM estimator θˆ0 because it utilises both datasets? It follows from results (5) and (6) that a component of θˆc is asymptotically more efficient if and only if the corresponding diagonal element of the matrix BA^T(ABA^T)^{-1}AB is positive.

To find out the magnitude of the efficiency gains from using CGMM, we need to allow the limit h of the sample size ratio n1/n to differ from 1/2, and we need to derive more explicit forms of the asymptotic covariance matrices in (5) and (6).

Note that the first 1+2q components of θˆc, denoted by ζˆc, estimate ζ = (βu, βx^T, γx^T)^T based on the data from study 1 with size n1, whereas the last q components of θˆc, denoted by ηˆxc, estimate ηx based on the data from study 2 with size n2. From the technical details in Section 5, we obtain from (5) that

(7) (n1^{1/2}(ζˆc − ζ)^T, n2^{1/2}(ηˆxc − ηx)^T)^T →d N(0, HBH − HBA^T(ABA^T)^{-1}ABH),

where H is the diagonal matrix whose first 1+2q diagonal elements are h^{1/2} and whose last q diagonal elements are (1−h)^{1/2}. For the special case where δ and (U,X) in (1) are independent (so that the missingness of U is completely at random), it is further shown in Section 5 that

(8) HBH = [σ1²/σb², −σ1²γx^T/σb², 0, 0; −σ1²γx/σb², σ1²(Σx^{-1} + γxγx^T/σb²), 0, 0; 0, 0, σb²Σx^{-1}, 0; 0, 0, 0, σ2²Σx^{-1}],

where 0 denotes a vector or matrix of zeros with the appropriate dimension, and that

(9) HBA^T(ABA^T)^{-1}ABH = (1/Δ) [0, 0; 0, D],

where Δ = (1−h)(σ1² + σb²βu²) + hσ2² and

(10) D = [σ1⁴(1−h)Σx^{-1}, σ1²σb²βu(1−h)Σx^{-1}, −σ1²σ2²[(1−h)h]^{1/2}Σx^{-1}; σ1²σb²βu(1−h)Σx^{-1}, σb⁴βu²(1−h)Σx^{-1}, −σb²σ2²βu[(1−h)h]^{1/2}Σx^{-1}; −σ1²σ2²[(1−h)h]^{1/2}Σx^{-1}, −σb²σ2²βu[(1−h)h]^{1/2}Σx^{-1}, σ2⁴hΣx^{-1}].

Similarly, if ζˆ0 and ηˆx0 denote the UGMM estimators of ζ and ηx, then

(11) (n1^{1/2}(ζˆ0 − ζ)^T, n2^{1/2}(ηˆx0 − ηx)^T)^T →d N(0, HBH).

We define the asymptotic relative efficiency gain of the CGMM estimator θˆcj, the jth component of θˆc, with respect to the unconstrained GMM estimator θˆ0j, the jth component of θˆ0, to be

Rj = [asymptotic variance of θˆ0j − asymptotic variance of θˆcj] / [asymptotic variance of θˆ0j], j=1,…,1+3q.

From (7) and (11), we derive the Rj's as follows. First, R1=0, i.e., there is no gain in estimating βu. Intuitively, this is because the dataset in study 2 has no information on U. Second, for estimating the q components of βx,

Rj = (1−h)σ1²σb²σ(j−1) / [Δ(σb²σ(j−1) + γx(j−1)²)], j=2,…,q+1,

where σ(t) is the tth diagonal element of the matrix Σx^{-1} and γxt is the tth component of γx. Third, for estimating the q components of γx,

Rj = (1−h)σb²βu²/Δ, j=q+2,…,2q+1.

Note that Δ^{-1}(1−h) is a decreasing function of h. Hence the CGMM estimators of the components of βx and γx become increasingly more efficient as h decreases, i.e., as n2/n1 increases, which means more information can be borrowed from study 2. Finally, for estimating the q components of ηx,

Rj = hσ2²/Δ, j=2q+2,…,3q+1,

which increases as h increases.
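The gain formulas are easy to tabulate. The helper below evaluates them for q = 1; setting every variance and coefficient to one is our assumption (the true values used in the simulations live in the omitted tables), and under it the theoretical gain vectors quoted in Section 2.3 are reproduced.

```python
def asymptotic_gains(h, s1=1.0, s2=1.0, sb=1.0, beta_u=1.0, gamma_x=1.0, sigma_jj=1.0):
    """Asymptotic relative efficiency gains (R for beta_u, beta_x, gamma_x, eta_x)
    with q = 1; s1, s2, sb are the variances sigma_1^2, sigma_2^2, sigma_b^2 and
    sigma_jj is the (single) diagonal element of Sigma_x^{-1}."""
    delta = (1 - h) * (s1 + sb * beta_u ** 2) + h * s2
    r_beta_u = 0.0  # no gain: study 2 carries no information on U
    r_beta_x = (1 - h) * s1 * sb * sigma_jj / (delta * (sb * sigma_jj + gamma_x ** 2))
    r_gamma_x = (1 - h) * sb * beta_u ** 2 / delta
    r_eta_x = h * s2 / delta
    return r_beta_u, r_beta_x, r_gamma_x, r_eta_x
```

For h = 1/2, 1/5, and 4/5 this returns (0, 1/6, 1/3, 1/3), (0, 2/9, 4/9, 1/9), and (0, 1/12, 1/6, 2/3), matching the theoretical gain vectors cited for the three sample-size configurations of the first simulation study.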

2.3. Simulation study

Two simulation studies are carried out to check the empirical performance of the CGMM and UGMM estimators with fixed finite sample sizes. In the first simulation, we consider univariate U and X, i.e., p=q=1. The covariate X is generated from the standard normal distribution. The covariate U and the response Y are then generated according to (2)–(4) with ε1, ε2, and εb independently distributed as standard normal.

Based on 2000 simulations, Table 1 gives the simulation variances of the estimators of the univariate parameters βu, βx, γx, and ηx, for both CGMM and UGMM. All simulation biases are less than 0.006 and thus not reported. True values of the parameters and the different sample sizes are included in Table 1.

Table 1. Simulation variances of CGMM and UGMM estimators (p=q=1).

A few conclusions can be drawn from Table 1.

  1. When n1=n2=100, the simulation relative efficiency gain of CGMM over UGMM is (0.58%, 14.92%, 31.69%, 33.82%) for estimating (βu, βx, γx, ηx). This indicates that there is almost no improvement in estimating βu, but there are substantial gains in estimating the other three parameters, which supports the asymptotic results discussed in Section 2.2. In fact, the vector of theoretical asymptotic relative efficiency gains defined in Section 2.2 is (0, 1/6, 1/3, 1/3), which is very close to the simulation relative gains.

  2. When n1=100 and n2=400, more information from study 2 can be borrowed to estimate the parameters in study 1. The simulation relative efficiency gain vector is (0.63%, 20.91%, 44.99%, 13.76%). We obtain larger gains in estimating βx and γx, but a smaller gain in estimating ηx. The vector of theoretical asymptotic relative efficiency gains defined in Section 2.2 is (0, 2/9, 4/9, 1/9), which is very close to the simulation relative gains.

  3. When n1=400 and n2=100, the simulation relative efficiency gain vector is (0.2%, 7.14%, 15.32%, 68.18%). We obtain smaller gains in estimating βx and γx, but a larger gain in estimating ηx. The vector of theoretical asymptotic relative efficiency gains defined in Section 2.2 is (0, 1/12, 1/6, 2/3), which is very close to the simulation relative gains.

Our second simulation considers a q=2 dimensional X=(X1,X2)^T, while U is still univariate. Data are generated according to (2)–(4) with X being two-dimensional normal with zero marginal means, unit marginal variances, and correlation ρ.

Note that βx=(βx1,βx2)^T, γx=(γx1,γx2)^T, and ηx=(ηx1,ηx2)^T are all two-dimensional. Based on 2000 simulations, Table 2 gives the simulation variances of the estimators of βu, βx1, βx2, γx1, and γx2 for both CGMM and UGMM. Results for the variances of the estimators of ηx1 and ηx2 are omitted. Again, all simulation biases are less than 0.004 and thus not reported. True values of the parameters are included in Table 2. Sample sizes n1=n2=100 and ρ=0, 0.3, and 0.6 are considered.

Table 2. Simulation variances of CGMM and UGMM estimators (p=1, q=2).

Table 2 shows results similar to those in Table 1. In estimating βx, the simulation relative gain of CGMM over UGMM ranges from 10% to 20%, while there is no gain in estimating βu. Increasing the value of ρ, the correlation between the two components of X, increases the relative efficiency gain, but not substantially.

3. Results for three independent studies

The method and results in Section 2 can be extended to various situations where the number of independent studies is more than two and different covariates are observed in different studies. We consider in this section the case of three studies, where the response Y and the covariates U, V, and X are observed as follows:

Study δ=1: observes Y, U, V, X; sample size n1.
Study δ=2: observes Y, U, X; sample size n2.
Study δ=3: observes Y, V, X; sample size n3.

The total sample size from all studies is n=n1+n2+n3.

3.1. CGMM

Similar to (1) and (2)–(4), we assume that

(12) E(Y|U,V,X,δ) = E(Y|U,V,X),
(13) E(V|U,X,δ) = E(V|U,X),
(14) E(U|V,X,δ) = E(U|V,X),

for δ=1,2,3, and that

(15) study 1 (δ=1): Y = βu^T U + βv^T V + βx^T X + ε1,
(16) study 2 (δ=2): Y = ηu^T U + ηx^T X + ε2,
(17) study 3 (δ=3): Y = τv^T V + τx^T X + ε3,
(18) bridge: U = γuv V + γux X + εb and V = γvu U + γvx X − γvu εb, with γuv γvu = I and γuv γvx + γux = 0,

where the p×q matrix γux = (γux1,…,γuxp)^T, the p×l matrix γuv = (γuv1,…,γuvp)^T, the l×q matrix γvx = (γvx1,…,γvxl)^T, and the l×p matrix γvu = (γvu1,…,γvul)^T. Assume that the samples are independent and identically distributed within each study and independent among studies, and that the ε's are independent with mean zero. By assumptions (12), (14), (15), (17), and the expression of U in (18), we have the constraint conditions

(19) βu^T γuv + βv^T = τv^T and βu^T γux + βx^T = τx^T.

By assumptions (12), (13), (15), (16), and the expression of V in (18), we have the constraint conditions

(20) βu^T + βv^T γvu = ηu^T and βv^T γvx + βx^T = ηx^T.

Denote (βu^T, βv^T, βx^T)^T, (ηu^T, ηx^T)^T, and (τv^T, τx^T)^T by β, η, and τ, respectively. Models (15)–(18) assume that the structural parameters β, γvu, γvx, γuv, and γux are the same for all studies, while the distributions of the ε's can vary with studies. We are mainly interested in estimating β in (15). Instead of using data from study 1 only, we try to make use of data from studies 2 and 3 to gain estimation efficiency. Condition (18) is needed for bridging the data among the three studies; without this condition, it is hard to gain any efficiency from the additional data in studies 2 and 3.
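The bridge relations in (18) and the constraint conditions (19)–(20) can be sanity-checked numerically by direct substitution. The parameter values below are our illustrative assumptions (p = l = 1), with γvu and γvx chosen so that γuvγvu = 1 and γuvγvx + γux = 0.

```python
import numpy as np

# Illustrative structural parameters (assumptions, not values from the paper).
beta_u, beta_v, beta_x = 1.0, 0.5, 1.0
g_uv, g_ux = 2.0, 1.0
g_vu, g_vx = 1.0 / g_uv, -g_ux / g_uv   # enforce (18): g_uv*g_vu = 1, g_uv*g_vx + g_ux = 0

rng = np.random.default_rng(0)
n = 1000
x, v = rng.standard_normal(n), rng.standard_normal(n)
eps_b, eps_1 = rng.standard_normal(n), rng.standard_normal(n)
u = g_uv * v + g_ux * x + eps_b                      # bridge equation for U in (18)
y = beta_u * u + beta_v * v + beta_x * x + eps_1     # model (15)

# (19): substituting U into (15) gives the study-3 coefficients tau.
tau_v, tau_x = beta_u * g_uv + beta_v, beta_u * g_ux + beta_x
# (20): substituting V = g_vu*U + g_vx*X - g_vu*eps_b gives the study-2 coefficients eta.
eta_u, eta_x = beta_u + beta_v * g_vu, beta_v * g_vx + beta_x
```

Because the substitutions are exact algebraic identities, both rewritten forms of Y hold observation by observation, not merely in expectation.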

Denote by vec(M) the row vector containing all rows of a matrix M. From (15)–(20), we construct estimating equations E g(Z,θ)=0 and a constraint c(θ)=0, where Z = (Y, U^T, V^T, X^T, δ)^T, θ = (β^T, η^T, τ^T, vec(γuv), vec(γux), vec(γvu), vec(γvx))^T,

c(θ) = (βu^T γuv + βv^T − τv^T, βu^T γux + βx^T − τx^T, βu^T + βv^T γvu − ηu^T, βv^T γvx + βx^T − ηx^T, vec(γuv γvu − I), vec(γuv γvx + γux))^T,

and g(Z,θ) is a column vector with elements

I(δ=1)(n/n1)[βu^T U + βv^T V + βx^T X − Y](U^T, V^T, X^T)^T,
I(δ=2)(n/n2)[ηu^T U + ηx^T X − Y](U^T, X^T)^T,
I(δ=3)(n/n3)[τv^T V + τx^T X − Y](V^T, X^T)^T,
I(δ=1)(n/n1)((γuv1^T V + γux1^T X − U1)(V^T, X^T), …, (γuvp^T V + γuxp^T X − Up)(V^T, X^T))^T,
I(δ=1)(n/n1)((γvu1^T U + γvx1^T X − V1)(U^T, X^T), …, (γvul^T U + γvxl^T X − Vl)(U^T, X^T))^T.

Let zi = (yi, ui^T, vi^T, xi^T, δi)^T, i=1,…,n, be independent samples, where (yi, ui^T, vi^T, xi^T, δi)^T with δi=k is distributed as (Y, U^T, V^T, X^T, δ)^T with δ=k, k=1,2,3. Define g¯(θ) = n^{-1} ∑_{i=1}^n g(zi,θ). The two-step CGMM is applied as follows.

  1. Compute θ~c = argmin g¯(θ)^T g¯(θ) over θ subject to the constraint c(θ)=0.

  2. Compute the weight matrix Wˆ = n[∑_{i=1}^n g(zi,θ~c) g(zi,θ~c)^T]^{-1}.

  3. Compute the two-step CGMM estimator θˆc = argmin g¯(θ)^T Wˆ g¯(θ) over θ subject to the constraint c(θ)=0.

Asymptotic properties of the CGMM estimator θˆc can be established in the same way as Theorem 2.1.

3.2. Simulation study

In this section, we consider univariate U and V, i.e., p=l=1. Then βu, βv, ηu, τv, γvu, γuv, and εb reduce to scalars. Two simulation studies are carried out to check the empirical performance of the CGMM and UGMM estimators with fixed finite sample sizes. In the first simulation, we also consider a univariate X, i.e., q=1, so that βx, ηx, τx, γux, and γvx reduce to scalars as well. The covariates X and V are independently generated from the standard normal distribution. The covariate U and the response Y are then generated according to (15)–(18) with ε1, ε2, ε3, and εb independently distributed as standard normal.

Based on 2000 simulations, Table 3 gives the simulation variances of the estimators of the parameters βu, βv, and βx for both CGMM and UGMM. Results for the variances of the estimators of the other parameters are omitted. All simulation biases are less than 0.007 and thus not reported. True values of the parameters and the different sample sizes are included in Table 3.

Table 3. Simulation variances of CGMM and UGMM estimators (p=l=q=1).

It can be seen that the messages provided by Table 3 are very similar to those from Table 1. When n1=n2=n3=100, the simulation relative efficiency gain of CGMM over UGMM is (83.04%, 62.05%, 54.97%) for estimating (βu, βv, βx). When n1=100, n2=100, and n3=400, more information from study 3 can be borrowed to estimate the parameters in study 1, and the simulation relative efficiency gain vector is (85.11%, 65.35%, 63.08%) for estimating (βu, βv, βx). When n1=100, n2=400, and n3=100, more information from study 2 can be borrowed to estimate the parameters in study 1, and the simulation relative efficiency gain vector is (88.46%, 68.25%, 59.15%) for estimating (βu, βv, βx).

Our second simulation considers a q=2 dimensional X=(X1,X2)^T, while U and V are still univariate. Data are generated according to (15)–(18) with X being two-dimensional normal with zero marginal means, unit marginal variances, and correlation ρ.

Based on 2000 simulations, Table 4 gives the simulation variances of the estimators of βu, βv, βx1, and βx2 for both CGMM and UGMM. Results for the variances of the estimators of the other parameters are omitted. Again, all simulation biases are less than 0.004 and thus not reported. True values of the parameters are included in Table 4. Sample sizes n1=n2=n3=100 and ρ=0, 0.3, and 0.6 are considered. The results show substantial improvement of CGMM over UGMM, and the effect of ρ is not substantial.

Table 4. Simulation variances of CGMM and UGMM estimators (p=l=1, q=2).

Different from Tables 1 and 2, the simulation relative efficiency gain in estimating βu is not zero in Tables 3 and 4. This is because the additional independent study with δ=2 provides extra information for the CGMM in estimating βu. The same conclusion can be made for estimating βv.

4. Discussion

We have proposed a CGMM estimator that combines information from datasets in different studies. An asymptotic theorem is established in the case of two studies to show when the CGMM estimator is more efficient than the UGMM estimator, which uses data from one study only. Our simulation studies show that the CGMM estimators can achieve substantial efficiency gains over the UGMM estimators in cases with two or three studies.

Comparing the results for three studies with those for two studies, the conclusions are similar, but the CGMM procedure is more complicated with three studies; this remains true as more studies are encountered. The improvement of the CGMM over the UGMM (which essentially uses within-study data only) increases with the number of studies, since more datasets are involved. However, the derivation of the CGMM may become messy when there are many studies and datasets.

We have considered linear models for both the observation and bridge equations. This is not necessary and can be extended. For example, assumptions (3) and (4) may be replaced by a more general assumption on E(U|X), either parametric or nonparametric. More research is needed to extend the framework and to explore methods that can handle more general model assumptions.

5. Technical details

Proof of Theorem 2.1

Note that g¯(θ0) is a sample average of i.i.d. random vectors with mean zero and finite covariance matrix Ω. Then the Lindeberg–Lévy central limit theorem implies

(21) Tn = Ω^{-1/2} n^{1/2} g¯(θ0) →d N(0, I_{1+3q}).

Define a Lagrangian for θˆc: Ln(θ,λ) = Qn(θ) − c(θ)^T λ, where Qn(θ) = g¯(θ)^T Wˆ g¯(θ) and λ is a column vector of undetermined Lagrange multipliers, which are non-zero when the constraints are binding. The first-order conditions for the solution of the constrained optimisation problem are

(22) 0 = n^{1/2} [∂Qn(θ)/∂θ]|θ=θˆc − [∂c(θ)^T/∂θ]|θ=θˆc n^{1/2} λ and 0 = n^{1/2} c(θˆc).

Let Gn(θ) = n^{-1} ∑_{i=1}^n ∂g(zi,θ)/∂θ^T. Since θ~c is a consistent estimator of θ0, Gn(θ~c) − G = op(1) and Wˆ − Ω^{-1} = op(1), where op(1) denotes a sequence of random quantities converging to zero in probability. Using these results and Taylor expansions, we have

n^{1/2} g¯(θˆc) = n^{1/2} g¯(θ0) + Gn(θˆc) n^{1/2}(θˆc − θ0) + op(1) = Ω^{1/2} Tn + G n^{1/2}(θˆc − θ0) + op(1),
n^{1/2} c(θˆc) = n^{1/2} c(θ0) + A n^{1/2}(θˆc − θ0) + op(1) = A n^{1/2}(θˆc − θ0) + op(1),

and, absorbing a constant factor of 2 into λ,

n^{1/2} [∂Qn(θ)/∂θ]|θ=θˆc = G^T Ω^{-1} n^{1/2} g¯(θˆc) + op(1).

Substituting these into the first-order conditions in (22) yields

(23) (n^{1/2}(θˆc − θ0)^T, n^{1/2}λ^T)^T = [G^TΩ^{-1}G, −A^T; A, 0]^{-1} (−G^TΩ^{-1/2}Tn; 0) + op(1).

Applying the formula for the inverse of a partitioned matrix (Lu & Shiou, 2002) to (23), together with B = (G^TΩ^{-1}G)^{-1}, yields

(24) n^{1/2}(θˆc − θ0) = −[B − BA^T(ABA^T)^{-1}AB] G^TΩ^{-1/2}Tn + op(1) and n^{1/2}λ = (ABA^T)^{-1}AB G^TΩ^{-1/2}Tn + op(1).

Note that B = (G^TΩ^{-1}G)^{-1} and B = B^T yield

{[B − BA^T(ABA^T)^{-1}AB] G^TΩ^{-1/2}} {[B − BA^T(ABA^T)^{-1}AB] G^TΩ^{-1/2}}^T
= [B − BA^T(ABA^T)^{-1}AB] B^{-1} [B − BA^T(ABA^T)^{-1}AB]^T
= [I − BA^T(ABA^T)^{-1}A][B − BA^T(ABA^T)^{-1}AB]
= B − 2BA^T(ABA^T)^{-1}AB + BA^T(ABA^T)^{-1}ABA^T(ABA^T)^{-1}AB
= B − BA^T(ABA^T)^{-1}AB.

Then result (5) follows from (21) and (24).
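The matrix identity used at the end of the proof, namely that C = B − BAᵀ(ABAᵀ)⁻¹AB satisfies C B⁻¹ Cᵀ = C, can be checked numerically for a generic symmetric positive definite B and a full-rank A; the matrices below are drawn at random purely for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
d, r = 7, 3                                  # dim(theta) and number of constraints
M = rng.standard_normal((d, d))
B = M @ M.T + d * np.eye(d)                  # a symmetric positive definite "B"
A = rng.standard_normal((r, d))              # a full-rank constraint Jacobian "A"

P = B @ A.T @ np.linalg.inv(A @ B @ A.T) @ A @ B   # the efficiency-gain term
C = B - P                                          # the claimed asymptotic covariance in (5)
```

The second check below reflects that the constraint c(θˆc)=0 holds exactly, so the linear combinations Aθˆc have zero asymptotic variance.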

Proofs of (7)–(10). Let hn = n1/n and let Hn be the diagonal matrix whose first 1+2q diagonal elements are hn^{1/2} and whose last q diagonal elements are (1−hn)^{1/2}. Then Hn→H, which together with (5) implies result (7).

To complete the proof, we now give the derivations of (8)–(10). Write

Ω = [Ω11, Ω12, Ω13; Ω12^T, Ω22, Ω23; Ω13^T, Ω23^T, Ω33] and G = [G11, G12, G13; G12^T, G22, G23; G13^T, G23^T, G33],

where Ω11 is (1+q)×(1+q), Ω22 and Ω33 are both q×q, and the dimension of Gij is the same as that of Ωij. By hn→h and the definitions of Ω and g,

Ω11 = lim_{n→∞} E[(n²/n1²) I(δ=1) (βu U + βx^T X − Y)² (U, X^T)^T (U, X^T)]
= h^{-2} E(ε1²) E[I(δ=1)] E[(U, X^T)^T (U, X^T)]
= (σ1²/h) [E(U²), E(U X^T); E(U X), E(X X^T)]
= (σ1²/h) K, with K = [γx^T Σx γx + σb², γx^T Σx; Σx γx, Σx],

where the second equality follows from the assumption that δ and (U,X) are independent and that ε1 is independent of (δ,U,X), the third equality follows from E[I(δ=1)]→h and var(ε1)=σ1², and the last equality follows from (4) and the assumption that εb (with mean zero) is independent of X, so that E(U X^T) = E((γx^T X + εb) X^T) = γx^T Σx and E(U²) = E((γx^T X + εb)²) = γx^T Σx γx + σb². Similarly, by E[I(δ=1)]→h and the independence of δ and (U,X), we have

Ω22 = lim_{n→∞} E[(n²/n1²) I(δ=1) (γx^T X − U)² X X^T] = (σb²/h) Σx,
Ω33 = lim_{n→∞} E[(n²/n2²) I(δ=2) (ηx^T X − Y)² X X^T] = (σ2²/(1−h)) Σx,
Ω12 = lim_{n→∞} E[(n²/n1²) I(δ=1) (βu U + βx^T X − Y)(γx^T X − U)(U, X^T)^T X^T] = h^{-1} E(ε1) E[εb (U, X^T)^T X^T] = 0,

where the last equality is guaranteed by E(ε1)=0. Since I(δ=1) I(δ=2)=0, Ω13 and Ω23 are 0. Thus

Ω^{-1} = diag( (h/σ1²) K^{-1}, (h/σb²) Σx^{-1}, ((1−h)/σ2²) Σx^{-1} ), with K^{-1} = (1/σb²) [1, −γx^T; −γx, σb² Σx^{-1} + γx γx^T].

By the definitions of G and g, the partial derivatives corresponding to the off-diagonal blocks of G are zero, i.e., G12, G13, and G23 are 0. By E[I(δ=1)]→h, hn→h, and the independence of δ and (U,X), we have

G11 = lim_{n→∞} E[(n/n1) I(δ=1) (U, X^T)^T (U, X^T)] = K, G22 = lim_{n→∞} E[(n/n1) I(δ=1) X X^T] = Σx, G33 = lim_{n→∞} E[(n/n2) I(δ=2) X X^T] = Σx.

Combining these results, we obtain

(25) B = (G^T Ω^{-1} G)^{-1} = diag( (σ1²/h) K^{-1}, (σb²/h) Σx^{-1}, (σ2²/(1−h)) Σx^{-1} ).

By (25) and the definition of H, we have the explicit form of HBH in (8).
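The block computation of B in (25) can be verified numerically; the values of h, the variances, Σx, and γx below are arbitrary illustrative choices (q = 2).

```python
import numpy as np
from scipy.linalg import block_diag

# Illustrative values (assumptions): q = 2, h = 0.4, variances sigma_1^2, etc.
h, s1, s2, sb = 0.4, 1.0, 1.5, 0.8
Sx = np.array([[1.0, 0.3], [0.3, 1.0]])          # Sigma_x
gx = np.array([0.7, -0.2])                       # gamma_x

# K = E[(U, X^T)^T (U, X^T)] under the bridge model (4).
K = np.block([[np.atleast_2d(gx @ Sx @ gx + sb), (gx @ Sx)[None, :]],
              [(Sx @ gx)[:, None], Sx]])

Omega = block_diag(s1 / h * K, sb / h * Sx, s2 / (1 - h) * Sx)
G = block_diag(K, Sx, Sx)
B = np.linalg.inv(G.T @ np.linalg.inv(Omega) @ G)

# Closed form (25), and the closed form of K^{-1} used for (8).
B_closed = block_diag(s1 / h * np.linalg.inv(K),
                      sb / h * np.linalg.inv(Sx),
                      s2 / (1 - h) * np.linalg.inv(Sx))
K_inv_closed = (1 / sb) * np.block(
    [[np.ones((1, 1)), -gx[None, :]],
     [-gx[:, None], sb * np.linalg.inv(Sx) + np.outer(gx, gx)]])
```

Because G and Ω share the same block-diagonal structure, B = G⁻¹ΩG⁻¹ block by block, which is exactly what (25) states.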

Note that A = (−γx, −Iq, −βu Iq, Iq). The explicit forms of HBA^T(ABA^T)^{-1}ABH in (9)–(10) follow from (25), the definition of H, and

HBA^T = (0, −h^{-1/2} σ1² Σx^{-1}, −h^{-1/2} σb² βu Σx^{-1}, (1−h)^{-1/2} σ2² Σx^{-1})^T,
ABA^T = [h^{-1} σ1² + h^{-1} σb² βu² + (1−h)^{-1} σ2²] Σx^{-1}.

Disclosure statement

No potential conflict of interest was reported by the authors.

Additional information

Funding

This research was supported by the National Natural Science Foundation of China [grant 11831008] and the U.S. National Science Foundation, Division of Mathematical Sciences [grant DMS-1612873].

Notes on contributors

Menghao Xu

Menghao Xu is a doctoral candidate in the College of Statistics, East China Normal University. His main research interests are variable selection, missing data, and survival analysis.

Jun Shao

Jun Shao is a professor in the Department of Statistics, University of Wisconsin-Madison, and in the College of Statistics, East China Normal University. His research covers a wide range of fields, such as the jackknife, bootstrap, and other resampling methods; variable selection and inference with high-dimensional data; sample surveys (variance estimation, imputation for nonrespondents); missing data (nonignorable missingness, dropout, semi-parametric methods); longitudinal data analysis with missing data and/or measurement error; and medical statistics (clinical trials, personalised medicine, bioequivalence). He is the author of Mathematical Statistics, a widely used graduate textbook covering topics in statistical theory essential for graduate students preparing for a Ph.D. degree in statistics.

References

  • Breslow, N. E., & Holubkov, R. (1997). Maximum likelihood estimation of logistic regression parameters under two-phase, outcome-dependent sampling. Journal of the Royal Statistical Society, Series B, 59(2), 447–461. doi: 10.1111/1467-9868.00078
  • Chatterjee, N., Chen, Y.-H., Maas, P., & Carroll, R. J. (2016). Constrained maximum likelihood estimation for model calibration using summary-level information from external big data sources. Journal of the American Statistical Association, 111, 107–117. doi: 10.1080/01621459.2015.1123157
  • Chen, Y.-H., & Chen, H. (2000). A unified approach to regression analysis under double sampling design. Journal of the Royal Statistical Society, Series B, 62, 449–460. doi: 10.1111/1467-9868.00243
  • Deville, J. C., & Sarndal, C. E. (1992). Calibration estimators in survey sampling. Journal of the American Statistical Association, 87, 376–382. doi: 10.1080/01621459.1992.10475217
  • Engle, R. F., & McFadden, D. L. (1994). Handbook of econometrics (Vol. 4). Amsterdam: Elsevier Science, North Holland.
  • Hansen, L. P. (1982). Large sample properties of generalized method of moments estimators. Econometrica, 50, 1029–1054. doi: 10.2307/1912775
  • Hartung, J., Knapp, G., & Sinha, K. B. (2008). Statistical meta-analysis with applications. New York, NY: Wiley.
  • Higgins, J. P. T., & Thompson, S. G. (2002). Quantifying heterogeneity in a meta-analysis. Statistics in Medicine, 21, 1539–1558. doi: 10.1002/sim.1186
  • Higgins, J. P. T., Thompson, S. G., Deeks, J. J., & Altman, D. G. (2003). Measuring inconsistency in meta-analyses. British Medical Journal, 327, 557–560. doi: 10.1136/bmj.327.7414.557
  • Lawless, J. F., Wild, C. J., & Kalbfleisch, J. D. (1999). Semiparametric methods for response-selective and missing data problems in regression. Journal of the Royal Statistical Society, Series B, 61, 413–438. doi: 10.1111/1467-9868.00185
  • Lu, T. T., & Shiou, S. H. (2002). Inverses of 2×2 block matrices. Computers and Mathematics with Applications, 43, 119–129. doi: 10.1016/S0898-1221(01)00278-4
  • Lumley, T., Shaw, P. A., & Dai, J. Y. (2011). Connections between survey calibration estimators and semiparametric models for incomplete data. International Statistical Review, 79, 200–220. doi: 10.1111/j.1751-5823.2011.00138.x
  • Qin, J., Zhang, H., Li, P., Albanes, D., & Yu, K. (2015). Using covariate-specific disease prevalence information to increase the power of case- control studies. Biometrika, 102, 169–180. doi: 10.1093/biomet/asu048
  • Rao, J. N. K., & Molina, I. (2015). Small area estimation. New York, NY: Wiley.
  • Robins, J. M., Rotnitzky, A., & Zhao, L. P. (1994). Estimation of regression coefficients when some regressors are not always observed. Journal of the American Statistical Association, 89, 846–866. doi: 10.1080/01621459.1994.10476818
  • Schmidt, F. L., & Hunter, J. E. (2014). Methods of meta-analysis: Correcting error and bias in research findings. Newbury Park, CA: Sage Publications.
  • Scott, A. J., & Wild, C. J.. (1997). Fitting regression models to case-control data by maximum likelihood. Biometrika, 84, 57–71. doi: 10.1093/biomet/84.1.57
  • DerSimonian, R., & Laird, N. (1986). Meta-analysis in clinical trials. Controlled Clinical Trials, 7, 177–188. doi: 10.1016/0197-2456(86)90046-2
  • Slud, E., & DeMissie, D. (2011). Validity of regression meta-analyses versus pooled analyses of mixed linear models. Mathematics in Engineering, Science and Aerospace, 2, 251–265.
  • Wu, C. (2003). Optimal calibration estimators in survey sampling. Biometrika, 90, 937–951. doi: 10.1093/biomet/90.4.937
  • Wu, C., & Sitter, R. R. (2001). A model-calibration approach to using complete auxiliary information from survey data. Journal of the American Statistical Association, 96, 185–193. doi: 10.1198/016214501750333054
