MLE with datasets from populations having shared parameters

Pages 213-222 | Received 04 Aug 2022, Accepted 01 Feb 2023, Published online: 04 Mar 2023

Abstract

We consider maximum likelihood estimation with two or more datasets sampled from different populations with shared parameters. Although more datasets with shared parameters can increase statistical accuracy, heterogeneity among the populations must be handled properly to ensure correct estimation and inference; this paper shows how. Asymptotic distributions of maximum likelihood estimators are derived both in regular cases, where the standard regularity conditions are satisfied, and in some non-regular situations. A bootstrap variance estimator for assessing the performance of estimators and/or making large-sample inference is also introduced and evaluated in a simulation study.

1. Introduction

With advanced technologies in data collection and storage, in modern statistical analyses we often have multiple datasets as independent samples from different populations having shared parameters. Typically, one of these multiple datasets is primary, with carefully collected data from a population of interest. The other datasets are from external sources, such as data from other studies, administrative records, and publicly available information from the internet.

On one hand, the fact that populations share common parameters provides a great opportunity for increasing statistical accuracy by utilizing multiple datasets instead of a single dataset. On the other hand, because of differences in data collection, study purpose, and/or time of investigation, heterogeneity often exists among populations, so we cannot simply combine all datasets into a single large dataset for analysis, but must develop or modify statistical methodology to correctly utilize multiple datasets. The research on analysis with multiple datasets fits into a general framework of data integration (Kim et al., 2021; Lohr & Raghunathan, 2017; Merkouris, 2004; Rao, 2021; Yang & Kim, 2020; Zhang et al., 2017; Zieschang, 1990).

In this article, we study maximum likelihood estimation (MLE) for independent datasets with parametric populations sharing some (not necessarily all) parameters. For simplicity of presentation, we focus on the case of two independent datasets. The main idea and result can be extended to multiple datasets. Our research can also be extended to semi-parametric estimation, such as empirical likelihood or Cox regression for survival data.

Throughout, we consider two independent random samples. One random sample of size n, resulting in a dataset {X_1, …, X_n}, is sampled from a parametric population with probability density f(x,θ,ϕ) (for either continuous or discrete x), where f is a known function and θ and ϕ are unknown parameter vectors. Another random sample of size m, resulting in a dataset {Y_1, …, Y_m}, is sampled from a population with probability density g(y,θ,φ), where g is a known function and θ and φ are unknown parameter vectors. Note that X_i and Y_j can be vectors. The shared parameter θ can be either the main parameter vector of interest or a nuisance parameter vector, and ϕ and φ are the other parameter vectors in the two populations.

Let ϑ denote the vector with θ, ϕ, and φ as sub-vectors. In Section 2, we derive the maximum likelihood estimator (MLE) of ϑ based on two datasets, which is expected to be asymptotically more efficient than each MLE based on a single dataset, since more data are used for estimating the shared parameter θ, a component of ϑ. The asymptotic normality of the MLE of ϑ is established when the densities f and g satisfy regularity conditions that are typically assumed for MLE. Applications to location-scale problems are discussed in Section 3, where we also present a situation in which f or g does not satisfy the regularity conditions. Section 4 contains an example in which the regularity conditions do not hold and the MLE is not asymptotically normal. The common mean of a discrete data problem is considered in Section 5. Section 6 is devoted to the scenario where an additional uncertainty exists in the second population density g. To handle situations where asymptotic normality of the MLE of ϑ is not available, we introduce a bootstrap variance estimator in Section 7 and provide some simulation results to examine finite sample performance.

2. MLEs with two datasets

The following are regularity conditions for the probability density p(x,ϑ) (with a fixed ϑ) of a continuous or discrete random variable/vector X, typically assumed for MLEs in parametric populations (Shao, 2003).

(R1)

For every x in the range of X, p(x,ϑ) is twice continuously differentiable with respect to ϑ in an open subset of a Euclidean space of fixed dimension.

(R2)

$$\frac{\partial}{\partial\vartheta}\int p(x,\vartheta)\,dx=\int\frac{\partial p(x,\vartheta)}{\partial\vartheta}\,dx\quad\text{and}\quad\frac{\partial^{2}}{\partial\vartheta\,\partial\vartheta^{\top}}\int p(x,\vartheta)\,dx=\int\frac{\partial^{2}p(x,\vartheta)}{\partial\vartheta\,\partial\vartheta^{\top}}\,dx,$$
where $C^{\top}$ denotes the transpose of a vector or matrix C, and the integral should be replaced by an appropriate summation when X is discrete.

(R3)

The Fisher information matrix $E\left\{-\frac{\partial^{2}}{\partial\vartheta\,\partial\vartheta^{\top}}\log p(X,\vartheta)\right\}$ exists and is positive definite.

(R4)

For any given ϑ, there exist a positive number $c_{\vartheta}$ and a positive function $h_{\vartheta}$ such that $E\{h_{\vartheta}(X)\}<\infty$ and
$$\sup_{\gamma:\,\|\gamma-\vartheta\|<c_{\vartheta}}\left\|\frac{\partial^{2}\log p(x,\gamma)}{\partial\gamma\,\partial\gamma^{\top}}\right\|\le h_{\vartheta}(x)$$
for all x in the range of X, where $\|A\|=\{\mathrm{trace}(A^{\top}A)\}^{1/2}$ for any matrix A.

In this section, we assume that both f and g satisfy the regularity conditions (R1)–(R4). When some regularity conditions are not satisfied, the problem has to be handled case by case. See, for example, the problem of normal and Laplace distributions in Section 3.2 and the problem of two truncation distributions in Section 4.

The log likelihood function of ϑ is
$$\ell(\vartheta)=\sum_{i=1}^{n}\log f(X_i,\theta,\phi)+\sum_{j=1}^{m}\log g(Y_j,\theta,\varphi)$$
and the score function is
$$s(\vartheta)=\frac{\partial\ell(\vartheta)}{\partial\vartheta}=\begin{pmatrix}\displaystyle\sum_{i=1}^{n}\frac{\partial\log f(X_i,\theta,\phi)}{\partial\theta}+\sum_{j=1}^{m}\frac{\partial\log g(Y_j,\theta,\varphi)}{\partial\theta}\\[1.5ex]\displaystyle\sum_{i=1}^{n}\frac{\partial\log f(X_i,\theta,\phi)}{\partial\phi}\\[1.5ex]\displaystyle\sum_{j=1}^{m}\frac{\partial\log g(Y_j,\theta,\varphi)}{\partial\varphi}\end{pmatrix}.$$
If ϑ̂ is a solution to the score equation s(ϑ) = 0, then we call ϑ̂ an MLE of ϑ, although traditionally an MLE is defined as a maximizer of ℓ(ϑ) over the range of ϑ, and a ϑ̂ satisfying s(ϑ̂) = 0 may not be a maximizer.

A solution to the score equation often does not have an explicit form, even when the MLE from each single population does.
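
When no explicit solution exists, the score equation can be solved numerically. Below is a minimal sketch (ours, not from the paper) of one way to do this in Python, assuming the two log densities are available as vectorized functions `logf` and `logg` (hypothetical names) of the data and the stacked parameter vector ϑ:

```python
# Minimal sketch: joint MLE by maximizing the combined log likelihood.
# `logf` and `logg` are hypothetical user-supplied vectorized log densities;
# `vartheta0` stacks starting values for (theta, phi, varphi) in one vector.
import numpy as np
from scipy.optimize import minimize

def joint_mle(x, y, logf, logg, vartheta0):
    """Maximize sum_i logf(x_i, vartheta) + sum_j logg(y_j, vartheta)."""
    def neg_loglik(vartheta):
        return -(np.sum(logf(x, vartheta)) + np.sum(logg(y, vartheta)))
    # BFGS is appropriate when the log likelihood is smooth; non-regular
    # cases (Sections 3.2 and 4) may need derivative-free or custom methods.
    res = minimize(neg_loglik, vartheta0, method="BFGS")
    return res.x
```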

Under regularity conditions (R1)–(R4), E{s(ϑ)} = 0 and Var{s(ϑ)} = −E{∂s(ϑ)/∂ϑ⊤} = nI(ϑ), where nI(ϑ) is the Fisher information matrix based on the two samples. Let
$$I_{\theta\theta}(\theta,\phi)=-E\left\{\frac{\partial^{2}\log f(X_i,\theta,\phi)}{\partial\theta\,\partial\theta^{\top}}\right\},\qquad I_{\theta\theta}(\theta,\varphi)=-E\left\{\frac{\partial^{2}\log g(Y_j,\theta,\varphi)}{\partial\theta\,\partial\theta^{\top}}\right\},$$
$$I_{\theta\phi}(\theta,\phi)=-E\left\{\frac{\partial^{2}\log f(X_i,\theta,\phi)}{\partial\theta\,\partial\phi^{\top}}\right\},\qquad I_{\theta\varphi}(\theta,\varphi)=-E\left\{\frac{\partial^{2}\log g(Y_j,\theta,\varphi)}{\partial\theta\,\partial\varphi^{\top}}\right\},$$
$$I_{\phi\phi}(\theta,\phi)=-E\left\{\frac{\partial^{2}\log f(X_i,\theta,\phi)}{\partial\phi\,\partial\phi^{\top}}\right\},\qquad I_{\varphi\varphi}(\theta,\varphi)=-E\left\{\frac{\partial^{2}\log g(Y_j,\theta,\varphi)}{\partial\varphi\,\partial\varphi^{\top}}\right\}.$$
Then
$$I(\vartheta)=\begin{pmatrix}I_{\theta\theta}(\theta,\phi)+aI_{\theta\theta}(\theta,\varphi)&I_{\theta\phi}(\theta,\phi)&aI_{\theta\varphi}(\theta,\varphi)\\ I_{\theta\phi}(\theta,\phi)^{\top}&I_{\phi\phi}(\theta,\phi)&0\\ aI_{\theta\varphi}(\theta,\varphi)^{\top}&0&aI_{\varphi\varphi}(\theta,\varphi)\end{pmatrix}$$
is positive definite, where a = m/n and, without loss of generality, we assume that m = an for a fixed a > 0. It can be seen that I(ϑ) is increasing in a, in the sense that A ≥ B for two non-negative definite matrices A and B if and only if A − B is non-negative definite.

Using the standard argument in asymptotic theory, e.g., Theorem 4.17 in Shao (2003), we obtain the following result.

Theorem 2.1

Assume (R1)–(R4) and that m = an with a remaining fixed as n increases. Then, with probability tending to 1 as n → ∞, there exists ϑ̂ (depending on n) such that P{s(ϑ̂) = 0} → 1 and
$$\sqrt{n}\,(\hat\vartheta-\vartheta)\to_{d}N\big(0,\{I(\vartheta)\}^{-1}\big),\tag{1}$$
where →_d denotes convergence in distribution and N(C, D) is the normal distribution with mean vector C and covariance matrix D.

The asymptotic result (1) enables us to assess the performance of ϑ̂ and to carry out large-sample statistical inference on the parameter ϑ or any of its components θ, ϕ, and φ. When some of the regularity conditions (R1)–(R4) are not satisfied, however, we may apply the bootstrap method (see Sections 3.2 and 7 for the normal and Laplace problem) or directly derive the asymptotic distribution of ϑ̂ (see Section 4 for the problem of two truncation distributions).

3. Application to location-scale problems

An application of our general result in Section 2 is to the case where
$$f(x,\theta,\phi)=\frac{1}{\sigma}\,f\!\left(\frac{x-\mu}{\sigma}\right)\quad\text{and}\quad g(y,\theta,\varphi)=\frac{1}{\tau}\,g\!\left(\frac{y-\nu}{\tau}\right)$$
for two continuous probability density functions f and g on the real line, i.e., both populations are in location-scale families. We have several scenarios.

  1. Two location-scale families sharing the same location and scale parameters: μ=ν, σ=τ, θ=(μ,σ), and both ϕ and φ are constants.

  2. Two location-scale families sharing the same location parameter but having different scale parameters: μ=ν, θ=μ, ϕ=σ, and φ=τ.

  3. Two location-scale families sharing the same scale parameter but having different location parameters: σ=τ, θ=σ, ϕ=μ, and φ=ν.

In any location-scale problem, it is often true that I_θϕ(θ,ϕ) = 0 and I_θφ(θ,φ) = 0 and, hence, the inverse of I(ϑ) can be easily obtained. For example, if both f and g are continuously differentiable and symmetric about 0, then it follows from Example 3.9 in Shao (2003) that both I_θϕ(θ,ϕ) and I_θφ(θ,φ) vanish.

In the following, we consider a special case in detail.

3.1. Normal and Laplace densities with a single scale parameter

Suppose that
$$f(x,\theta)=\frac{1}{\sqrt{2\pi}\,\theta}\,e^{-x^{2}/(2\theta^{2})},\quad x\in(-\infty,\infty),$$
which is the normal distribution N(0, θ²), and that
$$g(y,\theta)=\frac{1}{2\theta}\,e^{-|y|/\theta},\quad y\in(-\infty,\infty),$$
which is the Laplace distribution (also called the double exponential distribution) with mean zero and standard deviation √2 θ. The two densities share the common scale parameter θ > 0.

The MLEs of θ based on the data from f and g, respectively, are
$$\hat\theta_{N}=\sqrt{\frac{1}{n}\sum_{i=1}^{n}X_i^{2}}\quad\text{and}\quad\hat\theta_{E}=\frac{1}{m}\sum_{j=1}^{m}|Y_j|.$$
In this particular case, we can obtain an explicit form of the MLE θ̂ of θ based on all data from the two samples. The log likelihood based on the two samples is
$$\ell(\theta)=-\sum_{i=1}^{n}\frac{X_i^{2}}{2\theta^{2}}-\sum_{j=1}^{m}\frac{|Y_j|}{\theta}-\log\{(2\pi)^{n/2}2^{m}\theta^{n+m}\}.$$
The score function is
$$s(\theta)=\frac{1}{\theta^{3}}\sum_{i=1}^{n}X_i^{2}+\frac{1}{\theta^{2}}\sum_{j=1}^{m}|Y_j|-\frac{n+m}{\theta}.$$
Setting s(θ) = 0 and using the forms of the MLEs from each sample, we obtain
$$\theta^{2}-\left(\frac{m\hat\theta_{E}}{n+m}\right)\theta-\left(\frac{n\hat\theta_{N}^{2}}{n+m}\right)=0.$$
Since θ > 0 and only one root is positive, the MLE of θ is
$$\hat\theta=\frac{1}{2}\left\{\frac{a\hat\theta_{E}}{a+1}+\sqrt{\left(\frac{a\hat\theta_{E}}{a+1}\right)^{2}+\frac{4\hat\theta_{N}^{2}}{a+1}}\right\},\quad\text{where }a=m/n.\tag{2}$$
Note that θ̂ is a nonlinear function of θ̂_N and θ̂_E. In general, the MLE of the shared parameter based on two datasets is not a simple function of the separate MLEs based on each single dataset.
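
As a concrete illustration, formula (2) translates directly into code; the following sketch (ours, not the authors') computes θ̂ from the two samples:

```python
import numpy as np

def mle_shared_scale(x, y):
    """MLE of the shared scale theta from N(0, theta^2) data x and
    Laplace(0, theta) data y, via the closed form (2)."""
    n, m = len(x), len(y)
    a = m / n
    theta_n = np.sqrt(np.mean(x**2))   # normal-sample MLE
    theta_e = np.mean(np.abs(y))       # Laplace-sample MLE
    b = a * theta_e / (a + 1)
    return 0.5 * (b + np.sqrt(b**2 + 4 * theta_n**2 / (a + 1)))
```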

To derive the asymptotic distribution of θ̂, we can use the general result (1), because regularity conditions (R1)–(R4) are satisfied for f and g. Since θ̂ has an explicit form, we can also derive it directly. Because the X_i's and Y_j's are independent and a = m/n,
$$\sqrt{n}\begin{pmatrix}\hat\theta_{N}-\theta\\ \sqrt{a}\,\hat\theta_{E}-\sqrt{a}\,\theta\end{pmatrix}\to_{d}N\!\left(0,\begin{pmatrix}\theta^{2}/2&0\\ 0&\theta^{2}\end{pmatrix}\right).$$
Define
$$g(t,s)=\frac{1}{2}\left\{\frac{\sqrt{a}\,s}{a+1}+\sqrt{\frac{as^{2}}{(a+1)^{2}}+\frac{4t^{2}}{a+1}}\right\}.$$
Then g(θ̂_N, √a θ̂_E) = θ̂ and g(θ, √a θ) = θ. Hence, by the delta method, e.g., Theorem 1.12 in Shao (2003),
$$\sqrt{n}\,(\hat\theta-\theta)\to_{d}N\!\left(0,\ \nabla g^{\top}\begin{pmatrix}\theta^{2}/2&0\\ 0&\theta^{2}\end{pmatrix}\nabla g\right),$$
where ∇g is the gradient of g evaluated at (t, s) = (θ, √a θ):
$$\frac{\partial g}{\partial t}=\frac{2t}{a+1}\Bigg/\sqrt{\frac{as^{2}}{(a+1)^{2}}+\frac{4t^{2}}{a+1}},\qquad\frac{\partial g}{\partial s}=\frac{1}{2}\left\{\frac{\sqrt{a}}{a+1}+\frac{as}{(a+1)^{2}}\Bigg/\sqrt{\frac{as^{2}}{(a+1)^{2}}+\frac{4t^{2}}{a+1}}\right\},$$
so that
$$\nabla g=\left(\frac{2}{a+2},\ \frac{\sqrt{a}}{a+2}\right)^{\top}.$$
This leads to the following result.

Corollary 3.1

Assume that m = an with a remaining fixed as n increases. Then, as n → ∞,
$$\sqrt{n}\,(\hat\theta-\theta)\to_{d}N\!\left(0,\ \frac{\theta^{2}}{a+2}\right).$$

The asymptotic relative efficiency of θˆN with respect to θˆ is 2/(a+2), which is decreasing in a and bounded between 0 and 1. The asymptotic relative efficiency of θˆE with respect to θˆ is a/(a+2), which is increasing in a and bounded between 0 and 1.
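
A quick Monte Carlo experiment (our own sanity check, reusing the `mle_shared_scale` sketch above) can confirm Corollary 3.1: the empirical variance of the replicated θ̂'s, scaled by n, should be close to θ²/(a+2):

```python
# Hypothetical check of Corollary 3.1: n * Var(theta_hat) should be
# approximately theta**2 / (a + 2).
import numpy as np

rng = np.random.default_rng(0)
theta, n, m = 1.0, 400, 200          # a = m/n = 0.5
a = m / n
reps = [mle_shared_scale(rng.normal(0, theta, n),
                         rng.laplace(0, theta, m)) for _ in range(5000)]
print(n * np.var(reps), theta**2 / (a + 2))  # both should be near 0.4
```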

3.2. Normal and Laplace densities with shared scale and location parameters

Consider a more general case where f and g share a scale parameter and a location parameter. That is,
$$f(x,\theta,\mu)=\frac{1}{\sqrt{2\pi}\,\theta}\,e^{-(x-\mu)^{2}/(2\theta^{2})},\quad x\in(-\infty,\infty),$$
which is the normal distribution N(μ, θ²), and
$$g(y,\theta,\mu)=\frac{1}{2\theta}\,e^{-|y-\mu|/\theta},\quad y\in(-\infty,\infty),$$
which is the Laplace distribution with mean µ and standard deviation √2 θ. Note that regularity conditions (R1)–(R4) are not satisfied for g, since g is not everywhere differentiable in µ.

For the parameter vector ϑ = (μ, θ), the log likelihood is
$$\ell(\vartheta)=-\sum_{i=1}^{n}\frac{(X_i-\mu)^{2}}{2\theta^{2}}-\sum_{j=1}^{m}\frac{|Y_j-\mu|}{\theta}-\log\{(2\pi)^{n/2}2^{m}\theta^{n+m}\}.$$
Although ℓ(ϑ) is not everywhere differentiable in µ, it is concave in µ and, hence, the MLE μ̂ of µ exists, though it does not have an explicit form. The MLE of θ is given by (2) with θ̂_N and θ̂_E replaced by, respectively,
$$\sqrt{\frac{1}{n}\sum_{i=1}^{n}(X_i-\hat\mu)^{2}}\quad\text{and}\quad\frac{1}{m}\sum_{j=1}^{m}|Y_j-\hat\mu|.$$
The asymptotic distribution of ϑ̂ = (μ̂, θ̂) cannot be obtained from (1), since g does not satisfy conditions (R1)–(R4). For assessing the performance of ϑ̂ and/or making inference, we recommend a bootstrap method, which is discussed in Section 7 and studied by simulation.
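
Since μ̂ and θ̂ have no closed form here, they must be computed numerically. The sketch below (an illustration under our own choices, not the authors' implementation) maximizes the log likelihood above, dropping constant terms and using a derivative-free optimizer because of the kink in |Y_j − μ|:

```python
# Minimal sketch: joint MLE of (mu, theta) for the normal-Laplace model.
import numpy as np
from scipy.optimize import minimize

def mle_normal_laplace(x, y):
    n, m = len(x), len(y)
    def neg_loglik(par):
        mu, log_theta = par          # optimize log(theta) to keep theta > 0
        theta = np.exp(log_theta)
        # Negative log likelihood with constant terms dropped.
        return (np.sum((x - mu)**2) / (2 * theta**2)
                + np.sum(np.abs(y - mu)) / theta
                + (n + m) * log_theta)
    # Nelder-Mead avoids the non-differentiability of |y - mu| in mu.
    start = [np.median(np.concatenate([x, y])), 0.0]
    res = minimize(neg_loglik, start, method="Nelder-Mead")
    return res.x[0], np.exp(res.x[1])   # (mu_hat, theta_hat)
```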

4. Application to two truncation distributions

Let f(x,θ) and g(y,θ) be density functions that are positive on the interval (0, θ) and zero outside (0, θ), where θ > 0 is an unknown scale parameter common to both populations, and f and g are known functions when θ is known. The likelihood is
$$\prod_{i=1}^{n}f(X_i,\theta)I\{X_i<\theta\}\prod_{j=1}^{m}g(Y_j,\theta)I\{Y_j<\theta\}=\left\{\prod_{i=1}^{n}f(X_i,\theta)\prod_{j=1}^{m}g(Y_j,\theta)\right\}I\{X_{(n)}<\theta\}\,I\{Y_{(m)}<\theta\},$$
where I_A denotes the indicator of the event A, X_{(n)} = max(X_1, …, X_n), and Y_{(m)} = max(Y_1, …, Y_m). This likelihood is not everywhere differentiable in θ, but it can be seen that the MLE of θ is θ̂ = max(X_{(n)}, Y_{(m)}), a maximizer of the likelihood.

This is an example in which the regularity conditions (R1)–(R4) in Section 2 are not satisfied, so result (1) does not hold. The MLE θ̂ is not even asymptotically normal. In the following, we directly derive the asymptotic distribution of θ̂.

It follows from the result in Example 2.34 of Shao (2003), the independence of the X_i's and Y_j's, and m = an that
$$n\begin{pmatrix}\theta-X_{(n)}\\ \theta-Y_{(m)}\end{pmatrix}\to_{d}\begin{pmatrix}\epsilon_{1}/f(\theta,\theta)\\ \epsilon_{2}/\{a\,g(\theta,\theta)\}\end{pmatrix},$$
where ε₁ and ε₂ are independent random variables with the same exponential distribution having density e^{−x}, x > 0. Because
$$\min\{n(\theta-X_{(n)}),\,n(\theta-Y_{(m)})\}=n\{\theta-\max(X_{(n)},Y_{(m)})\}=n(\theta-\hat\theta),$$
we obtain that
$$n(\theta-\hat\theta)\to_{d}\min\left\{\frac{\epsilon_{1}}{f(\theta,\theta)},\ \frac{\epsilon_{2}}{a\,g(\theta,\theta)}\right\}.$$
From the independence of ε₁ and ε₂, for any t > 0,
$$P\left\{\min\left\{\frac{\epsilon_{1}}{f(\theta,\theta)},\ \frac{\epsilon_{2}}{a\,g(\theta,\theta)}\right\}>t\right\}=P\left\{\frac{\epsilon_{1}}{f(\theta,\theta)}>t\right\}P\left\{\frac{\epsilon_{2}}{a\,g(\theta,\theta)}>t\right\}=\exp[-t\{f(\theta,\theta)+a\,g(\theta,\theta)\}].$$
This leads to the following result.

Theorem 4.1

Under the assumed conditions on f and g in this section, n(θ − θ̂) →_d E(θ, a), where E(θ, a) denotes the exponential distribution with scale parameter 1/{f(θ,θ) + a g(θ,θ)}.

Inference on θ can be made using this asymptotic result.
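
To see Theorem 4.1 in action, consider the simplest special case f = g = the uniform density on (0, θ) (our choice for illustration), so that f(θ,θ) = g(θ,θ) = 1/θ and n(θ − θ̂) is approximately exponential with scale θ/(1 + a). A short simulation check:

```python
# Hypothetical check of Theorem 4.1 with f = g = Uniform(0, theta):
# the mean of n * (theta - theta_hat) should approach theta / (1 + a).
import numpy as np

rng = np.random.default_rng(1)
theta, n, m = 2.0, 500, 250          # a = m/n = 0.5
a = m / n
draws = [theta - max(rng.uniform(0, theta, n).max(),
                     rng.uniform(0, theta, m).max()) for _ in range(5000)]
print(n * np.mean(draws), theta / (1 + a))  # both should be near 1.33
```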

The asymptotic relative efficiency of the MLE X_{(n)} based on the first dataset alone with respect to the MLE θ̂ based on both datasets is {1 + a g(θ,θ)/f(θ,θ)}^{−2}, which is decreasing in a and bounded between 0 and 1. The asymptotic relative efficiency of the MLE Y_{(m)} based on the second dataset alone with respect to θ̂ is {1 + a^{−1} f(θ,θ)/g(θ,θ)}^{−2}, which is increasing in a and bounded between 0 and 1.

5. Application to Poisson and binomial samples

Here we consider a discrete data problem, where X_i has the Poisson distribution with mean θ, Y_j is binary with P(Y_j = 1) = θ, and θ ∈ (0, 1) is the shared parameter. Let X̄ be the sample mean of the X_i's and Ȳ the sample mean of the Y_j's. The score function based on the two samples is
$$s(\theta)=n\left(\frac{\bar X}{\theta}-1+\frac{a\bar Y}{\theta}-\frac{a(1-\bar Y)}{1-\theta}\right),\quad\text{where }a=m/n.$$
Setting s(θ) = 0, we obtain the score equation
$$\theta^{2}-(1+a+\bar X)\theta+\bar X+a\bar Y=0.$$
Since the score equation is quadratic, it has two real solutions if and only if (1 + a + X̄)² − 4(X̄ + aȲ) > 0. By the law of large numbers, as n → ∞, both X̄ and Ȳ converge to θ almost surely, so
$$(1+a+\bar X)^{2}-4(\bar X+a\bar Y)\to(1+a+\theta)^{2}-4(1+a)\theta=(1+a-\theta)^{2}>0$$
almost surely. This shows that, with probability tending to 1 as n → ∞, the score equation has two real solutions,
$$\frac{1+a+\bar X\pm\sqrt{(1+a+\bar X)^{2}-4(\bar X+a\bar Y)}}{2}.$$
The solution with the + sign in front of the square root is always larger than 1, outside the range (0, 1) for θ in this problem. Hence, we conclude that the MLE of θ is
$$\hat\theta=\min\left\{1,\ \frac{1+a+\bar X-\sqrt{(1+a+\bar X)^{2}-4(\bar X+a\bar Y)}}{2}\right\}.$$
The minimum with 1 is taken because 0 < θ < 1. Again, the MLE θ̂ is a nonlinear function of the separate MLEs, X̄ and Ȳ.
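
The closed-form estimator is straightforward to compute; a minimal sketch (ours):

```python
import numpy as np

def mle_poisson_binary(x, y):
    """Closed-form MLE of the shared theta from Poisson(theta) counts x
    and Bernoulli(theta) indicators y, via the quadratic score equation."""
    a = len(y) / len(x)
    xbar, ybar = np.mean(x), np.mean(y)
    b = 1 + a + xbar
    root = (b - np.sqrt(b**2 - 4 * (xbar + a * ybar))) / 2
    return min(1.0, root)   # theta is restricted to (0, 1)
```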

The asymptotic distribution of θ̂ can be derived using the delta method, but because the regularity conditions (R1)–(R4) are satisfied here, it also follows as a corollary of Theorem 2.1 in Section 2.

Corollary 5.1

Under the Poisson and binary assumptions for the two datasets and m = an, as n → ∞,
$$\sqrt{n}\,(\hat\theta-\theta)\to_{d}N\!\left(0,\ \frac{\theta(1-\theta)}{1-\theta+a}\right).$$

The asymptotic relative efficiency of the MLE X̄ based on the first dataset alone with respect to the MLE θ̂ based on both datasets is (1−θ)/(1−θ+a), which is decreasing in a and bounded between 0 and 1. The asymptotic relative efficiency of the MLE Ȳ based on the second dataset alone with respect to θ̂ is a/(1−θ+a), which is increasing in a and bounded between 0 and 1.

6. MLEs with two samples and an additional uncertainty

In this section, we consider a scenario in which the first sample is obtained under a controlled study, so we know the form of the probability density f(x,θ,ϕ), but the form of g(y,θ,φ) for the second sample involves an additional uncertainty, because the second sample may be obtained from a past study and/or public records. We assume that the additional uncertainty comes from an unknown parameter ζ taking two possible values, 0 and 1; i.e., the probability density of the second sample is g(y,θ,φ,ζ), where ζ = 0 or 1 and g is still a known density when θ, φ, and ζ are known.

How do we derive the MLE of ϑ = (θ, ϕ, φ)? If ζ is known, then the MLE can be obtained using the method in Section 2. Since ζ takes only two values, if ζ̂ is a consistent estimator of ζ, i.e.,
$$\lim_{n\to\infty}P(\hat\zeta=\zeta)=1,\tag{3}$$
then we obtain the MLE of ϑ as
$$\hat\vartheta=\begin{cases}\hat\vartheta(0),&\hat\zeta=0,\\ \hat\vartheta(1),&\hat\zeta=1,\end{cases}$$
where ϑ̂(0) and ϑ̂(1) are the MLEs under ζ = 0 and ζ = 1, respectively.

A suggested consistent estimator of ζ is the MLE of ζ based on the second sample, the Y_j's. Let θ̂(ζ) and φ̂(ζ) be the MLEs of θ and φ, respectively, based on the Y_j's with the value of ζ fixed. Then the MLE of ζ is
$$\hat\zeta=\begin{cases}0,&\displaystyle\prod_{j=1}^{m}g(Y_j,\hat\theta(0),\hat\varphi(0),0)\ge\prod_{j=1}^{m}g(Y_j,\hat\theta(1),\hat\varphi(1),1),\\[1.5ex]1,&\displaystyle\prod_{j=1}^{m}g(Y_j,\hat\theta(0),\hat\varphi(0),0)<\prod_{j=1}^{m}g(Y_j,\hat\theta(1),\hat\varphi(1),1).\end{cases}$$
The following result gives the asymptotic distribution of the MLE ϑ̂.
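
Operationally, ζ̂ is a likelihood comparison between the two candidate models fitted to the second sample. A minimal sketch, where `fit_g` and `logg` are hypothetical helpers (`fit_g(y, zeta)` returns the MLEs of (θ, φ) under the given ζ, and `logg` evaluates the corresponding log density):

```python
# Sketch of the selection step: pick the zeta with the larger fitted
# log likelihood on the second sample (ties go to zeta = 0, as above).
import numpy as np

def select_zeta(y, fit_g, logg):
    loglik = []
    for zeta in (0, 1):
        theta_z, varphi_z = fit_g(y, zeta)                 # MLEs given zeta
        loglik.append(np.sum(logg(y, theta_z, varphi_z, zeta)))
    return 0 if loglik[0] >= loglik[1] else 1
```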

Theorem 6.1

If (3) holds, the regularity conditions (R1)–(R4) are satisfied when ζ = 0 or 1, and m = an with a remaining fixed as n increases, then
$$\sqrt{n}\,(\hat\vartheta-\vartheta)\to_{d}N\big(0,\ \{I(\vartheta,\zeta)\}^{-1}\big),$$
where I(ϑ, ζ) is the Fisher information defined as in Section 2 under the true value of ζ.

Condition (3) has to be checked for each particular problem. The following is an example.

Suppose that f(x,θ) is the density of N(0, θ²), g(y,θ,0) is the same normal density for N(0, θ²), and g(y,θ,1) is the Laplace density with mean zero and standard deviation √2 θ given in Section 3.1. In other words, sample one is from the main study, whereas sample two is from an external source whose data may follow the same distribution as sample one but may also deviate from it. The parameters ϕ and φ are constants (i.e., they are absent).

In this example, when ζ̂ = 0, we can simply combine the two samples, and the MLE of θ is
$$\sqrt{\left(\sum_{i=1}^{n}X_i^{2}+\sum_{j=1}^{m}Y_j^{2}\right)\Big/(n+m)}\,;$$
on the other hand, when ζ̂ = 1, the MLE of θ is given by (2). To check (3), note that
$$\hat\theta(0)=\sqrt{\frac{1}{m}\sum_{j=1}^{m}Y_j^{2}}\quad\text{and}\quad\hat\theta(1)=\frac{1}{m}\sum_{j=1}^{m}|Y_j|.$$
Then,
$$\log\left\{\prod_{j=1}^{m}g(Y_j,\hat\theta(0),0)\right\}=-\frac{m}{2}-m\log\hat\theta(0)-\frac{m\log(2\pi)}{2}$$
and
$$\log\left\{\prod_{j=1}^{m}g(Y_j,\hat\theta(1),1)\right\}=-m-m\log\hat\theta(1)-m\log 2.$$
When ζ = 0, θ̂(0) →_p θ and θ̂(1) →_p (2/π)^{1/2} θ, where →_p denotes convergence in probability as n → ∞. Hence
$$\frac{1}{m}\log\left\{\prod_{j=1}^{m}g(Y_j,\hat\theta(0),0)\right\}-\frac{1}{m}\log\left\{\prod_{j=1}^{m}g(Y_j,\hat\theta(1),1)\right\}\to_{p}\frac{1}{2}+\log\frac{2}{\pi}>0,$$
which implies that P(ζ̂ = 0) → 1. On the other hand, when ζ = 1, θ̂(0) →_p √2 θ, θ̂(1) →_p θ, and
$$\frac{1}{m}\log\left\{\prod_{j=1}^{m}g(Y_j,\hat\theta(0),0)\right\}-\frac{1}{m}\log\left\{\prod_{j=1}^{m}g(Y_j,\hat\theta(1),1)\right\}\to_{p}\frac{1}{2}-\frac{\log\pi}{2}<0,$$
which implies that P(ζ̂ = 1) → 1. This shows that (3) always holds in this example.
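
The two limiting constants can be verified numerically (a quick check of the signs, not part of the paper):

```python
import numpy as np
print(0.5 + np.log(2 / np.pi))    # about  0.048 > 0: P(zeta_hat = 0) -> 1 when zeta = 0
print(0.5 - np.log(np.pi) / 2)    # about -0.072 < 0: P(zeta_hat = 1) -> 1 when zeta = 1
```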

Still in this example, the results here and in Section 3.1 indicate that
$$\sqrt{n}\,(\hat\theta-\theta)\to_{d}\begin{cases}N\!\left(0,\dfrac{\theta^{2}}{2a+2}\right),&\zeta=0,\\[1.5ex]N\!\left(0,\dfrac{\theta^{2}}{a+2}\right),&\zeta=1.\end{cases}$$
The result can obviously be extended to the situation where the second sample is from one of k populations with k ≥ 3.

7. Bootstrap variance estimation

In situations where the regularity conditions (R1)–(R4) are not satisfied for f or g, the asymptotic distribution of the MLE ϑ̂ may not be available, either because it does not exist or because it has not been established. Here, we introduce a bootstrap variance estimator that can be used for assessing the performance of ϑ̂ or making large-sample inference. A description of the general bootstrap methodology can be found, for example, in Efron and Tibshirani (1993) and Shao (2003).

Let {X_1^b, …, X_n^b} and {Y_1^b, …, Y_m^b} be two independent simple random samples drawn with replacement from {X_1, …, X_n} and {Y_1, …, Y_m}, respectively, and let ϑ̂^b be the MLE of ϑ based on the dataset {X_1^b, …, X_n^b, Y_1^b, …, Y_m^b}. If we independently repeat this for b = 1, …, B, where B is called the bootstrap replication size and is typically large, then the bootstrap variance estimator for ϑ̂ is the sample covariance matrix of ϑ̂^b, b = 1, …, B.
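
In code, the procedure is a within-sample resampling loop; the sketch below (ours, assuming a function `mle(x, y)` that returns the joint MLE as a vector) computes the bootstrap covariance matrix:

```python
# Minimal sketch of the two-sample bootstrap variance estimator.
import numpy as np

def bootstrap_cov(x, y, mle, B=500, seed=0):
    rng = np.random.default_rng(seed)
    reps = []
    for _ in range(B):
        xb = rng.choice(x, size=len(x), replace=True)   # resample within
        yb = rng.choice(y, size=len(y), replace=True)   # each sample
        reps.append(mle(xb, yb))
    return np.cov(np.asarray(reps), rowvar=False)       # sample covariance
```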

We carry out a simulation study to examine the performance of this bootstrap variance estimator in the normal-Laplace problem considered in Section 3.2. At the same time, we also check the performance of the MLE (μ̂, θ̂) based on the two datasets, the X_i's and Y_j's, and compare it with (X̄, θ̂_X) and (Ỹ, θ̂_Y), the MLEs based on the single dataset of X_i's and the single dataset of Y_j's, respectively, where X̄ is the sample mean of the X_i's, Ỹ is the sample median of the Y_j's,
$$\hat\theta_{X}=\left\{\frac{1}{n}\sum_{i=1}^{n}(X_i-\bar X)^{2}\right\}^{1/2},\quad\text{and}\quad\hat\theta_{Y}=\frac{1}{m}\sum_{j=1}^{m}|Y_j-\tilde Y|.$$
The bootstrap is applied to obtain an estimator SD̂ of the standard deviation (SD) of each fixed point estimator.

The simulation results with 1000 replications are shown in Table 1. A summary is given as follows.

  1. The MLEs μ̂, X̄, and Ỹ all have almost no bias as estimators of µ (= 1 in the simulation). In terms of SD, the MLE μ̂ is the best among the three. The sample median based on the Y_j's is substantially worse than the other two, although asymptotically it is as efficient as the sample mean X̄ of the X_i's.

  2. The MLE θ̂ of θ does not have a negligible bias, although its performance is acceptable with total sample size n + m = 200 and its SD is slightly smaller than the SD of θ̂_X. The bias of θ̂ mainly comes from the MLE θ̂_Y for the Laplace dataset, which has both a large bias and a large SD.

  3. The bootstrap SD estimator SD̂ performs very well for all estimators (see the rows labelled "SD by simulation" and "mean of SD̂ by simulation" in Table 1), even when the point estimator has non-negligible bias.

Table 1. Results from 1000 simulations for the normal-Laplace problem with location μ = 1 and scale θ = 1 (n = m = 100; SD = standard deviation; (μ̂, θ̂) = the MLE of (μ, θ) based on the X_i's and Y_j's; (X̄, θ̂_X) = the MLE of (μ, θ) based on the X_i's; (Ỹ, θ̂_Y) = the MLE of (μ, θ) based on the Y_j's; SD̂ is by bootstrap with B = 500).

The histogram of the 1000 values of μ̂ from the simulation is shown in Figure 1, together with a Q–Q plot. The result suggests that μ̂ is asymptotically normal, although such a theoretical result has not been established.

Figure 1. Histogram and Q–Q plot of 1000 simulated values of μˆ.


Acknowledgments

The authors would like to thank two anonymous referees for helpful comments and suggestions.

Disclosure statement

No potential conflict of interest was reported by the author(s).

Additional information

Funding

Jun Shao's research was partially supported by the National Natural Science Foundation of China [Grant Number 11831008] and the U.S. National Science Foundation [Grant Number DMS-1914411].

References

  • Efron, B., & Tibshirani, R. J. (1993). An introduction to the bootstrap. Chapman and Hall/CRC.
  • Kim, H. J., Wang, Z., & Kim, J. K. (2021). Survey data integration for regression analysis using model calibration. arXiv:2107.06448.
  • Lohr, S. L., & Raghunathan, T. E. (2017). Combining survey data with other data sources. Statistical Science, 32(2), 293–312. https://doi.org/10.1214/16-STS584
  • Merkouris, T. (2004). Combining independent regression estimators from multiple surveys. Journal of the American Statistical Association, 99(468), 1131–1139. https://doi.org/10.1198/016214504000000601
  • Rao, J. N. K. (2021). On making valid inferences by integrating data from surveys and other sources. Sankhya B, 83(1), 242–272. https://doi.org/10.1007/s13571-020-00227-w
  • Shao, J. (2003). Mathematical statistics (2nd ed.). Springer.
  • Yang, S., & Kim, J. K. (2020). Statistical data integration in survey sampling: A review. Japanese Journal of Statistics and Data Science, 3(2), 625–650. https://doi.org/10.1007/s42081-020-00093-w
  • Zhang, Y., Ouyang, Z., & Zhao, H. (2017). A statistical framework for data integration through graphical models with application to cancer genomics. The Annals of Applied Statistics, 11(1), 161–184. https://doi.org/10.1214/16-AOAS998
  • Zieschang, K. D. (1990). Sample weighting methods and estimation of totals in the consumer expenditure survey. Journal of the American Statistical Association, 85(412), 986–1001. https://doi.org/10.1080/01621459.1990.10474969