Professors Tang and Ju deserve warm congratulations for their excellent review of statistical inference for nonignorable missing data problems. Although they call it ‘a selective review’, it actually covers most of the contemporary advances in the difficult problem of dealing with nonignorable missing data.
Section 3.1 of Tang and Ju's review discusses the weighting approach to estimation with nonignorable nonresponse, which is one of the most popular and effective methods of handling nonresponse. The key to the weighting approach is the estimation of the unknown propensity, that is, the probability of observing the value of a response variable conditional on the value of the response variable and associated covariate values, regardless of whether the response value is observed or not. Tang and Ju review the early developments with parametric models for the propensity (e.g., Lee & Tang, 2006; Qin, Leung, & Shao, 2002; Wang, Shao, & Kim, 2014) as well as the more recent advances in semiparametric propensity modelling (Kim & Yu, 2011; Shao & Wang, 2016). The purpose of this note is to add some results and discussion on semiparametric propensity estimation.
We start with some notation. Let $y_i$ be a univariate response variable of interest and $x_i$ be the associated multivariate covariate for the $i$th sampled unit, $i = 1, \ldots, n$, where $y_i$ is observed if $\delta_i = 1$ and is missing if $\delta_i = 0$, and $x_i$ is always observed. We assume that $(y_i, x_i, \delta_i)$, $i = 1, \ldots, n$, are independent and identically distributed. The propensity is defined to be the conditional probability $P(\delta_i = 1 \mid y_i, x_i)$. Since $y_i$ may be missing, this propensity or nonresponse mechanism is nonignorable, and it is ignorable if and only if $P(\delta_i = 1 \mid y_i, x_i) = P(\delta_i = 1 \mid x_i)$.
A parametric model may be imposed on the propensity, but results derived under a parametric model may be sensitive to violations of that model, so it is desirable to make weaker assumptions. The following semiparametric model is assumed in Kim and Yu (2011),
$$P(\delta_i = 1 \mid y_i, x_i) = \frac{1}{1 + \exp\{g(x_i) + \gamma y_i\}}, \tag{1}$$
where γ is an unknown parameter and g is an unspecified (nonparametric) function. Note that, under assumption (1), nonresponse is ignorable if and only if $\gamma = 0$, in which case the ignorable propensity $P(\delta_i = 1 \mid x_i)$ is nonparametric. Thus, assumption (1) is preferable to any parametric assumption on the propensity because, if $\gamma = 0$, no parametric assumption on the propensity is needed for handling ignorable missing data.
An extension of (1) is
$$P(\delta_i = 1 \mid y_i, x_i) = \frac{1}{1 + \exp\{g(x_i) + q(y_i, \gamma)\}}, \tag{2}$$
where $q(y_i, \gamma)$ is a parametric function and γ is a possibly multi-dimensional unknown parameter.
As shown in Shao and Wang (2016), under either (1) or (2) the unknown g and γ are not identifiable. Some additional condition is needed to identify g and γ so that valid estimation and inference are possible. For example, Kim and Yu (2011) assumed that γ is known or can be estimated externally. A more reasonable assumption is that some components of $x_i$ can be excluded from the right-hand side of (2). That is, $x_i$ can be decomposed into $u_i$ and $z_i$ such that, with $q$ the parametric function in (2),
$$P(\delta_i = 1 \mid y_i, x_i) = \frac{1}{1 + \exp\{g(u_i) + q(y_i, \gamma)\}}, \tag{3}$$
while $z_i$, the part of the covariate vector not on the right-hand side of (3), is still a useful covariate in the sense that the conditional distribution of $y_i$ given $x_i$ depends on $z_i$. This idea was developed in Wang et al. (2014), Zhao and Shao (2015), and Shao and Wang (2016), where the covariate $z_i$ is called a nonresponse instrument.
Following Tang and Ju's review and Shao and Wang (2016), we can show that assumption (3) implies that
$$\exp\{g(u_i)\} = \frac{E(1 - \delta_i \mid u_i)}{E[\delta_i \exp\{q(y_i, \gamma)\} \mid u_i]} \tag{4}$$
and
$$E\big[h(x_i)\big\{\delta_i\big(1 + \exp\{g(u_i) + q(y_i, \gamma)\}\big) - 1\big\}\big] = 0, \tag{5}$$
where $h(x_i)$ is a vector function of $x_i$. Under suitable conditions on $h$, asymptotically valid estimators of g and γ can be obtained based on (4) and (5), using either the generalised method of moments (Shao & Wang, 2016) or the empirical likelihood method in Tang and Ju's review (Section 3.2).
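The profiling and moment steps can be sketched numerically. The code below is a simplified illustration, not the estimator of Shao and Wang (2016): it assumes the exponential tilting form $P(\delta = 1 \mid y, u) = [1 + \exp\{g(u) + \gamma y\}]^{-1}$, takes $u$ and the instrument $z$ to be discrete so that conditional expectations reduce to cell means (rather than kernel estimates), uses the indicators of the $(u, z)$ cells as the vector function, an identity weight matrix, and a grid search over γ. The function names and all simulated values are hypothetical.

```python
import numpy as np

def profiled_moments(gamma, delta, y, u, z):
    """Sample moments mean[h(x){delta/pi - 1}] with exp{g(u)} profiled out.

    Assumes P(delta=1 | y, u) = 1 / (1 + exp{g(u) + gamma*y}) with discrete
    u and z, and takes h(x) to be the indicators of the (u, z) cells.
    y may be NaN where delta == 0; those entries are never used.
    """
    n = delta.size
    obs = delta == 1
    # delta * exp(gamma*y), set to 0 for nonrespondents so NaN never enters.
    tilt = np.zeros(n)
    tilt[obs] = np.exp(gamma * y[obs])
    resid = np.empty(n)
    for level in np.unique(u):
        grp = u == level
        # Profile step: exp{g(u)} = E(1-delta | u) / E(delta*exp(gamma*y) | u).
        exp_g = np.mean(1 - delta[grp]) / np.mean(tilt[grp])
        # delta/pi - 1 = delta + exp{g(u)} * delta * exp(gamma*y) - 1.
        resid[grp] = delta[grp] + exp_g * tilt[grp] - 1.0
    cells = [(a, b) for a in np.unique(u) for b in np.unique(z)]
    return np.array([np.sum(resid[(u == a) & (z == b)]) / n for a, b in cells])

def gmm_objective(gamma, delta, y, u, z):
    m = profiled_moments(gamma, delta, y, u, z)
    return float(m @ m)  # identity weight matrix, for simplicity

# Simulated data (all numeric values are illustrative assumptions).
rng = np.random.default_rng(0)
n = 200_000
u = rng.integers(0, 2, size=n)       # covariate kept in the propensity
z = rng.integers(-1, 2, size=n)      # instrument with levels -1, 0, 1
y = u + z + rng.normal(size=n)       # response depends on both u and z
gamma_true = 0.5
g_u = np.where(u == 0, -1.0, -0.5)
prob = 1.0 / (1.0 + np.exp(g_u + gamma_true * y))
delta = (rng.uniform(size=n) < prob).astype(int)
y_obs = np.where(delta == 1, y, np.nan)  # missing responses are masked

grid = np.linspace(0.0, 1.0, 21)
values = [gmm_objective(g, delta, y_obs, u, z) for g in grid]
gamma_hat = float(grid[int(np.argmin(values))])
```

Because the instrument shifts the distribution of the response but not the propensity, the cell moments cannot all vanish at a wrong value of γ, which is what makes the grid search informative here. With continuous covariates the cell means would have to be replaced by a smoother, and an efficient weight matrix would normally replace the identity.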
Once g and γ are estimated, the weighting approach using the inverse of the propensity in (3), with g and γ replaced by their estimators, can be applied to estimate parameters of interest in the distribution of $y_i$ or the conditional distribution of $y_i$ given $x_i$.
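A minimal sketch of the weighting step for the simplest target, the mean of the response, is given below. It assumes the propensity has already been estimated; in this simulation the true propensity of a hypothetical exponential tilting model stands in for the fitted one. The function name `ipw_mean`, the use of normalised weights, and all numeric values are assumptions made for illustration.

```python
import numpy as np

def ipw_mean(y, delta, propensity):
    """Inverse propensity weighted estimate of E(y) with normalised weights.

    y may contain NaN where delta == 0; those entries are never used.
    """
    observed = delta == 1
    weights = 1.0 / propensity[observed]
    return float(np.sum(weights * y[observed]) / np.sum(weights))

rng = np.random.default_rng(1)
n = 100_000
u = rng.normal(size=n)
y = 1.0 + u + rng.normal(size=n)                    # true mean of y is 1
propensity = 1.0 / (1.0 + np.exp(-0.5 + 0.8 * y))   # depends on y: nonignorable
delta = (rng.uniform(size=n) < propensity).astype(int)
y_obs = np.where(delta == 1, y, np.nan)             # missing responses are masked

naive = float(np.nanmean(y_obs))                    # respondent mean, biased
weighted = ipw_mean(y_obs, delta, propensity)
```

Since units with large responses are less likely to respond in this setup, the respondent mean `naive` underestimates the true mean, while `weighted` corrects for this by giving under-represented respondents larger weights.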
Now, consider changing assumption (3) to
$$P(\delta_i = 1 \mid y_i, x_i) = g(u_i)\, q(y_i, \gamma), \tag{6}$$
where g is nonparametric, $q$ is parametric, and both are between 0 and 1. Note that (6) is a multiplicative model considered by Zhao and Shao (2017). Under (6), the counterparts of (4) and (5) are, respectively,
$$g(u_i) = E\left\{\frac{\delta_i}{q(y_i, \gamma)} \,\Big|\, u_i\right\} \tag{7}$$
and
$$E\left[h(x_i)\left\{\frac{\delta_i}{g(u_i)\, q(y_i, \gamma)} - 1\right\}\right] = 0. \tag{8}$$
Asymptotically valid estimators of g and γ can be obtained using (7) and (8) and techniques similar to those in Shao and Wang (2016).
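The reasoning behind these two identities is a short conditioning argument, written out below under the assumption that the propensity factorises as $g(u_i)\,q(y_i,\gamma)$ and depends on $x_i$ only through $u_i$. The symbols $q$ for the parametric factor and $h$ for the vector function are notational assumptions.

```latex
\begin{align*}
E\left\{\frac{\delta_i}{g(u_i)\,q(y_i,\gamma)} \,\Big|\, y_i, x_i\right\}
  &= \frac{P(\delta_i = 1 \mid y_i, x_i)}{g(u_i)\,q(y_i,\gamma)} = 1
  && \text{(definition of the propensity)} \\
E\left\{\frac{\delta_i}{q(y_i,\gamma)} \,\Big|\, u_i\right\}
  &= g(u_i)
  && \text{(average over $(y_i, x_i)$ given $u_i$)} \\
E\left[h(x_i)\left\{\frac{\delta_i}{g(u_i)\,q(y_i,\gamma)} - 1\right\}\right]
  &= 0
  && \text{(multiply the first line by $h(x_i)$, then average)}
\end{align*}
```

The second line determines the nonparametric part for any given γ, and the third supplies the estimating equations for γ once that part has been profiled out.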
Alternatively, we may change (3) to
$$P(\delta_i = 1 \mid y_i, x_i) = g(u_i) + q(y_i, \gamma), \tag{9}$$
where g is nonparametric, $q$ is parametric, and $g(u_i) + q(y_i, \gamma)$ is between 0 and 1. Note that the difference between model (3) and model (9) is that the former combines the nonparametric and parametric components multiplicatively in the propensity, whereas the latter combines $g(u_i)$ and $q(y_i, \gamma)$ additively. Under (9), the counterparts of (4) and (5) are, respectively,
$$E\left\{\frac{\delta_i}{g(u_i) + q(y_i, \gamma)} \,\Big|\, u_i\right\} = 1 \tag{10}$$
and
$$E\left[h(x_i)\left\{\frac{\delta_i}{g(u_i) + q(y_i, \gamma)} - 1\right\}\right] = 0. \tag{11}$$
Again, asymptotically valid estimators of g and γ can be obtained using (10) and (11) and techniques similar to those in Shao and Wang (2016).
We conclude with the following question. What is a general assumption for which (3), (6) and (9) are all special cases and results similar to (4) and (5) can be derived?
Consider
$$P(\delta_i = 1 \mid y_i, x_i) = \psi\big(g(u_i), q(y_i, \gamma)\big), \tag{12}$$
where g is nonparametric, $q$ is parametric, and ψ is a two-dimensional known function. Note that (3), (6) and (9) are all special cases of (12). The previous results (4), (7) and (10) are all derived based on
$$E\left\{\frac{\delta_i}{P(\delta_i = 1 \mid y_i, x_i)} \,\Big|\, u_i\right\} = 1.$$
Define
$$\Psi\big(g(u_i), \gamma\big) = E\left\{\frac{\delta_i}{\psi\big(g(u_i), q(y_i, \gamma)\big)} \,\Big|\, u_i\right\}, \tag{13}$$
a function of $g(u_i)$ and γ. If, for each fixed $b$, ψ$(a, b)$ is a strictly monotone function of $a$, then (13), set equal to 1, defines $g(u_i)$ as a possibly implicit function of γ, and similar results may be derived.
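When no closed form is available, the implicit function can be evaluated numerically. The sketch below solves, for the units sharing one value of a discrete $u$ and for a given γ, the sample version of $E\{\delta/\psi(a, q(y,\gamma)) \mid u\} = 1$ in $a$ by bisection, which relies on ψ being strictly monotone in its first argument. The helper names (`solve_g`, `psi_mult`, `q_logistic`), the logistic choice of the parametric part, and the bracketing interval are all assumptions made for illustration. The multiplicative case is used as a check because its solution is known in closed form.

```python
import numpy as np

def solve_g(delta, y, gamma, psi, q, lo, hi, tol=1e-10, max_iter=200):
    """Solve mean(delta / psi(a, q(y, gamma))) = 1 for a by bisection on [lo, hi].

    Intended for units sharing one value of a discrete u. Requires psi to be
    strictly monotone in its first argument so that the root is unique.
    y may be NaN where delta == 0; only respondents contribute.
    """
    observed = delta == 1
    q_obs = q(y[observed], gamma)
    n = delta.size

    def excess(a):
        return np.sum(1.0 / psi(a, q_obs)) / n - 1.0

    f_lo, f_hi = excess(lo), excess(hi)
    if f_lo * f_hi > 0:
        raise ValueError("root is not bracketed by [lo, hi]")
    for _ in range(max_iter):
        mid = 0.5 * (lo + hi)
        f_mid = excess(mid)
        if f_lo * f_mid <= 0:
            hi = mid                 # root lies in [lo, mid]
        else:
            lo, f_lo = mid, f_mid    # root lies in [mid, hi]
        if hi - lo < tol:
            break
    return 0.5 * (lo + hi)

def psi_mult(a, b):
    return a * b                     # multiplicative special case

def q_logistic(y, gamma):
    return 1.0 / (1.0 + np.exp(gamma * y))  # hypothetical parametric part

# Simulated data for a single level of u (values are illustrative assumptions).
rng = np.random.default_rng(2)
n = 50_000
y = rng.normal(size=n)
gamma_true, g_true = 0.7, 0.6
prob = psi_mult(g_true, q_logistic(y, gamma_true))
delta = (rng.uniform(size=n) < prob).astype(int)
y_obs = np.where(delta == 1, y, np.nan)

g_bisect = solve_g(delta, y_obs, gamma_true, psi_mult, q_logistic, lo=1e-6, hi=1.0)
# Closed form available in the multiplicative case: g = mean(delta / q).
g_closed = float(np.sum(1.0 / q_logistic(y_obs[delta == 1], gamma_true)) / n)
```

Replacing `psi_mult` by another function that is monotone in its first argument, for example an additive one, requires only a bracketing interval on which the propensity stays between 0 and 1.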
Disclosure statement
No potential conflict of interest was reported by the author.
References
- Kim, J. K., & Yu, C. L. (2011). A semiparametric estimation of mean functionals with nonignorable missing data. Journal of the American Statistical Association, 106, 157–165. doi: 10.1198/jasa.2011.tm10104
- Lee, S. Y., & Tang, N. S. (2006). Bayesian analysis of nonlinear structural equation models with nonignorable missing data. Psychometrika, 71, 541–564. doi: 10.1007/s11336-006-1177-1
- Qin, J., Leung, D., & Shao, J. (2002). Estimation with survey data under non-ignorable nonresponse or informative sampling. Journal of the American Statistical Association, 97, 193–200. doi: 10.1198/016214502753479338
- Shao, J., & Wang, L. (2016). Semiparametric inverse propensity weighting for nonignorable missing data. Biometrika, 103, 175–187. doi: 10.1093/biomet/asv071
- Wang, S., Shao, J., & Kim, J. (2014). An instrumental variable approach for identification and estimation with nonignorable nonresponse. Statistica Sinica, 24, 1097–1116.
- Zhao, J., & Shao, J. (2015). Semiparametric pseudo likelihoods in generalized linear models with nonignorable missing data. Journal of the American Statistical Association, 110, 1577–1590. doi: 10.1080/01621459.2014.983234
- Zhao, J., & Shao, J. (2017). Approximate conditional likelihood for generalized linear models with general missing data mechanism. Journal of Systems Science and Complexity, 30, 139–153. doi: 10.1007/s11424-017-6188-3