Abstract
Regression analysis is often complicated by covariates that are not completely observed. Among the available approaches is a recently developed semiparametric maximum likelihood (SML) method that requires no parametric specification of either the selection mechanism or the covariate distribution and that yields efficient inference, at least in some specific models. In this paper, we propose an EM algorithm for computing the SML estimate and for estimating its variance. Simulation results suggest that the SML method performs reasonably well in moderate-sized samples. In contrast, the analogous parametric maximum likelihood method is subject to severe bias under model misspecification, even in large samples.