Theory and Methods

Extending the State-Space Model to Accommodate Missing Values in Responses and Covariates

Pages 202-216 | Received 01 Aug 2011, Published online: 15 Mar 2013
 

Abstract

This article proposes an extended state-space model for accommodating multivariate panel data. The novel aspect of this contribution is an adjustment to the classical model for multiple subjects that allows missingness in the covariates in addition to the responses. Missing covariate data are handled by a second state-space model nested inside the first to represent unobserved exogenous information. Relevant Kalman filter equations are derived, and explicit expressions are provided for both the E- and M-steps of an expectation-maximization (EM) algorithm, to obtain maximum (Gaussian) likelihood estimates of all model parameters. In the presence of missing data, the resulting EM algorithm becomes computationally intractable, but a simplification of the M-step leads to a new procedure that is shown to be an expectation/conditional maximization (ECM) algorithm under exogeneity of the covariates. Simulation studies reveal that the approach appears to be relatively robust to moderate percentages of missing data, even with fewer subjects and time points, and that estimates are generally consistent with the asymptotics. The methodology is applied to a dataset from a published panel study of elderly patients with impaired respiratory function. Forecasted values thus obtained may serve as an “early-warning” mechanism for identifying patients whose lung function is nearing critical levels. Supplementary materials for this article are available online.
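As a rough illustration of how a Kalman filter can accommodate missing responses, the following is a minimal sketch only, assuming a scalar AR(1) state observed with Gaussian noise. It is not the extended model of the article, which additionally handles missing covariates through a nested state-space model and estimates parameters by EM/ECM. The function name `kalman_filter_missing` and the parameters `phi`, `q`, `r`, `x0`, and `p0` are hypothetical names introduced for this example.

```python
import numpy as np


def kalman_filter_missing(y, phi, q, r, x0=0.0, p0=1.0):
    """Scalar Kalman filter that tolerates missing responses (illustrative only).

    Assumed model (not taken from the article):
        state:        x_t = phi * x_{t-1} + w_t,  w_t ~ N(0, q)
        observation:  y_t = x_t + v_t,            v_t ~ N(0, r)
    NaN entries in y are treated as missing: the one-step prediction is
    carried forward and the measurement update is skipped.
    """
    y = np.asarray(y, dtype=float)
    n = y.shape[0]
    x_filt = np.empty(n)
    p_filt = np.empty(n)
    loglik = 0.0
    x_prev, p_prev = x0, p0

    for t in range(n):
        # Prediction step
        x_pred = phi * x_prev
        p_pred = phi * phi * p_prev + q

        if np.isnan(y[t]):
            # Missing response: no update, no likelihood contribution
            x_filt[t], p_filt[t] = x_pred, p_pred
        else:
            # Measurement update
            s = p_pred + r            # innovation variance
            k = p_pred / s            # Kalman gain
            innov = y[t] - x_pred     # innovation
            x_filt[t] = x_pred + k * innov
            p_filt[t] = (1.0 - k) * p_pred
            loglik += -0.5 * (np.log(2.0 * np.pi * s) + innov ** 2 / s)

        x_prev, p_prev = x_filt[t], p_filt[t]

    return x_filt, p_filt, loglik
```

In this sketch the Gaussian log-likelihood accumulates only over observed time points, which is the quantity an EM-type procedure would aim to maximize over `phi`, `q`, and `r`; the article's multivariate, multi-subject formulation is considerably more general.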

Acknowledgments

This research is based on Naranjo’s Ph.D. thesis and is supported by National Science Foundation grants DMS-0631632 and SES-0631588. The authors thank Dr. Susanna Lagorio, MD, Senior Researcher, National Centre for Epidemiology Surveillance and Health Promotion (CNESPS), National Institute of Health (Istituto Superiore di Sanità), Rome (Italy), for access to the Lagorio et al. () data. We are indebted to Prof. Dr. Miguel Jerez, Universidad Complutense de Madrid (Spain), for helpful discussions and for access to E4, a MATLAB toolbox for time series modeling, which permitted us to carry out model identification calculations. We also acknowledge the suggestions of two anonymous reviewers, which led to vast improvements. Finally, we dedicate this work to the memory of our mentor and colleague, George Casella, whose passing leaves an immense void in the statistics community.
