Abstract
Intensive longitudinal designs involving repeated assessments of constructs often face the problems of nonignorable attrition and selected omission of responses on particular occasions. However, time series models, such as vector autoregressive (VAR) models, are often fit to these data without consideration of nonignorable missingness. We introduce a Bayesian model that simultaneously represents the over-time dependencies in multivariate, multiple-subject time series data via a VAR model, and possible ignorable and nonignorable missingness in the data. We provide software code for implementing this model with application to an empirical data set. Moreover, simulation results comparing the joint approach with two-step multiple imputation procedures are included to shed light on the relative strengths and weaknesses of these approaches in practical data analytic scenarios.
Notes
1 To start the procedure, all missing observations are filled in using random draws with replacement from all observed values. Then, for the first iteration, $\theta_1^{(1)}$ is drawn from the distribution $P(\theta_1 \mid Y_1^{\mathrm{obs}}, Y_2^{(0)}, \ldots, Y_p^{(0)})$. Missing values in $Y_1$, $Y_1^{\mathrm{mis}}$, are then filled in by drawing values from $P(Y_1^{\mathrm{mis}} \mid Y_1^{\mathrm{obs}}, Y_2^{(0)}, \ldots, Y_p^{(0)}, \theta_1^{(1)})$, the posterior predictive distribution of $Y_1^{\mathrm{mis}}$ conditioned on $Y_1^{\mathrm{obs}}$, $Y_2^{(0)}, \ldots, Y_p^{(0)}$, and $\theta_1^{(1)}$. Here, we use superscript $(t)$ to denote data sets and parameter estimates from the $t$th iteration, with $(0)$ denoting the original data sets or initial starting values of the parameters. A similar procedure is subsequently performed to generate predicted values for $Y_2$, except that the imputed values for $Y_1$ from the previous step are used in the prediction process. The first iteration ends when missing observations are filled in for all variables. The procedure is repeated for $T$ iterations to yield one data set with imputed values. The whole procedure is repeated multiple times to generate multiple imputed data sets and, correspondingly, multiple sets of parameter and standard error estimates for subsequent pooling (van Buuren, 2012).
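The chained-equations steps in this note can be illustrated with a minimal Python sketch. It is not the software used in the study. The function names `chained_imputation` and `pool_rubin` are hypothetical. Several choices are assumptions made only for illustration: a normal linear regression as each variable's imputation model, a noninformative prior, starting values drawn within each variable, and rows treated as exchangeable (no lagged predictors). Pooling is shown with Rubin's rules, the usual pooling rules for multiple imputation; every column is assumed to have at least some observed values.

```python
import numpy as np


def chained_imputation(data, n_iter=10, rng=None):
    """Return one completed copy of `data` (2-D array, NaN marks missing)."""
    rng = np.random.default_rng(rng)
    data = np.asarray(data, dtype=float)
    missing = np.isnan(data)
    filled = data.copy()
    n, p = data.shape

    # Starting values: random draws with replacement from the observed
    # values of the same variable (assumption: drawn within each column).
    for j in range(p):
        mis = missing[:, j]
        if mis.any():
            filled[mis, j] = rng.choice(data[~mis, j], size=mis.sum(), replace=True)

    for _ in range(n_iter):
        for j in range(p):
            mis = missing[:, j]
            if not mis.any():
                continue
            obs = ~mis
            # Predictors: intercept plus the current values of all other variables.
            X = np.column_stack([np.ones(n), np.delete(filled, j, axis=1)])
            X_obs, y_obs = X[obs], filled[obs, j]

            # Draw regression parameters from their posterior
            # (tiny ridge term added only for numerical stability).
            xtx_inv = np.linalg.inv(X_obs.T @ X_obs + 1e-8 * np.eye(X.shape[1]))
            xtx_inv = (xtx_inv + xtx_inv.T) / 2.0
            beta_hat = xtx_inv @ X_obs.T @ y_obs
            resid = y_obs - X_obs @ beta_hat
            df = max(int(obs.sum()) - X.shape[1], 1)
            sigma2 = (resid @ resid) / rng.chisquare(df)
            beta = rng.multivariate_normal(beta_hat, sigma2 * xtx_inv)

            # Draw the missing entries from the posterior predictive distribution.
            noise = rng.normal(0.0, np.sqrt(sigma2), size=mis.sum())
            filled[mis, j] = X[mis] @ beta + noise
    return filled


def pool_rubin(estimates, variances):
    """Pool m point estimates and squared standard errors with Rubin's rules."""
    estimates = np.asarray(estimates, dtype=float)
    variances = np.asarray(variances, dtype=float)
    m = estimates.shape[0]
    pooled = estimates.mean(axis=0)
    within = variances.mean(axis=0)
    between = estimates.var(axis=0, ddof=1)
    total = within + (1.0 + 1.0 / m) * between
    return pooled, np.sqrt(total)
```

Calling `chained_imputation` several times with different seeds gives the multiple imputed data sets; the model of interest is then fit to each one, and the resulting estimates and squared standard errors are passed to `pool_rubin`.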
2 The mixture models factor the full-data model as:
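In generic notation (an assumption here; the article's own symbols may differ), with $y$ the full data, $r$ the missingness indicators, and $\theta$ and $\phi$ the parameters of the two factors, this factorization reads:

$$
p(y, r \mid \theta, \phi) = p(y \mid r, \theta)\, p(r \mid \phi)
$$

A selection model reverses the conditioning, factoring the same joint distribution as $p(y \mid \theta)\, p(r \mid y, \phi)$.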
3 The version of the R package dynr we used in this study contained a small error in handling the uncertainty of the missing data, which resulted in slight overestimation of the process noise variances and correspondingly higher biases for these parameters under the partial MI approach. Without this error, biases for all process noise-related parameters would likely be even lower under the partial MI approach, but conclusions involving other parameters should remain the same.
4 The standard deviation of the point estimates of a parameter across Monte Carlo (MC) runs.
5 Because neither the LD method nor the two-step partial MI method explicitly specifies and estimates a missingness model, no missing data parameter estimates were available from these methods.