Abstract
Health-science researchers often measure psychological constructs using multi-item scales and encounter missing items for some participants. Multiple imputation (MI) has emerged as an alternative to ad-hoc methods (e.g., mean substitution) for handling incomplete data on multi-item scales, appealingly reflecting available information while accounting for uncertainty due to missing values in a unified inferential framework. However, MI can be implemented in a variety of ways. When the number of variables to impute grows large, some strategies yield unstable estimates of quantities of interest while others are not technically feasible to implement. These considerations raise pragmatic questions about the extent to which ad-hoc procedures would yield statistical properties that are competitive with theoretically motivated methods. Drawing on an HIV study where depression and anxiety symptoms are measured with multi-item scales, this empirical investigation contrasts ad-hoc methods for handling missing items with various MI implementations that differ as to whether imputation is at the item level or scale level and how auxiliary variables are incorporated. While the findings are consistent with previous reports favoring item-level imputation when feasible to implement, we found only subtle differences in statistical properties across procedures, suggesting that weaknesses of ad-hoc procedures may be muted when missing data percentages are modest.
Acknowledgements
We would like to thank the study participants for their time commitment in participating in the Adolescent Trials Network (ATN) CARES study and acknowledge the ATN CARES Team members: Sue Ellen Abdalian, Elizabeth Mayfield Arnold, Robert Bolan, Yvonne Bryson, W. Scott Comulada, Ruth Cortado, M. Isabel Fernandez, Risa Flynn, Panteha Hayati Rezvan, Tara Kerin, Jeffrey Klausner, Marguerita Lightfoot, Norweeta Milburn, Karin Nielsen, Manuel Ocasio, Wilson Ramos, Cathy Reback, Mary Jane Rotheram-Borus, Dallas Swendeman, Wenze Tang, and Robert E. Weiss. We also thank the reviewers for improving the quality of this manuscript, including a reviewer who added to our discussion linking FIML to MI for structural equation modeling.
Disclosure Statement
The authors have declared that they have no competing or potential conflicts of interest in relation to the work described.
Ethical Principles
The authors affirm having followed professional ethical guidelines in preparing this work. These guidelines include obtaining informed consent from human participants, maintaining ethical treatment and respect for the rights of human or animal participants, and ensuring the privacy of participants and their data, such as ensuring that individual participants cannot be identified in reported results or from publicly available original or archival data. The University of California Los Angeles (UCLA) Institutional Review Board approved the study (IRB #16-001674-AM-00006), and the trial was registered in www.Clinicaltrials.gov (#NCT03134833).
Role of the Funders/Sponsors
None of the funders or sponsors of this research had any role in the design and conduct of the study; collection, management, analysis, and interpretation of data; preparation, review, or approval of the manuscript; or decision to submit the manuscript for publication.
Notes
1 For comparative purposes, we applied the two-stage algorithm developed by von Hippel (2018), which indicates the number of imputations required for SE estimates to be replicable if the missing data were imputed again. The algorithm suggested that 8 imputed datasets were required to estimate the SEs of the covariates with the desired precision. The Monte Carlo SEs for the 30 estimated regression coefficients, which indicate the variability of the estimates across repeated runs of the MI procedure, showed minor variation (footnotes of Tables S9–S14).
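To illustrate the quantities named in this note, the following is a minimal Python sketch, not the code used in the study. It pools one coefficient across imputed datasets with Rubin's rules, computes the Monte Carlo SE of the pooled estimate as the square root of the between-imputation variance divided by the number of imputations, and applies the quadratic rule from von Hippel (2018) as we understand it, M = 1 + 0.5 (FMI / CV)^2. The function names, the input values, and the default target coefficient of variation of 0.05 are assumptions made for illustration only; the sketch also omits the pilot stage in which a conservative estimate of the fraction of missing information (FMI) is obtained.

```python
import math
from statistics import mean, variance


def pool_rubin(estimates, std_errors):
    """Pool one coefficient across m imputed datasets (Rubin's rules).

    `estimates` and `std_errors` hold the per-imputation point estimates
    and standard errors. Names are hypothetical, for illustration only.
    """
    m = len(estimates)
    q_bar = mean(estimates)                      # pooled point estimate
    within = mean([se ** 2 for se in std_errors])  # within-imputation variance
    between = variance(estimates)                # between-imputation variance (m - 1 divisor)
    total = within + (1 + 1 / m) * between       # total variance
    return {
        "estimate": q_bar,
        "se": math.sqrt(total),
        # Monte Carlo SE: variability of the pooled estimate if the
        # imputation procedure were repeated with the same m.
        "mc_se": math.sqrt(between / m),
    }


def quadratic_rule(fmi, cv=0.05):
    """Imputations needed so the SE estimate has coefficient of variation
    `cv`, given a fraction of missing information `fmi` (assumed formula:
    M = 1 + 0.5 * (fmi / cv) ** 2, rounded up)."""
    return math.ceil(1 + 0.5 * (fmi / cv) ** 2)


# Made-up inputs, not values from the study.
pooled = pool_rubin([1.0, 1.2, 0.8], [0.5, 0.5, 0.5])
needed = quadratic_rule(fmi=0.2, cv=0.05)
```

A small Monte Carlo SE relative to the pooled SE suggests that re-running the imputation would change the reported coefficient very little, which is the sense in which the note describes the variation as minor.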