Abstract
This study evaluated multiple imputation (MI) for missing data in gene expression profiles, comparing it with 3 single-imputation (SI) methods across different datasets and percentages of missing values. Using 3 gene expression profile datasets from human colon cancer, non-small cell lung cancer, and lymph cancer, different deletion rates and different numbers of imputations for MI were compared. Imputation and clustering performance were evaluated using the normalized root mean square error (NRMSE) and the gene clustering accuracy (F value). In all 3 datasets, the NRMSE of all 4 methods gradually increased as the percentage of missing values increased, whereas the F value gradually decreased. At every percentage of missing values in every dataset, the NRMSE of MI was consistently lower than those of the 3 SI methods, and the F value of MI was the highest. The NRMSE of MI gradually decreased as the number of imputations increased and increased as the variability in the original datasets increased, and the datasets imputed by MI showed the best clustering results. The results showed that applying MI develops and enriches imputation-model approaches and provides a solid foundation for the subsequent establishment of imputation strategies for gene expression profiles with missing data.
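The evaluation design described above (delete known values at a chosen rate, impute them, then score the imputed values against the originals) can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and not the authors' implementation: the data are simulated rather than taken from the 3 cancer datasets, row-mean imputation stands in for one possible SI method, the function names are hypothetical, and the NRMSE is taken as the root mean square error over the deleted entries divided by the standard deviation of their true values, which is one commonly used definition that the abstract does not confirm.

```python
import numpy as np

def delete_at_random(data, rate, rng):
    """Return a copy of `data` with about `rate` of its entries set to NaN, plus the mask."""
    mask = rng.random(data.shape) < rate
    incomplete = data.copy()
    incomplete[mask] = np.nan
    return incomplete, mask

def row_mean_impute(incomplete):
    """Single imputation (assumed stand-in): fill each gap with its gene (row) mean."""
    filled = incomplete.copy()
    row_means = np.nanmean(incomplete, axis=1, keepdims=True)
    missing = np.isnan(incomplete)
    filled[missing] = np.broadcast_to(row_means, incomplete.shape)[missing]
    return filled

def nrmse(true, imputed, mask):
    """Assumed definition: RMSE over deleted entries / std of their true values."""
    diff = imputed[mask] - true[mask]
    return float(np.sqrt(np.mean(diff ** 2) / np.var(true[mask])))

# Simulated genes-by-samples matrix: a per-gene level plus sample noise.
rng = np.random.default_rng(0)
genes, samples = 200, 30
expression = rng.normal(size=(genes, 1)) + rng.normal(scale=0.5, size=(genes, samples))

results = {}
for rate in (0.05, 0.10, 0.20):
    incomplete, mask = delete_at_random(expression, rate, rng)
    results[rate] = nrmse(expression, row_mean_impute(incomplete), mask)
    print(f"missing rate {rate:.0%}: NRMSE = {results[rate]:.3f}")
```

An MI method would be scored by the same `nrmse` function, for example after pooling the several imputed datasets, so that SI and MI results are directly comparable at each missing rate.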
Acknowledgments
We are grateful to the Editor, Associate Editor, and two referees for their constructive comments on the article.
Disclosure statement
No potential conflict of interest was reported by the author(s).
Additional information
Funding
Notes on contributors
Wei Ye
Wei Ye is a Postgraduate Student of Biostatistics at the Department of Health Statistics, College of Preventive Medicine, Army Medical University, Chongqing, China.
Ling Zhang
Ling Zhang is a Senior Laboratory Technician at the Department of Health Education, College of Preventive Medicine, Army Medical University, Chongqing, China.
Wenqing Zhang
Wenqing Zhang was an Undergraduate Student of Biostatistics at the Department of Health Statistics, College of Preventive Medicine, Army Medical University, Chongqing, China.
Xiaojiao Wu
Xiaojiao Wu was a Postgraduate Student of Biostatistics at the Department of Health Statistics, College of Preventive Medicine, Army Medical University, Chongqing, China.
Dong Yi
Dong Yi is a Professor at the Department of Health Statistics, College of Preventive Medicine, Army Medical University, Chongqing, China.
Yazhou Wu
Yazhou Wu is a Director and Professor at the Department of Health Statistics, College of Preventive Medicine, Army Medical University, Chongqing, China.