Research Article

Missing Data Analysis in Regression

Article: 2032925 | Received 17 May 2021, Accepted 18 Jan 2022, Published online: 13 Feb 2022

References

  • Alcala-Fdez, J., A. Fernandez, J. Luengo, J. Derrac, S. Garcia, L. Sanchez, and F. Herrera. 2011. KEEL data-mining software tool: Data set repository, integration of algorithms and experimental analysis framework. Journal of Multiple-Valued Logic and Soft Computing 17 (2–3):255–87.
  • Altman, N. 1992. An introduction to kernel and nearest-neighbor nonparametric regression. The American Statistician 46 (3):175–85.
  • Andridge, R., and R. Little. 2010. A review of hot deck imputation for survey non-response. International Statistical Review 78 (1):40–64. doi:10.1111/j.1751-5823.2010.00103.x.
  • Bache, K., and M. Lichman. 2013. UCI machine learning repository. Available at http://archive.ics.uci.edu/ml.
  • Bishop, C. 2016. Pattern recognition and machine learning. doi:10.1117/1.2819119.
  • Breiman, L. 2001. Random forests. 1–33. doi:10.1017/CBO9781107415324.004.
  • Breiman, L., J. Friedman, R. Olshen, and C. Stone. 1984. Classification and regression trees. CRC Press.
  • Buuren, S. 2018. Flexible imputation of missing data. doi:10.1201/b11826.
  • Chand, M., B. Bhattarai, N. Pradhananga, and P. Baral. 2021. Trend analysis of temperature data for the Narayani river basin, Nepal. Science 1:38. doi:10.3390/sci1020038.
  • de Myttenaere, A., B. Golden, B. Le Grand, and F. Rossi. 2016. Mean absolute percentage error for regression models. Neurocomputing 192:38–48. doi:10.1016/j.neucom.2015.12.114.
  • Deng, L., and D. Yu. 2013. Deep learning: Methods and applications. Foundations and Trends® in Signal Processing 7 (3–4):197–387. doi:10.1561/2000000039.
  • Du, H., F. Liu, and L. Wang. 2017. A Bayesian fill-in method for correcting for publication bias in meta-analysis. Psychological Methods 22 (4):799–817. doi:10.1037/met0000164.
  • Freund, Y., and R. Schapire. 1995. A decision-theoretic generalization of on-line learning and an application to boosting. Computational Learning Theory 55:119–39.
  • Gelman, A., and J. Hill. 2007. Data analysis using regression and multilevel/hierarchical models. doi:10.2277/0521867061.
  • Haykin, S. 1998. Neural networks: A comprehensive foundation. 2nd ed. Upper Saddle River, NJ, USA: Prentice Hall PTR.
  • Howell, D. C. 2007. The treatment of missing data. The Sage Handbook of Social Science Methodology 1:208–24.
  • Jadhav, A., D. Pramod, and K. Ramanathan. 2019. Comparison of performance of data imputation methods for numeric dataset. Applied Artificial Intelligence 33 (10):913–33. doi:10.1080/08839514.2019.1637138.
  • Kohavi, R., and G. John. 1997. Wrappers for feature subset selection. Artificial Intelligence 97 (1–2):273–324. doi:10.1016/S0004-3702(97)00043-X.
  • Li, H., Y. Weng, Y. Liao, B. Keel, and K. E. Brown. 2021. Distribution grid impedance and topology estimation with limited or no micro-PMUs. International Journal of Electrical Power and Energy Systems 129:106794. doi:10.1016/j.ijepes.2021.106794.
  • Little, R., and D. Rubin. 2002. Statistical analysis with missing data. doi:10.2307/1533221.
  • Ludtke, O., A. Robitzsch, and S. Grund. 2017. Multiple imputation of missing data in multilevel designs: A comparison of different strategies. Psychological Methods 22 (1):141–65. doi:10.1037/met0000096.
  • MacKay, D. 1992. Bayesian interpolation. Neural Computation 4 (3):415–47. doi:10.1162/neco.1992.4.3.415.
  • Madden, G., N. Apergis, P. Rappoport, and A. Banerjee. 2018. An application of nonparametric regression to missing data in large market surveys. Journal of Applied Statistics 45 (7):1292–302. doi:10.1080/02664763.2017.1369498.
  • Markovsky, I. 2009. Applications of structured low-rank approximation. IFAC Proceedings Volumes 42 (10):1121–26. doi:10.3182/20090706-3-FR-2004.00186.
  • Mazumder, R., T. Hastie, and R. Tibshirani. 2010. Spectral regularization algorithms for learning large incomplete matrices. Journal of Machine Learning Research 11:2287–322.
  • Ngueilbaye, A., H. Wang, D. Ahmat Mahamat, and S. Junaidu. 2021. Modulo 9 model-based learning for missing data imputation. Applied Soft Computing Journal 103:107167. doi:10.1016/j.asoc.2021.107167.
  • Pedregosa, F., G. Varoquaux, A. Gramfort, and V. Michel. 2011. Scikit-learn: Machine learning in Python. Journal of Machine Learning Research 12:2825–30.
  • Peng, L., and L. Lei. 2005. A review of missing data treatment methods. Intelligent Information Management Systems and Technology 1 (3):412–19.
  • Perez-Ruiz, L., and G. Escarela. 2018. Joint regression modeling for missing categorical covariates in generalized linear models. Journal of Applied Statistics 45:2741–59. doi:10.1080/02664763.2018.1438376.
  • Rubin, D. B. 1976. Inference and missing data. Biometrika 63 (3):581–92. doi:10.1093/biomet/63.3.581.
  • Saar-Tsechansky, M., and F. Provost. 2007. Handling missing values when applying classification models. Journal of Machine Learning Research 8:1625–57.
  • Santos, M. S., R. C. Pereira, A. F. Costa, J. P. Soares, J. Santos, and P. H. Abreu. 2019. Generating synthetic missing data: A review by missing mechanism. IEEE Access 7:11651–67. doi:10.1109/ACCESS.2019.2891360.
  • Schafer, J., and J. Graham. 2002. Missing data: Our view of the state of the art. Psychological Methods 7 (2):147–77. doi:10.1037/1082-989X.7.2.147.
  • Schouten, R. M., P. Lugtig, and G. Vink. 2018. Generating missing values for simulation purposes: A multivariate amputation procedure. Journal of Statistical Computation and Simulation 88 (15):2909–30. doi:10.1080/00949655.2018.1491577.
  • Shokouhyar, S., S. Ahmadi, and M. Ashrafzadeh. 2021. Promoting a novel method for warranty claim prediction based on social network data. Reliability Engineering & System Safety 216:108010. doi:10.1016/j.ress.2021.108010.
  • Steinwart, I., and A. Christmann. 2008. Support vector machines. 1st ed. Berlin: Springer Publishing Company, Incorporated.
  • Tachmazidis, I., T. Chen, M. Adamou, and G. Antoniou. 2021. A hybrid AI approach for supporting clinical diagnosis of attention deficit hyperactivity disorder (ADHD) in adults. Health Information Science and Systems 9 (1):1–8. doi:10.1007/s13755-020-00123-7.
  • van Buuren, S., and K. Groothuis-Oudshoorn. 2011. mice: Multivariate imputation by chained equations in R. Journal of Statistical Software 45 (3):1–67. doi:10.18637/jss.v045.i03.
  • Zhang, Q., and L. Wang. 2017. Moderation analysis with missing data in the predictors. Psychological Methods 22 (4):649–66. doi:10.1037/met0000104.