ABSTRACT
The main concern with diabetes-related complications is that they often go unrecognised in the early stages but can become irreversible and devastating over time. Identifying the population at high risk of developing such complications enables early intervention through preventative care. This study presents a data-driven approach to predicting which patients are at higher risk of diabetes-related complications using real-world data. We used comorbid diagnostic features from the electronic health records database “Cerner Health Facts EMR Data” to build machine learning-based prediction models for three long-term diabetes-related complications: (a) eye diseases, (b) kidney diseases, and (c) neuropathy. The pipeline we developed generated highly accurate prediction models. The F1-scores indicate that applying class balancing techniques improved the overall performance of the models, and that a support vector machine (SVM) with oversampling was the most consistent classifier across all three cohorts.
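As an illustration of the two ideas the abstract relies on, class balancing by oversampling and evaluation by F1-score, the following is a minimal sketch in plain Python. It is not the study's pipeline: the function names `random_oversample` and `f1_score`, the toy data, and the choice of simple random duplication of minority-class rows are all assumptions made for illustration, and the abstract does not state which oversampling variant was used.

```python
import random
from collections import Counter


def random_oversample(X, y, seed=0):
    """Duplicate randomly chosen rows of each smaller class until every
    class has as many rows as the largest class (hypothetical helper)."""
    rng = random.Random(seed)
    counts = Counter(y)
    target = max(counts.values())
    X_out, y_out = list(X), list(y)
    for label, n in counts.items():
        rows = [x for x, lab in zip(X, y) if lab == label]
        for _ in range(target - n):
            X_out.append(rng.choice(rows))
            y_out.append(label)
    return X_out, y_out


def f1_score(y_true, y_pred, positive=1):
    """F1 = harmonic mean of precision and recall for the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    if tp == 0:
        return 0.0  # no true positives: precision and recall are zero or undefined
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Toy data (assumed): four patients without the complication, one with it.
X = [[0], [1], [2], [3], [4]]
y = [0, 0, 0, 0, 1]
X_bal, y_bal = random_oversample(X, y)
print(Counter(y_bal))  # both classes now have four rows
```

In practice, oversampling would be applied only to the training split, so that duplicated rows do not leak into the data used to compute the F1-score.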
Disclosure statement
No potential conflict of interest was reported by the author(s).