Research Article

Intelligent type 2 diabetes risk prediction from administrative claim data


