Research Article

Intelligent type 2 diabetes risk prediction from administrative claim data

Pages 243-257 | Published online: 21 Oct 2021
 

ABSTRACT

Type 2 diabetes is a chronic, costly disease and a serious global population health problem. Yet the disease is manageable and preventable when there is early warning. This study applies supervised machine learning algorithms to develop predictive models for type 2 diabetes from administrative claim data. Following guidelines from the Elixhauser Comorbidity Index, 31 variables were considered. Five supervised machine learning algorithms were used to develop type 2 diabetes prediction models. Principal component analysis was applied to rank the variables’ importance in the predictive models. Random forest (RF) showed the highest accuracy (85.06%) among the algorithms, closely followed by k-nearest neighbor (84.48%). The analysis further showed that RF performs well irrespective of data imbalance. According to the principal component analysis, patient age is the most important predictor of type 2 diabetes, followed by a comorbid condition (solid tumor without metastasis). The finding that RF is the best-performing classifier is consistent with the promise shown by tree-based algorithms on public data in other works. The outcome can therefore guide the design of automated surveillance of patients at risk of developing diabetes from administrative claim information, and will be useful to health regulators and insurers.
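
As a rough illustration of the kind of evaluation the abstract describes (fit a classifier on claim-derived variables, then report accuracy on held-out patients), the sketch below implements a small k-nearest neighbor classifier in plain Python. It is not the authors' code or data: the records are synthetic, the labelling rule is invented, and the helper names (make_synthetic_claims, knn_predict, accuracy) are hypothetical. The 31 columns (one age value plus 30 binary flags) only mirror the variable count mentioned in the abstract; the actual variables are not reproduced here.

```python
import math
import random


def make_synthetic_claims(n, n_flags=30, seed=0):
    """Build synthetic records: a scaled age plus binary comorbidity flags.

    Both the data and the labelling rule are assumptions made for
    illustration; they do not come from the study.
    """
    rng = random.Random(seed)
    rows, labels = [], []
    for _ in range(n):
        age = rng.uniform(20, 90)
        flags = [1 if rng.random() < 0.05 else 0 for _ in range(n_flags)]
        # Hypothetical rule: risk rises with age and with the first flag.
        score = (age - 55) / 10 + 1.5 * flags[0] + rng.gauss(0, 1)
        rows.append([(age - 55) / 20] + flags)  # centre and scale age
        labels.append(1 if score > 0 else 0)
    return rows, labels


def knn_predict(train_x, train_y, x, k=5):
    """Majority vote among the k training rows closest to x (Euclidean)."""
    nearest = sorted(
        (math.dist(row, x), label) for row, label in zip(train_x, train_y)
    )[:k]
    votes = sum(label for _, label in nearest)
    return 1 if votes * 2 > k else 0


def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


# Hold out the last 150 of 600 synthetic records for evaluation.
X, y = make_synthetic_claims(600)
train_x, test_x = X[:450], X[450:]
train_y, test_y = y[:450], y[450:]

preds = [knn_predict(train_x, train_y, x) for x in test_x]
acc = accuracy(test_y, preds)
print(f"held-out accuracy: {acc:.2%}")
```

The accuracy printed here says nothing about the figures reported in the abstract, since the data are invented. In practice, a random forest or k-nearest neighbor model would more likely be fitted with an established machine learning library than written by hand.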

Acknowledgments

The data used for this study have been provided by a large Australian health insurer in a de-identified form and have been used as per the approved MOU. The authors would like to acknowledge the organization for its support.

Disclosure statement

The authors of this article declare that they do not have any competing interests.

Availability of data and materials

The data used in this study can be obtained in an abstract format upon request.

Authors’ contributions

SU: originator of the idea, data analysis, and writing; TI: data analysis and writing; MEH: data analysis; EG: writing; OAS: writing; MAM: writing; AAM: data analysis and writing; VV: data analysis and writing.

Additional information

Funding

This study did not receive any funding.
