ORIGINAL RESEARCH

Simple Method to Predict Insulin Resistance in Children Aged 6–12 Years by Using Machine Learning

Pages 2963-2975 | Received 08 Jul 2022, Accepted 13 Sep 2022, Published online: 27 Sep 2022
 

Abstract

Background

Because insulin resistance (IR) in childhood is becoming more common, rates of diabetes and cardiovascular disease may rise in the future and seriously threaten the healthy development of children. An easy way to predict IR in children would help pediatricians identify affected children in time and intervene appropriately, which is particularly important for practitioners in primary health care.

Patients and Methods

Seventeen features were collected from 503 children aged 6–12 years. We defined IR as a HOMA-IR greater than 3.0 and classified children as having or not having IR accordingly. Missing values were handled by multivariate imputation and class imbalance by oversampling; recursive feature elimination was then applied to select features of interest, and 5 machine learning methods, namely logistic regression (LR), support vector machine (SVM), random forest (RF), extreme gradient boosting (XGBoost), and gradient boosting with categorical features support (CatBoost), were used for model training. We tested the trained models on an external test set of 133 children, extracted performance metrics from the results, and selected the optimal model.
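The labelling step above can be sketched in a few lines. This is a minimal illustration, not the authors' code: it assumes the conventional HOMA-IR formula (fasting glucose in mmol/L multiplied by fasting insulin in µU/mL, divided by 22.5), which the abstract does not spell out, and the function names and sample values are hypothetical. Only the 3.0 cut-off comes from the text.

```python
# Sketch of the IR labelling rule, assuming the conventional HOMA-IR formula.
# The 3.0 cut-off is from the abstract; everything else here is illustrative.

IR_THRESHOLD = 3.0  # HOMA-IR above this value is classified as IR


def homa_ir(fasting_glucose_mmol_l: float, fasting_insulin_uu_ml: float) -> float:
    """Conventional HOMA-IR: glucose (mmol/L) x insulin (uU/mL) / 22.5."""
    return fasting_glucose_mmol_l * fasting_insulin_uu_ml / 22.5


def has_ir(fasting_glucose_mmol_l: float, fasting_insulin_uu_ml: float,
           threshold: float = IR_THRESHOLD) -> bool:
    """Return True when HOMA-IR is strictly greater than the threshold."""
    return homa_ir(fasting_glucose_mmol_l, fasting_insulin_uu_ml) > threshold


# Hypothetical children (made-up values, not study data).
children = [
    {"id": "A", "glucose": 4.5, "insulin": 10.0},
    {"id": "B", "glucose": 5.0, "insulin": 18.0},
]

# Binary target used for model training: True = IR, False = no IR.
labels = {c["id"]: has_ir(c["glucose"], c["insulin"]) for c in children}
print(labels)
```

Because the rule is "greater than 3.0", a child whose HOMA-IR is exactly 3.0 falls in the non-IR class under this sketch.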

Results

After feature selection, the numbers of chosen features for the LR, SVM, RF, XGBoost, and CatBoost models were 6, 9, 10, 14, and 6, respectively. Glucose, waist circumference, and age were chosen as predictors by most of the models. All 5 models achieved good performance on the external test set. XGBoost and CatBoost had the same AUC (0.85), the highest among all models. Their accuracy, sensitivity, precision, and F1 scores were also close, but the specificity of XGBoost reached 0.79, which was significantly higher than that of CatBoost, so XGBoost was chosen as the optimal model.
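For readers less familiar with the metrics compared above, the following sketch shows how each is derived from true labels and model outputs. It uses the standard textbook definitions and made-up data; the function names are hypothetical, and nothing here reproduces the study's numbers or its actual evaluation code.

```python
# Standard definitions of the reported metrics, on hypothetical data.
# Label 1 = IR, label 0 = no IR.

def classification_metrics(y_true, y_pred):
    """Accuracy, sensitivity, specificity, precision and F1 from binary labels."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)

    def ratio(num, den):
        # Guard against empty denominators (e.g. no positive predictions).
        return num / den if den else 0.0

    sensitivity = ratio(tp, tp + fn)   # recall on children with IR
    specificity = ratio(tn, tn + fp)   # recall on children without IR
    precision = ratio(tp, tp + fp)
    return {
        "accuracy": ratio(tp + tn, len(pairs)),
        "sensitivity": sensitivity,
        "specificity": specificity,
        "precision": precision,
        "f1": ratio(2 * precision * sensitivity, precision + sensitivity),
    }


def roc_auc(y_true, scores):
    """AUC as the share of (IR, non-IR) pairs ranked correctly; ties count 0.5."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical model scores for 10 children (4 with IR, 6 without).
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
scores = [0.9, 0.8, 0.6, 0.4, 0.7, 0.3, 0.2, 0.2, 0.1, 0.1]
y_pred = [1 if s >= 0.5 else 0 for s in scores]  # 0.5 cut-off is an assumption

metrics = classification_metrics(y_true, y_pred)
auc = roc_auc(y_true, scores)
print(metrics, auc)
```

AUC depends only on how the scores rank children, whereas specificity depends on the chosen decision threshold, which is why two models can share an AUC and still differ in specificity.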

Conclusion

The model developed herein has a good predictive ability for IR in children 6–12 years old and can be clinically applied to help pediatricians identify children with IR in a simple and inexpensive way.

Abbreviations

IR, insulin resistance; AUC, area under the curve; ROC, receiver operating characteristic; T2DM, type-2 diabetes; AI, artificial intelligence; ML, machine learning; HOMA-IR, homeostatic model assessment for insulin resistance; CHNS, China Health and Nutrition Survey; CCDC, Chinese Center for Disease Control and Prevention; NINH, National Institute for Nutrition and Health; SMOTE, Synthetic Minority Oversampling Technique; RFE, recursive feature elimination; METS, metabolic syndrome; LR, logistic regression; SVM, support vector machine; RF, random forest; XGBoost, extreme gradient boosting; CatBoost, gradient boosting with categorical features support; BMI, body mass index; HGB, haemoglobin; WBC, leukocytes; RBC, erythrocytes; PLT, platelets; SBP, systolic blood pressure; DBP, diastolic blood pressure.

Data Sharing Statement

The datasets analysed during the current study for model construction are available in the CHNS repository, https://www.cpc.unc.edu/projects/china. The datasets analysed during the current study for model testing are available from the corresponding author on reasonable request.

Ethics Approval and Consent to Participate

The study for model construction was an analysis of a third-party, anonymized, publicly available database with pre-existing institutional review board approval. The study for model testing was performed according to the World Medical Association’s Declaration of Helsinki and was approved by the institutional review board and the ethics committee of the Beijing Jishuitan Hospital, Beijing, China (201808-03). Once both the participants and their parents or legal guardians had agreed to participation in the study, the parents or legal guardians gave written informed consent. We confirm that all methods in this study were carried out in accordance with the relevant guidelines and regulations.

Acknowledgments

We thank all the participants for their involvement in the study.

Author Contributions

All authors made a significant contribution to the work reported, whether that is in the conception, study design, execution, acquisition of data, analysis and interpretation, or in all these areas; took part in drafting, revising or critically reviewing the article; gave final approval of the version to be published; have agreed on the journal to which the article has been submitted; and agree to be accountable for all aspects of the work.

Disclosure

The authors declare that they have no competing interests.

Additional information

Funding

This research was funded by The Special Fund of the Pediatric Medical Coordinated Development Center of Beijing Hospital Authority, No. XTZD20180401, which provided financial support for the design of the study; the collection, analysis, and interpretation of data; and the writing of the manuscript.