Figures & data
Table 1. Checkup variables under study.
Figure 2. Machine learning model architectures: (a) logistic regression; (b) naïve Bayes; (c) support vector machine; (d) random forest; (e) extremely randomized tree; (f) extreme gradient boosting; (g) light gradient boosting machine; (h) multilayer perceptron.
![Figure 2. Machine learning model architectures: (a) logistic regression; (b) naïve Bayes; (c) support vector machine; (d) random forest; (e) extremely randomized tree; (f) extreme gradient boosting; (g) light gradient boosting machine; (h) multilayer perceptron.](/cms/asset/58280c18-2c82-4b34-9f1b-40d28224365b/uaai_a_2145644_f0002_oc.jpg)
Table 2. General characteristics of the two datasets in the study design.
Figure 3. Odds ratio plots for statistically significant features. Plot (a) shows the change in the odds of progressing from normal to prediabetes for a one-unit increase in each feature, and plot (b) shows the change in the odds of progressing from prediabetes to diabetes for a one-unit increase in each feature.
![Figure 3. Odds ratio plots for statistically significant features. Plot (a) shows the change in the odds of progressing from normal to prediabetes for a one-unit increase in each feature, and plot (b) shows the change in the odds of progressing from prediabetes to diabetes for a one-unit increase in each feature.](/cms/asset/bee5dcdd-392a-4496-b657-f8c2358e12e7/uaai_a_2145644_f0003_oc.jpg)
Table 3. The top-10 ranked variables by permutation feature importance for each ML model in the two datasets.
Table 4. Variable ranking for all 8 models by permutation feature importance.
Table 5. Variables selected by the Boruta, SelectKBest, and Lasso methods.
Table 6. Performance measures of each classification algorithm.
Figure 5. Violin plots of (a) prediabetes progression and (b) diabetes progression associated with FBG, HbA1c, and Hct.
![Figure 5. Violin plots of (a) prediabetes progression and (b) diabetes progression associated with FBG, HbA1c, and Hct.](/cms/asset/07f63513-bbf0-41d1-bb58-68a83a4502da/uaai_a_2145644_f0005_oc.jpg)