Figures & data
Figure 1. Examples of structuring health records in text form: (A) shows the segmentation of the text record from a real case, while (B) shows the physical complaints extracted from this record.
![Figure 1. Examples of structuring health records in text form: (A) shows the segmentation of the text record from a real case, while (B) shows the physical complaints extracted from this record.](/cms/asset/f9a3be4d-8e2f-4054-b09e-b66f13244873/iann_a_2314237_f0001_c.jpg)
Table 1. Frequencies of the symptoms present in the study sample (N = 158,988).
Table 2. The sensitivity and specificity of the model at different cutoff values.
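As a minimal sketch of what a table like Table 2 reports, the snippet below computes sensitivity and specificity at several cutoffs. The function name, the toy labels, and the toy scores are hypothetical and invented for illustration; they are not the study's data or code, and the rule "score >= cutoff means positive" is an assumption.

```python
def sensitivity_specificity(y_true, scores, cutoff):
    """Return (sensitivity, specificity), calling a case positive when score >= cutoff."""
    tp = fn = tn = fp = 0
    for label, score in zip(y_true, scores):
        predicted_positive = score >= cutoff
        if label == 1:
            if predicted_positive:
                tp += 1
            else:
                fn += 1
        else:
            if predicted_positive:
                fp += 1
            else:
                tn += 1
    # Sensitivity: share of true cases detected; specificity: share of non-cases cleared.
    sensitivity = tp / (tp + fn) if (tp + fn) else float("nan")
    specificity = tn / (tn + fp) if (tn + fp) else float("nan")
    return sensitivity, specificity


# Toy labels (1 = case, 0 = non-case) and model scores, invented for illustration only.
y_true = [1, 1, 1, 0, 0, 0, 0, 1]
scores = [0.9, 0.7, 0.4, 0.6, 0.3, 0.2, 0.1, 0.8]

for cutoff in (0.25, 0.5, 0.75):
    sens, spec = sensitivity_specificity(y_true, scores, cutoff)
    print(f"cutoff={cutoff:.2f} sensitivity={sens:.2f} specificity={spec:.2f}")
```

Raising the cutoff trades sensitivity for specificity, which is why such tables list several cutoffs rather than one.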
Figure 2. Correlations between each pair of physical complaints in the study sample (N = 158,988). (Cells with a dot indicate a statistically significant correlation. Blue and red represent positive and negative correlations, respectively. Correlations were measured using Spearman’s rank correlation coefficient, and statistical significance was determined using a Z-test.)
![Figure 2. Correlations between each pair of physical complaints in the study sample (N = 158,988). (Cells with a dot indicate a statistically significant correlation. Blue and red represent positive and negative correlations, respectively. Correlations were measured using Spearman’s rank correlation coefficient, and statistical significance was determined using a Z-test.)](/cms/asset/e7c6e96d-deb8-4f30-9e94-405a0e9942d4/iann_a_2314237_f0002_c.jpg)
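The kind of pairwise test named in the Figure 2 caption can be sketched as follows. The caption does not say which Z-test was used, so this example assumes the large-sample approximation z = rho * sqrt(n - 1) under the null hypothesis of no association. The function names and the two toy complaint indicators are hypothetical.

```python
import math


def average_ranks(values):
    """Rank values from 1 to n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def spearman_rho(x, y):
    """Spearman's rho: the Pearson correlation of the tie-averaged ranks."""
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = sum(rx) / len(rx), sum(ry) / len(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    return cov / math.sqrt(var_x * var_y)


def z_test_p_value(rho, n):
    """Two-sided p-value, ASSUMING z = rho * sqrt(n - 1) is standard normal under H0."""
    z = rho * math.sqrt(n - 1)
    return math.erfc(abs(z) / math.sqrt(2))


# Toy 0/1 indicators for two complaints across ten records (invented, not study data).
complaint_a = [1, 1, 0, 0, 1, 0, 1, 0, 1, 0]
complaint_b = [1, 1, 0, 0, 1, 0, 0, 1, 1, 0]

rho = spearman_rho(complaint_a, complaint_b)
p_value = z_test_p_value(rho, len(complaint_a))
print(f"rho={rho:.3f} p={p_value:.3f}")
```

With a sample as large as N = 158,988, even very small correlations can reach statistical significance under a test like this, so the colour scale (effect size) is worth reading alongside the dots.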
Figure 3. Patterns of physical complaints: (A) is the pattern of CHB patients; and (B) is the pattern of non-CHB people (symptoms in the same cluster usually present at the same time, and vice versa).
![Figure 3. Patterns of physical complaints: (A) is the pattern of CHB patients; and (B) is the pattern of non-CHB people (symptoms in the same cluster usually present at the same time, and vice versa).](/cms/asset/e47621d2-ec6b-484d-b650-9c44aa16c48c/iann_a_2314237_f0003_b.jpg)
Figure 5. Importance of the predictors in the XGB model based on physical complaints and clinical parameters (the X-axis is the SHAP value of each predictor. Predictors with higher SHAP values contribute more predictive power and are more important).
![Figure 5. Importance of the predictors in the XGB model based on physical complaints and clinical parameters (the X-axis is the SHAP value of each predictor. Predictors with higher SHAP values contribute more predictive power and are more important).](/cms/asset/1c01ae99-f1aa-4383-97da-0662bcd86520/iann_a_2314237_f0005_c.jpg)
Figure 6. Nonlinear effects of the top ten important predictors in the XGB model based on symptoms and clinical parameters.
![Figure 6. Nonlinear effects of the top ten important predictors in the XGB model based on symptoms and clinical parameters.](/cms/asset/e363c31d-7805-4fb1-8ea2-b17d91e1712a/iann_a_2314237_f0006_c.jpg)
Figure 7. ROC curves of different HBV detection models based on symptoms and common clinical parameters on the test sample (n = 9,131).
![Figure 7. ROC curves of different HBV detection models based on symptoms and common clinical parameters on the test sample (n = 9,131).](/cms/asset/81f91cc9-e58c-40b1-aa3d-f23f150a5ff9/iann_a_2314237_f0007_c.jpg)
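For readers unfamiliar with how an ROC curve like those in Figure 7 is traced, here is a minimal sketch that sweeps the decision threshold over every distinct score and integrates the curve with the trapezoid rule. The function names and toy data are hypothetical, and the code assumes at least one positive and one negative label.

```python
def roc_points(y_true, scores):
    """Return (false positive rate, true positive rate) pairs, one per distinct threshold."""
    n_pos = sum(y_true)
    n_neg = len(y_true) - n_pos  # assumes both classes are present
    points = [(0.0, 0.0)]
    # Lowering the threshold step by step moves the curve from (0, 0) to (1, 1).
    for threshold in sorted(set(scores), reverse=True):
        tp = sum(1 for y, s in zip(y_true, scores) if y == 1 and s >= threshold)
        fp = sum(1 for y, s in zip(y_true, scores) if y == 0 and s >= threshold)
        points.append((fp / n_neg, tp / n_pos))
    return points


def auc_trapezoid(points):
    """Area under the ROC curve by the trapezoid rule."""
    area = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        area += (x1 - x0) * (y0 + y1) / 2
    return area


# Toy labels and scores, invented for illustration only (not study data).
y_true = [1, 1, 1, 0, 0, 0, 0, 1]
scores = [0.9, 0.7, 0.4, 0.6, 0.3, 0.2, 0.1, 0.8]

curve = roc_points(y_true, scores)
print(f"AUC={auc_trapezoid(curve):.4f}")
```

Each point on the curve corresponds to one cutoff, so a sensitivity/specificity table is effectively a sample of points from the same curve.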
Supplemental Material
Zip archive (8.3 MB)
Zip archive (3.5 MB)
Data availability statement
The data from this study are available from the corresponding author upon reasonable request.