1,735
Views
1
CrossRef citations to date
0
Altmetric
Research Article

Machine learning algorithm to predict the in-hospital mortality in critically ill patients with chronic kidney disease

, , ORCID Icon, , , & show all
Article: 2212790 | Received 25 Oct 2022, Accepted 05 May 2023, Published online: 19 May 2023

Abstract

Background

This study aimed to establish and validate a machine learning (ML) model for predicting in-hospital mortality in critically ill patients with chronic kidney disease (CKD).

Methods

This study collected data on CKD patients from 2008 to 2019 using the Medical Information Mart for Intensive Care IV. Six ML approaches were used to build the model. Accuracy and area under the curve (AUC) were used to choose the best model. In addition, the best model was interpreted using SHapley Additive exPlanations (SHAP) values.

Results

There were 8527 CKD patients eligible for participation; the median age was 75.1 (interquartile range: 65.0–83.5) years, and 61.7% (5259/8527) were male. We developed six ML models with clinical variables as input factors. Among the six models developed, the eXtreme Gradient Boosting (XGBoost) model had the highest AUC, at 0.860. According to the SHAP values, the sequential organ failure assessment score, urine output, respiratory rate, and simplified acute physiology score II were the four most influential variables in the XGBoost model.

Conclusions

In conclusion, we successfully developed and validated ML models for predicting mortality in critically ill patients with CKD. Among all ML models, the XGBoost model is the most effective ML model that can help clinicians accurately manage and implement early interventions, which may reduce mortality in critically ill CKD patients with a high risk of death.

Background

Chronic kidney disease (CKD) has become a severe global health concern, with roughly 700 million individuals currently suffering from CKD [Citation1]. CKD is one of the world’s top 10 leading causes of mortality, impacting around 15% of the adult population [Citation2]. As the intensive care unit (ICU) population shifts, many patients have preexisting CKD [Citation3,Citation4]. In recent years, despite more medical resources devoted to treating CKD, the life expectancy of CKD patients remains much lower than that of the general population [Citation5]. According to one study, the risk of death is three times greater for patients with CKD in the ICU than those without CKD [Citation6]. Early identification of CKD patients at high risk for clinical deterioration is of great importance and may help to deliver proper care and optimize the use of limited resources [Citation7]. Thus, the clinical practice may benefit from developing predictive models that can accurately predict an individual’s survival prognosis.

Machine learning (ML) algorithms may offer an opportunity to reduce the risk of death from CKD in critically ill patients through their ability to analyze the vast amount of data in electronic health records. These data may include patient diagnoses, demographics, routinely obtained measures, and therapies. These cutting-edge data-driven methods can handle data with a high dimension, analyze complex relationships, and isolate essential predictors of outcomes. They are more flexible than traditional modeling techniques, which require predictors to be independent of each other and use variables selected primarily based on the statistical significance or clinical importance [Citation8,Citation9]. In recent years, ML methods have been widely used in the prognostic assessment of diseases [Citation10–12]. Clinicians may better screen for and identify patients at high risk of adverse outcomes with a well-built prediction model, allowing for more prompt intervention and better outcomes. Unfortunately, no ML model can predict in-hospital mortality among critically ill CKD patients. This research aimed to establish and validate an ML model for predicting in-hospital mortality in critically ill patients with CKD.

Methods

Database introduction

The Medical Information Mart for Intensive Care IV (MIMIC IV) database is a comprehensive, anonymized clinical dataset approved by the Massachusetts Institute of Technology [Citation13]. The MIMIC IV database contains data on all Beth Israel Deaconess Medical Center ICU patients between 2008 and 2019. As all patients in the database are anonymous and have no impact on clinical decision-making, the requirement for patient consent and ethically informed consent declarations was waived [Citation14]. One author (XL) passed the Protecting Human Research Participants exam of the National Institutes of Health (record ID: 35970146) and gained permissible access to the MIMIC IV database.

Study population

This research comprised all patients diagnosed with CKD who were enrolled in MIMIC IV. The diagnosis of CKD was based on the International Classification of Diseases, Ninth Revision (ICD-9) codes (5851, 5852, 5853, 5854, 5855, 5856, 5859), and International Classification of Diseases, Tenth Revision (ICD-10) codes (N18, N181, N182, N183, N184, N185, N186, and N189), which were recorded by hospital staff at the time of patient discharge. Patients admitted to the ICU more than once had their first admission counted. We eliminated patients under 18 and those who spent less than 24 h in the ICU.

Data collection

This study identified candidate variables for the model based on clinical expertise and previous studies [Citation1]. We used Navicat Premium to extract the demographic and clinical data from the MIMIC IV database. The research gathered age, gender, weight, ethnicity, and admission type as demographic factors. Medical conditions included congestive heart failure, peptic ulcer disease, myocardial infarction, peripheral vascular disease, diabetes, dementia, chronic pulmonary disease, rheumatic disease, cerebrovascular disease, cancer, paraplegia, liver disease, and acquired immune deficiency syndrome. Vital signs data, including heart rate, mean arterial pressure, respiratory rate, temperature, and oxygen saturation are averaged over the first 24 h after admission to the ICU. Laboratory results included hematocrit, hemoglobin, platelets, white blood cell, blood urea nitrogen, anion gap, international normalized ratio, serum creatinine, serum glucose, serum calcium, serum chloride, bicarbonate, serum potassium, serum sodium, partial thromboplastin time, and prothrombin time, all of which were maximum values within 24 h of admission to the ICU. The urine volume is recorded as the total value of the first 24 h after admission to the ICU. In addition, we recorded medical treatments such as renal replacement therapy, vasopressor use, and mechanical ventilation during the first 24 h following ICU admission. During the first 24 h following ICU admission, we determined the sequential organ failure assessment (SOFA) score and the first value of the simplified acute physiology score II (SAPS II) to use as the severity scores of illness. We also collected the patients’ CKD stage and estimated glomerular filtration rate (eGFR). Comorbidities for this study were defined according to the ICD-9 codes and ICD-10 codes [Citation15].

Endpoints

The endpoint of this study was in-hospital mortality.

Preprocessing of data

There were less than 20% missing values for any variable in this study (Supplementary Table S1). The multiple interpolation methods are better for dealing with missing data below 20%. The multiple interpolation methods allow for the creation of multiple reasonable, fully interpolated datasets, which are first analyzed individually, and then their results are combined into a single result. We created 20 fully interpolated datasets using Python’s ‘micforest’ package and then pooled with Rubin’s rules. The estimated parameters were then pooled with Rubin’s rules [Citation16].

Statistical analysis

Continuous variables in this study were expressed as the median and interquartile range (IQR), and the Mann–Whitney test was used to determine differences between groups due to their non-normal distribution. Categorical variables were expressed as numbers and percentages, and group comparisons were made using the Chi-square test or Fisher’s exact test, as appropriate.

Statistical analyses were performed using R software (version 4.2.1) (R Foundation for Statistical Computing, Vienna, Austria) and Python (version 3.9.12). A p value <.05 was considered to be statistically significant.

Machine learning

This study performed a hierarchical fivefold cross-validation to obtain the training set and validation sets. The study population was randomly divided into five subsets. Four of these subsets (80%) were combined as the training set, while the remaining (20%) were made the validation set, and this process was repeated five times for each outcome.

This study uses six ML techniques, including logistic regression, support vector machine (SVM), k-nearest neighbor (KNN), decision tree, random forest (RF), and eXtreme Gradient Boosting (XGBoost), to develop and validate models for the risk of death in critically ill patients with CKD. Logistic regression is a classification model. We have chosen the dichotomy logistic regression vs. ML because the logistic regression does not require the optimization of any hyperparameter and is thus easier to implement. SVM is a binary linear classifier. SVM separates different classes by establishing a decision boundary between two classes and optimizing the hyperplane distance between the boundary points, which can be obtained with reasonable accuracy from small data sets to achieve labeled prediction of one or more feature vectors. KNN is one of the most basic and simple ML algorithms. It can be used for both classification and regression. KNN performs classification by measuring the distances between different feature values. The decision tree is a single base classifier consisting of nodes and edges. Starting from the root node, also known as the first split point, the split determines the divisions of the entire dataset based on calculation. The process continues from top to bottom until no more partitioning is required, and the leaves present at the end of the decision tree represent the last partitions. RF is an ensemble learning method to overcome the drawbacks of a single base prediction model, aiming to achieve higher accuracy. This model includes multiple decision trees corresponding to various sub-datasets created from an identical dataset. XGBoost establishes K regression trees to make the predicted value of the tree group close to the real value as much as possible and can generalize as much as possible. The objective function of XGBoost requires the prediction error to be as small as possible, the number of leaf nodes to be as small as possible, and the number of nodes to be as low as possible.

We let each ML algorithm’s default hyper-parameters take effect to get started with a model. Afterward, we fine-tuned the parameters by searching the grid by hand. Tenfold cross-validation was used to find the optimal settings of the hyperparameters. The predictive performance of ML models was evaluated using accuracy, area under the curve (AUC), sensitivity, and specificity. For AUC, the 95% CI was computed with 2000 stratified bootstrap replicates. The testing AUC values corresponding to the different models were compared using paired Delong’s test. Accuracy and AUC were used to choose the best model. The probability of the best-performing model is evaluated using Brier scores and plotting the calibration curve for each model. To compare the predictive power between models, we calculated the Brier score for each model and plotted the calibration curve. Net reclassification improvement (NRI) was used to assess the correct reassignment between risk categories. In addition, the best model was interpreted using SHapley Additive exPlanations (SHAP) values and Local Interpretable Model-Agnostic Explanations (LIME) algorithm. Finally, a sensitivity analysis of the results was performed.

Class imbalance

The in-hospital mortality rate of CKD patients in this study was 16.5%. As the performance of the ML model may be affected by class imbalance, we performed a complementary analysis using an up-sampling approach.

Results

Participants

A total of 16,751 individuals were found to have CKD and be eligible to participate; however, 6314 were disqualified for non-first ICU admissions, and 3167 were disqualified due to having an ICU stay of fewer than 24 h. In the end, 8527 patients were eligible for the study (). Among ICU-admitted CKD patients, the in-hospital death rate was 16.5% (1406/8527). The median age of these patients was 75.1 (IQR: 65.0–83.5) years, and 61.7% (5259/8527) were male. Congestive heart failure (4543/8527, 53.3%), diabetes mellitus (4222/8527, 49.5%), and sepsis (3516/8527, 41.2%) were the top three comorbidities. provides a summary of the basic characteristics of the data set.

Figure 1. The flowchart of patient selection. MIMIC IV: Medical Information Mort for Intensive Care IV; ICU: intensive care unit; CKD: chronic kidney disease; LIME: Local Interpretable Model-Agnostic Explanations.

Figure 1. The flowchart of patient selection. MIMIC IV: Medical Information Mort for Intensive Care IV; ICU: intensive care unit; CKD: chronic kidney disease; LIME: Local Interpretable Model-Agnostic Explanations.

Table 1. Demographic and clinical characteristics at baseline.

Model development and validation

We developed six ML models using clinical variables as input factors, including logistic regression, SVM, KNN, decision tree, RF, and XGBoost. Compared to other ML models, the XGBoost model performed the best with an AUC of 0.860 (logistic regression: 0.841; SVM: 0.751; KNN: 0.641; decision tree: 0.601; RF: 0.834) (). The AUC in the XGBoost model was higher than in the other five models (p < .001) (Supplementary Table S2). Similarly, the XGBoost model outperformed various clinical disease severity scores (SOFA score (AUC): 0.762; SAPS II (AUC): 0.768) (). displays the performance of our further analysis of the performance of these six ML models in terms of their precision, sensitivity, specificity, brier score, and NRI. Calibration plots for the six ML models are shown in Supplementary Figure S1.

Figure 2. ROC curves for the ML models and the traditional severity of illness scores to predict in-hospital mortality. (A) ROC curves for the six ML models used to predict in-hospital mortality; (B) ROC curves for the traditional severity of disease scores used to predict in-hospital mortality. ROC: receiver operating characteristic; SVM: support vector machine; KNN; k-nearest neighbors; AUC: area under the curve; SOFA: sequential organ failure assessment; SAPS II: simplified acute physiology score II.

Figure 2. ROC curves for the ML models and the traditional severity of illness scores to predict in-hospital mortality. (A) ROC curves for the six ML models used to predict in-hospital mortality; (B) ROC curves for the traditional severity of disease scores used to predict in-hospital mortality. ROC: receiver operating characteristic; SVM: support vector machine; KNN; k-nearest neighbors; AUC: area under the curve; SOFA: sequential organ failure assessment; SAPS II: simplified acute physiology score II.

Table 2. Performance comparison of the machine learning models in the testing set.

Model explainability

By using SHAP values, we aimed to elucidate the mortality prediction process of the XGBoost model. depicts the feature importance ranking of the XGBoost model with SHAP summary plots, where SOFA score, SAPS II, respiratory rate, and urine output are the four factors that contribute most to the model. In addition, we used SHAP dependence analysis to illustrate the effect of a single input variable on the final results of the XGBoost prediction model (). shows the results of a more in-depth analysis of the four most influential clinical characteristics of the XGBoost prediction model output. In addition, we used the LIME algorithm to explain the individualized prediction of death by taking two samples (one survival and one deceased) from the validation set (Supplementary Figure S2).

Figure 3. The top 20 important features derived from the XGBoost model. SHAP indicates the importance of ranking features. Each line represents a feature, and the abscissa is the SHAP value. The matrix plot represents the significance of each covariate in constructing the final predictive model. The higher the SHAP value for each clinical variable, the higher risk of death. SHAP: SHapley Additive exPlanations; SOFA: sequential organ failure assessment; SAPS II: simplified acute physiology score II; BUN: blood urea nitrogen; SpO2: oxygen saturation; MAP: mean arterial pressure; PTT: partial thromboplastin time.

Figure 3. The top 20 important features derived from the XGBoost model. SHAP indicates the importance of ranking features. Each line represents a feature, and the abscissa is the SHAP value. The matrix plot represents the significance of each covariate in constructing the final predictive model. The higher the SHAP value for each clinical variable, the higher risk of death. SHAP: SHapley Additive exPlanations; SOFA: sequential organ failure assessment; SAPS II: simplified acute physiology score II; BUN: blood urea nitrogen; SpO2: oxygen saturation; MAP: mean arterial pressure; PTT: partial thromboplastin time.

Figure 4. SHAP summary plot of the top 20 features of the XGBoost model. The importance matrix plot of clinical variables is derived using the XGBoost model. The matrix plot ranks the importance of the variables, revealing the contribution of each variable to death vs. survive. The greater the SHAP value of a characteristic, the greater the likelihood of death development. The abscissa represents the SHAP value, and each line represents a feature. Red dots indicate greater feature values, whereas blue dots indicate lower feature values. SHAP: SHapley Additive exPlanations; SOFA: sequential organ failure assessment; SAPS II: simplified acute physiology score II; BUN: blood urea nitrogen; SpO2: oxygen saturation; MAP: mean arterial pressure; PTT: partial thromboplastin time.

Figure 4. SHAP summary plot of the top 20 features of the XGBoost model. The importance matrix plot of clinical variables is derived using the XGBoost model. The matrix plot ranks the importance of the variables, revealing the contribution of each variable to death vs. survive. The greater the SHAP value of a characteristic, the greater the likelihood of death development. The abscissa represents the SHAP value, and each line represents a feature. Red dots indicate greater feature values, whereas blue dots indicate lower feature values. SHAP: SHapley Additive exPlanations; SOFA: sequential organ failure assessment; SAPS II: simplified acute physiology score II; BUN: blood urea nitrogen; SpO2: oxygen saturation; MAP: mean arterial pressure; PTT: partial thromboplastin time.

Figure 5. SHAP dependence plot of the XGBoost model. (A) SOFA score; (B) SAPS II; (C) respiratory rate; (D) urine output. SHAP values for specific features exceed zero, representing an increased risk of death. The greater the SHAP value of a characteristic, the greater the likelihood of death development. SHAP: SHapley Additive exPlanations; SOFA: sequential organ failure assessment; SAPS II: simplified acute physiology score II.

Figure 5. SHAP dependence plot of the XGBoost model. (A) SOFA score; (B) SAPS II; (C) respiratory rate; (D) urine output. SHAP values for specific features exceed zero, representing an increased risk of death. The greater the SHAP value of a characteristic, the greater the likelihood of death development. SHAP: SHapley Additive exPlanations; SOFA: sequential organ failure assessment; SAPS II: simplified acute physiology score II.

Sensitivity analyses

For patients with non-first ICU admission (N = 6314), the XGBoost model remained robust in predicting mortality in these patients (AUC: 0.821). Detailed results are shown in Supplementary Figure S3.

Class imbalance

The performance results of the up-sampling approach show very similar results (Supplementary Table S3).

Discussion

In this investigation, we constructed and tested six ML models for predicting in-hospital mortality in critically ill patients with CKD. The XGBoost model outperformed other models (including logistic regression, SVM, KNN, decision tree, and RF models) and traditional risk scores (including SOFA score and SAPS II) in predicting the death of critically ill patients with CKD. According to the feature importance evaluation, the four most important features of the XGBoost model that had the greatest predictive potential for mortality were the SOFA score, SAPS II, respiratory rate, and urine output. Moreover, we explain how these characteristics impact the XGBoost model. These results may contribute significantly to understanding ML models for predicting death in critically ill patients with CKD.

In recent years, CKD has profoundly impacted the prognosis and treatment options for several morbidities [Citation17–19]. Furthermore, as the prevalence of CKD continues to rise in the general population and among ICU patients, preexisting CKD may drastically alter the treatment methods for these patients when admitted to the ICU [Citation20–22]. Therefore, to identify those at high risk of clinical deterioration and facilitate early preventive measures that may reduce mortality, it is necessary to develop and promote prediction models that can early and swiftly predict death in critically ill patients with CKD.

In this analysis, the XGBoost model outperformed the other ML models in predicting mortality in CKD patients in critical care. The results of this study agree with those of numerous others. Liu et al. revealed that the XGBoost model outperformed other ML models, including logistic regression, RF, and SVM, in predicting death in acute kidney injury patients [Citation23]. Hu et al. discovered that XGBoost performed better than SVM, KNN, logistic regression, decision tree, Naive Bayes, and RF [Citation11]. A meta-analysis indicated that XGBoost outperformed other ML methods (such as SVM and Bayesian networks) for predicting acute kidney injury [Citation24]. In addition, traditional severity scoring systems, such as the SOFA and SAPS II scores, performed poorly compared to ML models, indicating that they may not be reliable tools for predicting death in critically ill patients with CKD. In addition, our study showed that SOFA or SAPS II scores alone performed poorly in predicting mortality in critically ill patients with CKD compared with the ML model. Although the SOFA and SAPS II scoring systems may estimate the likelihood of bad outcomes in critically ill patients, excluding a significant number of relevant factors from their analyses may result in less accurate prediction than multivariable models [Citation25]. Previous research has demonstrated that when compared to ML models, the SOFA score and SAPS II perform poorly in predictive performance [Citation8].

In this investigation, we used the ML method for the first time to predict in-hospital mortality in critically ill patients with CKD. By ranking the importance of variables in the XGBoost model, we found that SOFA score, SAPS II, respiratory rate, and urine output were the variables that contributed most to predicting mortality in critically ill patients with CKD. The SOFA score is a tool that describes the presence of organ dysfunction [Citation26]. It assigns each of the six organ systems (respiratory, circulatory, renal, hematologic, hepatic, and central nervous system) a daily score between 1 and 4 based on the severity of organ failure, with higher values indicating more severe organ dysfunction [Citation27]. Some studies have shown that high SOFA scores are associated with higher mortality [Citation28]. Similarly, the present research found that the SOFA score was the most significant predictor of death in critically ill patients with CKD, and it was given the highest weight in the XGBoost model. SAPS II is another essential factor that influences mortality. The SAPS II score comprises seventeen factors, with higher scores indicating illness severity [Citation29]. Previous research has shown that SAPS II is related to a greater death rate among ICU patients [Citation30]. In addition, we discovered that respiratory rate is a significant predictor of death in critically ill patients with CKD. Several studies have found an association between respiratory rate and worse outcomes [Citation31]. Our research also showed a correlation between urine output and death among CKD patients in critical care. Oliguria is common in ICU patients and is the ultimate cause of renal parenchymal damage [Citation32]. Some studies have shown that decreased urine output is associated with poor outcomes in critically ill patients [Citation33].

However, this study also has some shortcomings. First, this was retrospective modeling research conducted at a single center using the MIMIC IV database, and we could not identify the causal association between characteristics and outcomes. In order to verify the accuracy of our approach, we need further prospective randomized clinical trials. Second, our research’s retrospective and observational design may inevitably result in selection bias. Third, we estimated specific missing data using padding, which may have led to discrepancies from the actual numbers. Finally, in this work, the model was only tested internally; external validation at multiple centers is needed to confirm the usefulness of the model.

Conclusions

In conclusion, we successfully developed and validated ML models for predicting mortality in critically ill patients with CKD. Among all ML models, the XGBoost model is the most effective ML model that can help clinicians accurately manage and implement early interventions, which may reduce mortality in critically ill CKD patients with a high risk of death.

Ethical approval

MIMIC IV was set up with the approval of the Institutional Review Board at the Massachusetts Institute of Technology. All participant data were anonymized to safeguard their privacy. Due to the use of anonymized health records, ethical approval was not required. This study adheres to the ethical criteria outlined in the Helsinki Declaration of 1964.

Consent form

Due to the use of anonymized health records, informed consent was not required.

Author contributions

Conceptualization: Xunliang Li, Yuyu Zhu, Haifeng Pan, and Deguang Wang; methodology: Xunliang Li and Yuyu Zhu; formal analysis and investigation: Xunliang Li, Wenman Zhao, Rui Shi, Yuyu Zhu, and Zhijuan Wang; funding acquisition: Haifeng Pan and Deguang Wang; supervision: Haifeng Pan and Deguang Wang. All authors read and approved the final manuscript.

Supplemental material

Supplemental Material

Download PDF (497.8 KB)

Acknowledgements

The authors thank all the participants for their contributions.

Disclosure statement

No potential conflict of interest was reported by the author(s).

Data availability statement

The datasets presented in the current study are available in the MIMIC-IV database (https://physionet.org/content/mimiciv/1.0/).

Additional information

Funding

This work was supported by grants from the Natural Science Foundation of Anhui Province [2008085MH244], Incubation Program of National Natural Science Foundation of China of The Second Hospital of Anhui Medical University [2020GMFY04], Clinical Research Incubation Program of The Second Hospital of Anhui Medical University [2020LCZD01], Co-construction Project of Clinical and Preliminary Disciplines of Anhui Medical University in 2020 [2020lcxk02], and Co-construction Project of Clinical and Preliminary Disciplines of Anhui Medical University in 2021 [2021lcxk032]. No funding bodies had any role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.

References

  • GBD Chronic Kidney Disease Collaboration. Global, regional, and national burden of chronic kidney disease, 1990-2017: a systematic analysis for the global burden of disease study 2017. Lancet. 2020;395(10225):1–11.
  • Copur S, Ucku D, Cozzolino M, et al. Hypoxia-inducible factor signaling in vascular calcification in chronic kidney disease patients. J Nephrol. 2022;35(9):2205–2213.
  • Strijack B, Mojica J, Sood M, et al. Outcomes of chronic dialysis patients admitted to the intensive care unit. J Am Soc Nephrol. 2009;20(11):2441–2447.
  • Hutchison CA, Crowe AV, Stevens PE, et al. Case mix, outcome and activity for patients admitted to intensive care units requiring chronic renal dialysis: a secondary analysis of the ICNARC Case Mix Programme Database. Crit Care. 2007;11(2):R50.
  • Denic A, Glassock RJ, Rule AD. Structural and functional changes with the aging kidney. Adv Chronic Kidney Dis. 2016;23(1):19–28.
  • Rimes-Stigare C, Frumento P, Bottai M, et al. Long-term mortality and risk factors for development of end-stage renal disease in critically ill patients with and without chronic kidney disease. Crit Care. 2015;19:383.
  • Alam N, Hobbelink EL, van Tienhoven AJ, et al. The impact of the use of the Early Warning Score (EWS) on patient outcomes: a systematic review. Resuscitation. 2014;85(5):587–594.
  • Hou N, Li M, He L, et al. Predicting 30-days mortality for MIMIC-III patients with sepsis-3: a machine learning approach using XGboost. J Transl Med. 2020;18(1):462.
  • Du M, Haag DG, Lynch JW, et al. Comparison of the tree-based machine learning algorithms to Cox regression in predicting the survival of oral and pharyngeal cancers: analyses based on SEER Database. Cancers. 2020;12(10):2802.
  • Weis C, Cuénod A, Rieck B, et al. Direct antimicrobial resistance prediction from clinical MALDI-TOF mass spectra using machine learning. Nat Med. 2022;28(1):164–174.
  • Hu C, Li L, Huang W, et al. Interpretable machine learning for early prediction of prognosis in sepsis: a discovery and validation study. Infect Dis Ther. 2022;11(3):1117–1132.
  • Zhang Z, Chen L, Xu P, et al. Effectiveness of automated alerting system compared to usual care for the management of sepsis. NPJ Digit Med. 2022;5(1):101.
  • Zhou S, Zeng Z, Wei H, et al. Early combination of albumin with crystalloids administration might be beneficial for the survival of septic patients: a retrospective analysis from MIMIC-IV database. Ann Intensive Care. 2021;11(1):42.
  • Johnson AE, Pollard TJ, Shen L, et al. MIMIC-III, a freely accessible critical care database. Sci Data. 2016;3:160035.
  • Quan H, Sundararajan V, Halfon P, et al. Coding algorithms for defining comorbidities in ICD-9-CM and ICD-10 administrative data. Med Care. 2005;43(11):1130–1139.
  • Marshall A, Altman DG, Holder RL, et al. Combining estimates of interest in prognostic modelling studies after multiple imputation: current practice and guidelines. BMC Med Res Methodol. 2009;9:57.
  • Bansal N, Fan D, Hsu CY, et al. Incident atrial fibrillation and risk of end-stage renal disease in adults with chronic kidney disease. Circulation. 2013;127(5):569–574.
  • Eddy AA, Neilson EG. Chronic kidney disease progression. J Am Soc Nephrol. 2006;17(11):2964–2966.
  • Meguid El Nahas A, Bello AK. Chronic kidney disease: the global challenge. Lancet. 2005;365(9456):331–340.
  • Go AS, Chertow GM, Fan D, et al. Chronic kidney disease and the risks of death, cardiovascular events, and hospitalization. N Engl J Med. 2004;351(13):1296–1305.
  • Manjunath G, Tighiouart H, Ibrahim H, et al. Level of kidney function as a risk factor for atherosclerotic cardiovascular outcomes in the community. J Am Coll Cardiol. 2003;41(1):47–55.
  • Hotchkiss JR, Palevsky PM. Care of the critically ill patient with advanced chronic kidney disease or end-stage renal disease. Curr Opin Crit Care. 2012;18(6):599–606.
  • Liu J, Wu J, Liu S, et al. Predicting mortality of patients with acute kidney injury in the ICU using XGBoost model. PLOS One. 2021;16(2):e0246306.
  • Song X, Liu X, Liu F, et al. Comparison of machine learning and logistic regression models in predicting acute kidney injury: a systematic review and meta-analysis. Int J Med Inform. 2021;151:104484.
  • Wu J, Huang L, He H, et al. Red cell distribution width to platelet ratio is associated with increasing in-hospital mortality in critically ill patients with acute kidney injury. Dis Markers. 2022;2022:4802702.
  • He Y, Xu J, Shang X, et al. Clinical characteristics and risk factors associated with ICU-acquired infections in sepsis: a retrospective cohort study. Front Cell Infect Microbiol. 2022;12:962470.
  • Minne L, Abu-Hanna A, de Jonge E. Evaluation of SOFA-based models for predicting mortality in the ICU: a systematic review. Crit Care. 2008;12(6):R161.
  • Zhu Y, Zhang R, Ye X, et al. SAPS III is superior to SOFA for predicting 28-day mortality in sepsis patients based on sepsis 3.0 criteria. Int J Infect Dis. 2022;114:135–141.
  • Le Gall JR, Lemeshow S, Saulnier F. A new simplified acute physiology score (SAPS II) based on a European/North American Multicenter Study. JAMA. 1993;270(24):2957–2963.
  • Mirzakhani F, Sadoughi F, Hatami M, et al. Which model is superior in predicting ICU survival: artificial intelligence versus conventional approaches. BMC Med Inform Decis Mak. 2022;22(1):167.
  • Barthel P, Wensel R, Bauer A, et al. Respiratory rate predicts outcome after acute myocardial infarction: a prospective cohort study. Eur Heart J. 2013;34(22):1644–1650.
  • Macedo E, Malhotra R, Bouchard J, et al. Oliguria is an early predictor of higher mortality in critically ill patients. Kidney Int. 2011;80(7):760–767.
  • Kellum JA, Sileanu FE, Murugan R, et al. Classifying AKI by urine output versus serum creatinine level. J Am Soc Nephrol. 2015;26(9):2231–2238.