Research Article

Machine Learning Approaches-Driven for Mortality Prediction for Patients Undergoing Craniotomy in ICU

Pages 1658-1664 | Received 24 Jan 2021, Accepted 16 Nov 2021, Published online: 26 Jan 2022

ABSTRACT

Objectives

In this retrospective study, we aimed to predict the mortality of patients who underwent craniotomy in the ICU using predictive models, and to extract the high-risk factors leading to death.

Methods

Five machine-learning (ML) algorithms were used to train mortality predictive models on data from the surgical intensive care unit (ICU) database of the Fujian Provincial Hospital in China. Accuracy, precision, recall, F1 score and the area under the receiver operating characteristic curve (AUC) were used to evaluate the performance of the models, and calibration was evaluated by the Brier score.

Results

We found that eXtreme Gradient Boosting (XGBoost) was the most suitable model for the task, with an AUC of 0.84. We analyzed feature importance with Local Interpretable Model-agnostic Explanations (LIME) and further identified the high-risk factors of mortality in the ICU.

Conclusions

This study established a mortality predictive model for patients who had undergone craniotomy in the ICU. Identifying the factors with great influence on mortality has the potential to provide auxiliary decision support for clinical staff.

Introduction

Craniotomy is a high-risk procedure in which a neurosurgeon removes a section of the skull to gain access to the brain. It is frequently associated with complications, a high recurrence rate and a high mortality rate, usually due to the poor condition of the patients and other factors such as inappropriate timing of treatment. In most cases, patients are admitted to the ICU after craniotomy for monitoring in order to reduce mortality as much as possible (Citation1). It is therefore important to study the prognosis of post-craniotomy patients in the ICU, as this may inform future ICU practice.

Patients need craniotomy for a variety of reasons. Decompressive craniectomy (DC) is a key treatment for patients with traumatic brain injury (TBI), especially severe TBI (Citation2). The outcome following DC is poor: Aarabi et al. reported an overall mortality rate of 28% after DC (Citation3). Additionally, mortality is significantly higher in certain populations of patients who received the operation due to severe DC (Citation4). Craniotomy is also needed for patients with acute subdural hemorrhage (ASDH), one of the most lethal head injuries, with a high mortality rate (Citation5). Hematoma evacuation, a type of craniotomy, is usually performed in patients with ASDH. A study by Wilberger et al. showed that the timing of operative intervention for clot removal was the critical factor for mortality (Citation6). For patients with brain tumors, craniotomy is often unavoidable as well. Gijtenbeek et al. studied the postoperative course of elderly patients who underwent meningioma resection and found that the surgical mortality was 14% and 17% after 6 months (Citation7). Studies including Ntali et al. (Citation8) and Daly et al. (Citation9) found that surgery for pituitary adenomas, such as nonfunctioning pituitary adenomas (NFAs), carries high morbidity and mortality, and Chang et al. found that the postoperative prognosis, including recurrence rate and mortality, was not clear (Citation10). Aneurysm incarceration and aneurysm resection are the common surgical procedures for patients with aneurysms. Solomon et al. found that postoperative death was a significant outcome of aneurysmal craniotomy (Citation11). In summary, many observational and analytical studies (Citation12), as well as studies using traditional statistical methods (Citation13), have focused on prognosis, especially mortality, in patients undergoing craniotomy.

In recent years, the rapid development of medical big data and machine learning (ML) technology has shown that, in medical fields, ML models can discover interactive, nonlinear and high-order effects among the predictive variables and can perform better than traditional methods (Citation14). Here, we built Logistic Regression (LR), Random Forest (RF), Support Vector Machine (SVM), Artificial Neural Network (ANN) and XGBoost models and identified XGBoost as the best algorithm, with high classification accuracy and good interpretability, providing a feasible prediction of postoperative hospital mortality for post-craniotomy patients in the ICU.

Materials and methods

Data source

We retrospectively collected health data from the surgical ICU database of Fujian Provincial Hospital for the period from 23 March 2005 to 20 September 2018.

Patient selection

The inclusion and exclusion criteria were as follows. In accordance with institutional guidelines and after consultation with experts in the field, adult patients (older than 18 years of age) who needed craniotomy for DC, hematoma evacuation, resection of meningioma, resection of pituitary adenoma or aneurysm intervention, and who entered postoperative ICU care, were included in the study. Of these patients, 31.94% received DC, 25.60% had aneurysm intervention, 25.12% had hematoma evacuation, 16.87% had resection of meningioma and 5.74% had resection of pituitary adenoma (Table 1). Among patients under ICU monitoring and treatment, we selected those with at least one complete set of ICU examination records at the time of ICU admission. Dynamic variables such as vital signs and laboratory tests were represented by the first, maximum and minimum values recorded in the ICU. The main outcome of our study was mortality, defined by registration in the death record. Patients whose length of stay was less than 24 hours were excluded because the ICU surgical specialists considered that craniotomy had little effect on the mortality of those patients. Patients discharged automatically 24 hours after admission were also counted as deaths, because their physical condition was usually so poor that further treatment had no therapeutic value, and family members were usually unwilling to let patients die in the hospital due to local customs. A total of 835 patients were included after applying the inclusion and exclusion criteria.

Table 1. Summary statistics of the causes of craniotomy

Variable selection

After patient selection, we selected the variables for this study. From an ML feature-engineering standpoint, using every available variable as input to the ML-based predictive model (PM) was unattractive, but variables related to clinical practice and medical facts needed to be taken into account. Variables with a missing rate greater than 30% were excluded. Following the recommendations of experts in the field, a total of 67 variables were eventually selected as the input of the PM: the Glasgow Coma Scale (GCS) score; four demographic variables (age, gender, smoking history and drinking history); six vital sign variables (diastolic pressure, systolic pressure, heart rate, etc.); five therapeutic index variables (operation time, anesthesia type, tracheotomy, etc.); seven medical history variables (diabetes, hypertension, chronic renal disease (CRD), etc.); thirty-one laboratory examination variables (platelets (PLT), white blood cells (WBC), red blood cells (RBC), etc.); five brain injury inducement variables (vascular diseases, brain tumor diseases, hydrocephalus, etc.); four hematoma property variables (subarachnoid, subdural, intracerebral, etc.); and four infection source variables (pulmonary, urinary, intracranial, etc.) (Supplementary Table S1).

Data pre-processing

We imputed missing values with the mean (for continuous variables) or the mode (for categorical variables) of each group. The continuous variables were normalized to the range 0–1. After normalization, the dataset was split into training and testing sets at a ratio of 9:1 (752:84). Ten-fold cross-validation was applied to the training set to identify the optimal parameters for each ML model.
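The pre-processing steps described above can be sketched as follows. This is a minimal pure-Python illustration and not the authors' code: the function names, the toy values and the random seed are assumptions, and imputation here uses whole-column statistics because the text does not define the grouping behind "of each group".

```python
import random
from collections import Counter

def impute_mean(values):
    """Replace None in a continuous column with the mean of the observed values."""
    observed = [v for v in values if v is not None]
    mean = sum(observed) / len(observed)
    return [mean if v is None else v for v in values]

def impute_mode(values):
    """Replace None in a categorical column with the most frequent observed value."""
    observed = [v for v in values if v is not None]
    mode = Counter(observed).most_common(1)[0][0]
    return [mode if v is None else v for v in values]

def min_max_scale(values):
    """Rescale a continuous column to the 0-1 range."""
    lo, hi = min(values), max(values)
    if hi == lo:  # constant column: avoid division by zero
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]

def train_test_split_indices(n, test_fraction=0.1, seed=0):
    """Shuffle row indices and split them into (train, test) index lists."""
    indices = list(range(n))
    random.Random(seed).shuffle(indices)
    n_test = round(n * test_fraction)
    return indices[n_test:], indices[:n_test]

# Toy columns with hypothetical values (None marks a missing entry)
heart_rate = [60, None, 100, 80]
ventilation = ["yes", "no", None, "no"]

heart_rate_scaled = min_max_scale(impute_mean(heart_rate))
ventilation_filled = impute_mode(ventilation)
train_idx, test_idx = train_test_split_indices(835)
```

The same index lists could then be partitioned into ten folds over the training rows for cross-validated parameter tuning.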

Machine learning models

In this study, we used the scikit-learn package (Citation15) in Python to fit the ML models: the LR algorithm, a traditional linear method widely used in medical PM (Citation16–18); the RF algorithm, which generates multiple decision trees, has better interpretability and can establish correlations between features; the SVM algorithm, a binary supervised algorithm that can be used in high-dimensional feature spaces; the ANN algorithm, which has been successfully applied to clinical outcome prediction of trauma mortality (Citation19); and the XGBoost algorithm, an end-to-end tree boosting system that has been widely used by data scientists in recent years to achieve state-of-the-art results on many ML challenges, owing to its advantages in handling overfitting and missing values (Citation20).

Local interpretable model-agnostic explanations (LIME)

Explaining black-box models such as ANN or SVM has been widely discussed in recent years (Citation21). In healthcare, it is very important for physicians and staff to understand a model's decision-making process so that they can benefit from early mortality predictions made by an interpretable model rather than a black box (Citation22). The LIME algorithm can be used to interpret a model locally (but not globally) by selecting a typical observation to explain (Citation23). LIME can thus be used to interpret the model and discover high-risk factors in a model-agnostic way.

Statistical analysis

All statistical analyses were performed using the SciPy package (1.2.1) in Python (version 3.7.3) and R (version x64 4.1.0). Categorical variables were presented as percentages (%), and Chi-square tests were used to determine the significance of the association between variables. Continuous variables were expressed as mean ± standard deviation (mean ± std). For continuous variables that were normally distributed in both the death and survival groups, Student's t-test was used; otherwise, the Wilcoxon test was used to assess the statistical difference between the two groups. For all variables significantly associated with the outcome, we quantified the relationship as an odds ratio (OR) with a 95% confidence interval (CI), calculated by univariate logistic regression. P values < .05 were considered statistically significant.
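To make the OR and 95% CI concrete, here is a small sketch for the special case of a single binary variable, where the univariate logistic regression estimate coincides with the cross-product ratio of the 2x2 table and the Wald interval is built on the log scale. The function name and the counts are hypothetical and are not taken from the study; continuous variables would need an actual regression fit.

```python
import math

def odds_ratio_ci(exposed_dead, exposed_alive, unexposed_dead, unexposed_alive, z=1.96):
    """Odds ratio and Wald 95% CI from a 2x2 table (all four cells must be > 0)."""
    a, b, c, d = exposed_dead, exposed_alive, unexposed_dead, unexposed_alive
    odds_ratio = (a * d) / (b * c)
    # Standard error of log(OR) for a 2x2 table
    se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z * se_log_or)
    upper = math.exp(math.log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper

# Hypothetical counts for illustration only
or_, lo, hi = odds_ratio_ci(20, 80, 10, 90)

is_risk_factor = lo > 1   # whole interval above 1
is_protective = hi < 1    # whole interval below 1
```

In this toy table the OR is 2.25, but the interval crosses 1, so the variable would be classed as neither a risk factor nor a protective factor under an "OR and 95% CI on the same side of 1" rule.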

The accuracy, precision, recall, F1 score and AUC (Citation24) were calculated to assess the performance of the ML models. Calibration was assessed with the Brier score, the mean squared difference between the estimated risks and the actual outcomes, which ranges from 0 to 1; the lower the score, the more accurate the predicted probabilities (Citation25,Citation26).
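The evaluation measures named above can be written out directly from their definitions. The sketch below is a plain-Python illustration with made-up labels and probabilities; the 0.5 decision threshold and all names are assumptions, and the AUC is computed through its rank interpretation (the probability that a randomly chosen death receives a higher predicted risk than a randomly chosen survivor).

```python
def classification_metrics(y_true, y_pred):
    """Accuracy, precision, recall and F1 for binary labels (1 = death)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return {
        "accuracy": (tp + tn) / len(pairs),
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }

def brier_score(y_true, y_prob):
    """Mean squared difference between predicted risk and observed outcome."""
    return sum((p - t) ** 2 for t, p in zip(y_true, y_prob)) / len(y_true)

def auc(y_true, y_prob):
    """Share of (positive, negative) pairs ranked correctly; ties count as 0.5."""
    pos = [p for t, p in zip(y_true, y_prob) if t == 1]
    neg = [p for t, p in zip(y_true, y_prob) if t == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# Made-up example: five patients, 1 = died, with predicted mortality risks
y_true = [1, 0, 0, 1, 0]
y_prob = [0.8, 0.3, 0.6, 0.4, 0.1]
y_pred = [1 if p >= 0.5 else 0 for p in y_prob]  # assumed threshold of 0.5

metrics = classification_metrics(y_true, y_pred)
brier = brier_score(y_true, y_prob)
area = auc(y_true, y_prob)
```

Accuracy, precision, recall and F1 depend on the chosen threshold, whereas the Brier score and AUC are computed from the predicted probabilities themselves.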

Results

Descriptive statistics

Data from a total of 835 eligible patients were retrieved in our study, and the mortality was 8.26% (n = 69). The average age of the cohort was 55.22 years and the sex ratio was 492:343 (male:female). There was no statistically significant difference in demographic distribution, previous medical history, infection sources or hematoma properties between the death and survival groups. Among the categorical variables, mechanical ventilation (p = .012) and brain tumor diseases (p = .032) differed significantly between the groups (p < .05). Among the continuous variables, the GCS score (p < .001) and many vital sign and laboratory examination variables were statistically significant, such as first diastolic pressure (p = .019), first systolic pressure (p = .021), maximum systolic pressure (p = .007), first heart rate (p = .036), maximum and minimum heart rate (p < .001), first pulse (p = .002), maximum pulse (p = .007), maximum hemoglobin (Hb) (p = .004), minimum Hb (p = .003), maximum and minimum PLT (p < .001), and first and maximum prothrombin time (PT) (p < .001). Details of the statistical results are shown in Supplemental Table S1.

In addition, the results of the univariate LR analysis of all variables are shown in Supplemental Table S2. There were 44 variables that were high-risk factors for the outcome (OR>1, 95% CI>1), including hernia cerebri (OR = 6.17), diabetes (OR = 2.16) and vascular diseases (OR = 2.12), and 28 variables that were protective factors (OR<1, 95% CI<1), including minimum calcium (OR = 0.073), brain tumor (OR = 0.433) and mechanical ventilation (OR = 0.47).

Performance of machine-learning models on test-set

The parameters of the models were tuned by 10-fold cross-validation on the training set, and the models were then evaluated on the test set. The ROC curves of the five models are shown in Figure 1. The AUC values for LR, RF, SVM, ANN and XGBoost were 0.78, 0.80, 0.78, 0.78 and 0.84, respectively.

Figure 1. ROC curves for LR, RF, SVM, ANN, and XGBoost models in predicting the mortality of patients with craniotomy.


In addition to AUC, the accuracy, precision, recall, F1 score and Brier score of the five models are shown in Table 2. The accuracy of all models was over 90%. In terms of precision, recall and F1 score, XGBoost was superior overall. In terms of Brier score, LR, RF, ANN and XGBoost were at almost the same level (0.06), better than SVM (0.19).

Table 2. Mortality prediction performance of the ML models on the test set

Feature-importance

We considered XGBoost a suitable model for predicting the mortality of patients with craniotomy in the ICU. We ranked the variables by feature importance, calculated as the F score reported internally by XGBoost, to reflect each variable's contribution to the outcome. Figure 2 displays the top 10 variables that the model considered to have the greatest impact on mortality. Minimum heart rate, maximum temperature, maximum magnesium (Mg) and minimum white blood cell (WBC) count were the key variables, with the highest F scores. These were followed, in descending order of importance, by minimum albumin (ALB), minimum uric acid (UA), minimum diastolic pressure, maximum creatine kinase isoenzyme (CK-MB), minimum CK-MB and age (Figure 2).

Figure 2. Feature importance of the XGBoost model, sorted by F score, showing the features with the greatest impact on the outcome.


LIME explanation

We explained the black-box ML model in a model-agnostic way using LIME, selecting the top 10 variables with the greatest impact on the outcome and identifying their critical values. In Figure 3, the left part shows, from top to bottom, the top 10 variables with the greatest impact on survival or death, and the right part shows the critical values of these 10 variables at which they had the greatest impact. The result was similar to that of the feature importance analysis: LIME found that minimum heart rate (86 beats/min), maximum Mg (1.34 mmol/L), maximum temperature (40.3°C), maximum UA (349.00 μmol/L) and maximum CK-MB (22.00 U/L) had a greater impact on mortality. LIME additionally indicated that patients with mechanical ventilation had a higher mortality rate, consistent with studies reporting that the mortality of mechanically ventilated ICU patients is very high (Citation27).

Figure 3. LIME results of XGBoost model.


Discussion

In this study, we showed that XGBoost outperformed other models such as ANN and LR for patients who had undergone craniotomy, and that the model could be explained to medical workers and used to identify high-risk factors. In the field of craniotomy mortality prediction, LR has been adopted by most researchers. One group used LR to build a prognosis prediction model for patients undergoing craniotomy for glioma; the AUC of the model was 0.71 when the outcome was postoperative death (Citation28). In addition, multivariate logistic regression was used in patients undergoing elective craniotomy to evaluate the relationship between complications caused by the craniotomy and 30-day mortality (Citation29). Another study used LR to predict poor prognosis after craniotomy for malignant tumors in the elderly and concluded that old age did not increase the likelihood of poor short-term outcomes. These studies show that LR is a feasible model for predicting the mortality of patients undergoing craniotomy. However, LR struggles with complex datasets that have a large set of predictors. Beyond linear models, some studies have used tree models, SVM, neural networks and other methods to model patient mortality. ANN showed excellent prediction performance compared with other models such as LR (Citation30–33). Nevertheless, the lack of interpretability (Citation34) is a main limitation of black-box neural network models and makes them difficult for medical staff to understand.

Both the feature importance calculated internally by XGBoost and the local model-agnostic explanation indicated that minimum heart rate, maximum temperature, maximum Mg, maximum UA and maximum CK-MB measured in the ICU were the highest risk factors for the mortality of patients undergoing craniotomy in the ICU.

In craniotomy research, there is no medical consensus on the variables that affect prognosis. The variables most often reported as high-risk factors for craniotomy mortality are age (Citation35,Citation36) and meningitis (Citation37,Citation38). For some of the high-risk factors found in this study, earlier research has already shown medical relevance, while others need further exploration. One study found that heart rate, especially minimum heart rate, could be considered a significant indicator of brain death, particularly in patients with severe head injury (Citation39). Previous studies found that UA was not only a risk factor for cardiac and renal diseases but also had some indicative value for TBI in experiments with mice or in patients (Citation40,Citation41). There has been little research on the effect of UA, especially the highest UA level measured in the ICU, on the prognosis of patients with craniotomy. Based on the results of this study, we believe that the effect of UA on the mortality of patients undergoing craniotomy deserves attention. CK-MB is a sensitive index of myocardial injury, and there are findings that CK-MB is related to poor prognosis in acute brain injury (Citation42). In this study, we found that CK-MB might have some impact on the mortality of patients undergoing craniotomy, but few studies have taken this perspective. As for magnesium, one study found that low serum magnesium might be related to the prognosis of severe head injury (Citation43), but the role of maximum serum magnesium has not been verified. Follow-up research in this area should therefore pay more attention to this variable. Fever is common in ICU patients and can have many causes (Citation44). In our dataset, most patients had fever in the ICU, so whether maximum body temperature has a great impact on the death of patients undergoing craniotomy remains to be studied.

There are some limitations to this study. First, despite the large scale of the surgical ICU database of Fujian Provincial Hospital (13,441 patients in total), only 6.21% (835) were post-craniotomy patients who met the requirements, whereas 143 variables (counting the first, maximum and minimum values recorded in the ICU) were used as input after screening. This combination of few patients and many variables might affect model performance. Second, the positive and negative samples were severely imbalanced (1:11), which could bias the model's judgment. Third, because the data came from a single source, the model might not be applicable to all post-craniotomy patients. Finally, some data were missing, and the imputation of missing values might introduce imprecision.
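The class imbalance noted above also bears on how the reported accuracy should be read. The arithmetic below uses only the cohort counts given in the Results (835 patients, 69 deaths). The "always predict survival" baseline is a hypothetical comparator introduced here for illustration, and the calculation uses the whole cohort because the composition of the test set is not reported.

```python
# Cohort counts reported in the Results section
n_patients = 835
n_deaths = 69
n_survivors = n_patients - n_deaths           # 766

mortality = n_deaths / n_patients             # about 0.0826, i.e. 8.26%
neg_per_pos = n_survivors / n_deaths          # about 11.1 survivors per death

# Hypothetical baseline: predict "survival" for every patient
baseline_accuracy = n_survivors / n_patients  # about 0.917
baseline_recall = 0 / n_deaths                # no death is ever detected
```

Under these assumptions, a model with no ability to detect deaths would still exceed 90% accuracy, which is why precision, recall, F1 score and AUC are more informative than accuracy on this dataset.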

Conclusions

In this retrospective study, based on the electronic health records (EHR) of patients who had undergone craniotomy in the ICU of Fujian Provincial Hospital, we took patient mortality as the outcome and explored mortality prediction with five ML models. XGBoost performed better than the other predictive models on the data of surgical ICU patients who had undergone craniotomy, and was also better at discovering the high-risk factors of mortality. After model interpretation, the variables with a great effect on the outcome were identified. Director Yu of the surgical ICU of Fujian Provincial Hospital confirmed that the results were mostly consistent with general medical knowledge and could provide decision support and research directions for medical staff.


Acknowledgments

We sincerely thank Yidu Cloud (Beijing) Technology Co. Ltd., Beijing, China for providing technical support in data extraction using the YiduCloud Disease database.

Disclosure statement

Shaobo Wang and Jingqing Xu contributed equally to the study. No potential conflict of interest was reported by the author(s). There are no financial conflicts of interest to disclose.

Supplementary material

Supplemental data for this article can be accessed on the publisher’s website

Additional information

Funding

This work was supported by the National Natural Science Foundation of China [11832003]; Fujian Province Intensive Medical Center Construction Project [2017-510].

References