Gastroenterology & Hepatology

Feasibility of machine learning-based modeling and prediction using multiple centers data to assess intrahepatic cholangiocarcinoma outcomes

Pages 215-223 | Received 20 Sep 2022, Accepted 13 Dec 2022, Published online: 28 Dec 2022

Abstract

Background and aims

Currently, there is still no definitive consensus on the treatment of intrahepatic cholangiocarcinoma (iCCA). This study aimed to build a clinical decision support tool based on machine learning using the Surveillance, Epidemiology, and End Results (SEER) database and data from the Fifth Medical Center of the PLA General Hospital in China.

Methods

A total of 4,398 eligible patients from the SEER database and 504 eligible patients from the hospital data, all with histologically proven iCCA, were enrolled for machine learning-based modeling with cross-validation. All models were trained using the open-source Python library scikit-survival version 0.16.0. The Shapley additive explanations method was used to help clinicians better understand the results. Permutation importance was calculated using the ELI5 library.

Results

All involved treatment modalities could contribute to a better prognosis. Three models were derived and tested using different data sources, with concordance indices of 0.67, 0.69, and 0.73, respectively. For randomly selected patients, the predictions were consistent with the actual outcomes. Model 2, trained using the hospital data, was selected to develop an online tool because of its advantage in predicting short-term prognosis.

Conclusion

The prediction model and tool established in this study can be applied to predict the prognosis of iCCA after treatment by inputting the patient’s clinical parameters or TNM stages and treatment options, thus contributing to optimal clinical decisions.

    KEY MESSAGES

  • A prognostic model related to disease staging and treatment mode was constructed using machine learning, based on multicenter big data.

  • The online calculator can predict the short-term survival prognosis of intrahepatic cholangiocarcinoma and thus help clinicians make the best clinical decision.

  • The online calculator built to calculate the mortality risk and overall survival can be easily obtained and applied.

Introduction

Intrahepatic cholangiocarcinoma (iCCA) is the second most common primary liver cancer, accounting for 10–15% of all primary liver cancers [Citation1], with rising incidence and mortality rates globally [Citation2]. Radical surgery is the curative treatment for early-stage iCCA. However, many patients are diagnosed at an advanced stage and lose the opportunity to undergo radical surgery. In addition, even after radical surgery, five-year survival remains poor, at less than one-third [Citation3]. Achieving optimal outcomes depends on a skilled, multidisciplinary team that is experienced in the management of advanced biliary disease [Citation4]. Chemotherapy, locoregional therapies (such as percutaneous ablation, transarterial chemoembolization and external radiation) and systemic therapy, alone or in combination, represent valid options to improve survival in iCCA patients, especially those who are poor candidates for resection [Citation5]. The gemcitabine and cisplatin combination regimen is recommended as the standard first-line systemic therapy for iCCA patients [Citation6]. The role of targeted therapy and immunotherapy is still inconclusive, and the patient subgroups that can benefit from monotherapy or combination therapy with standard-of-care chemotherapy remain to be identified [Citation4].

There is still no mature recommended regimen for the multidisciplinary treatment of iCCA. Therefore, retrospective studies based on historical data are of great significance for clinical decision-making. This study aimed to retrospectively analyze the staging, treatment, and prognosis of iCCA patients from the database of the Fifth Medical Center of PLA General Hospital in China (hospital data) and the Surveillance, Epidemiology, and End Results (SEER) database, which is supported by the Surveillance Research Program of the NCI Division of Cancer Control and Population Sciences [Citation7], and to build a predictive model that incorporates treatment methods to assist doctors and patients in making optimal clinical decisions.

Methods

Study design and cohort

Patients histologically diagnosed with iCCA at the Fifth Medical Center from 2010 to 2020 were enrolled. Patients were excluded if they met any of the following criteria: (1) pathological diagnosis of iCCA performed at other institutions; (2) extrahepatic CCA or gallbladder cancer; (3) mixed or combined hepatocellular carcinoma-CCA; (4) death within one month after operation; and (5) only one hospitalization without a second follow-up (Figure 1a).

Figure 1. Data screening process from the hospital database (1a) and SEER database (1b).

Patients with histologically proven iCCA who had valid follow-up data from 2000 to 2018 in the SEER Plus database were included. These patients were also required to have documented ethnicity and clear TNM staging. Patients were excluded if they were diagnosed with iCCA through autopsy or death certificates only. ICD-10-CM code C22.1 and ICD-O-3 code 8160/3 were used for screening. The data selection process for the SEER database is shown in Figure 1b.

Intervention and outcome variables

The variables common to both data sources were extracted, including sex, age, race, tumor size, tumor number, peritoneal invasion, vascular invasion, perforation of visceral peritoneum, local extrahepatic structure invasion, lymph node metastasis, and distant metastasis. The fibrosis score was discarded because its values were missing for approximately 90% of the patients in the SEER database. Tumor staging was coded and unified according to the seventh or eighth edition of the TNM staging system of the American Joint Committee on Cancer [Citation8]. Interventions were classified as surgical therapy, locoregional therapy, radiation, and systemic therapy, which in this study mainly involved chemotherapy, along with a small amount of targeted therapy and immunotherapy. The variable labeled ‘Specific Surgery of Primary Site Codes’ in the SEER database was split into surgical therapy and locoregional therapy, with codes 10–17 classified as locoregional therapy and the others as surgical therapy.
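The code split described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name, the flag names, and the treatment of code 0 as "no procedure" are hypothetical and are not taken from the study's own code. The text specifies only that codes 10–17 map to locoregional therapy and the others to surgical therapy.

```python
# Hypothetical helper: the names and the handling of code 0 are assumptions,
# not taken from the study's code.
LOCOREGIONAL_CODES = range(10, 18)  # codes 10-17, as stated in the text


def split_surgery_code(code, no_procedure_codes=(0,)):
    """Map one surgery-of-primary-site code to two binary treatment flags."""
    if code in no_procedure_codes:
        # Assumption: these codes mean that no procedure was performed.
        return {"surgical_therapy": 0, "locoregional_therapy": 0}
    if code in LOCOREGIONAL_CODES:
        return {"surgical_therapy": 0, "locoregional_therapy": 1}
    # Every other code counts as surgical therapy, following the text.
    return {"surgical_therapy": 1, "locoregional_therapy": 0}


flags = [split_surgery_code(code) for code in (0, 12, 17, 30)]
```

Passing `no_procedure_codes=()` reproduces the rule exactly as worded in the text, where every code outside 10–17 is counted as surgical therapy.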

Data analysis and model construction

In this study, prognostic models that predict survival from tumor stage and treatment options were explored and validated to help clinicians make optimal decisions. Variables such as age, sex, tumor characteristics, TNM stage, and treatment modalities were subjected to univariate Cox regression analysis.

The outcome data consist of a binary value indicating whether the patient had died or was alive and a time value representing the survival time or the observation and follow-up time. The random survival forest, an extension of the random forest model that is suitable for analyzing time-to-event data, was chosen to construct the prognostic models, using the open-source Python library scikit-survival version 0.16.0 (Python version 3.7.6) [Citation9,Citation10]. Three models, generated and validated using different data sources, were built and compared. The training and testing sets for the three models were obtained as follows:

  • Model 1: The model was trained using 4398 patients from the SEER dataset and tested with 504 patients from hospital data.

  • Model 2: Training was conducted using 504 patients from the hospital, while testing was conducted among 499 Asian patients from the SEER dataset, whose maximum and minimum survival times were comparable with those of the training set. Since the training set came from a single center, bootstrap resampling was used for internal validation.

  • Model 3: The two datasets were pooled and, after removing duplicates, randomly split into a training set of 80% (3661) and a test set of the remaining 20% (916).
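As a rough illustration of the Model 3 data preparation (pool, remove duplicates, random 80/20 split), the following standard-library sketch may help. It assumes that a duplicate is an exactly identical record, which the text does not specify, and the toy records, function name, and seed are hypothetical.

```python
import random


def pooled_split(records_a, records_b, test_fraction=0.2, seed=0):
    """Pool two record lists, drop exact duplicates, and split at random."""
    # dict.fromkeys keeps the first occurrence of each record, in order.
    pooled = list(dict.fromkeys(records_a + records_b))
    rng = random.Random(seed)
    rng.shuffle(pooled)
    n_test = round(len(pooled) * test_fraction)
    return pooled[n_test:], pooled[:n_test]  # (train, test)


# Toy records: (patient_id, survival_months, died). Values are illustrative only.
seer = [(f"S{i}", 6 + i, i % 2) for i in range(7)]
hospital = [(f"H{i}", 10 + i, 1) for i in range(3)] + [seer[0]]  # one duplicate
train, test = pooled_split(seer, hospital)
```

Fixing the seed makes the split reproducible, which matters when the same test set is reused to compare models.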

The performance of the models was assessed using the concordance index (C-index) on the training and test datasets, as well as the C-index based on the out-of-bag estimate of the training sets. The cumulative/dynamic time-dependent area under the receiver operating characteristic (ROC) curve (AUC) was also calculated to evaluate the models [Citation9]. As part of the evaluation of Model 3, a 95% confidence interval (CI) was calculated for each performance measure by bootstrapping a sample (60%) from the training set and test set 500 times.
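To make the two evaluation ideas above concrete, here is a minimal pure-Python sketch of Harrell's C-index for right-censored data together with a percentile bootstrap interval. It is not the scikit-survival implementation used in the study. Tied survival times are ignored for simplicity, and resampling with replacement is an assumption, since the text states only the 60% sample size and the 500 repetitions. All function names and the toy data are hypothetical.

```python
import random


def concordance_index(times, events, risks):
    """Harrell's C-index for right-censored data (tied times are skipped).

    A pair (i, j) is comparable when subject i had an observed event and a
    shorter time than subject j. It is concordant when subject i also has
    the higher predicted risk; tied risks count as half.
    """
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        for j in range(n):
            if events[i] and times[i] < times[j]:
                comparable += 1
                if risks[i] > risks[j]:
                    concordant += 1.0
                elif risks[i] == risks[j]:
                    concordant += 0.5
    return concordant / comparable


def bootstrap_ci(times, events, risks, n_boot=500, fraction=0.6, seed=0):
    """Percentile 95% interval of the C-index over repeated resamples."""
    rng = random.Random(seed)
    n = len(times)
    size = max(2, int(n * fraction))
    scores = []
    while len(scores) < n_boot:
        idx = [rng.randrange(n) for _ in range(size)]  # with replacement (assumed)
        t = [times[k] for k in idx]
        e = [events[k] for k in idx]
        r = [risks[k] for k in idx]
        try:
            scores.append(concordance_index(t, e, r))
        except ZeroDivisionError:
            continue  # this resample had no comparable pair; draw again
    scores.sort()
    return scores[int(0.025 * n_boot)], scores[int(0.975 * n_boot) - 1]


# Toy data: follow-up in months, event flag (1 = died), predicted risk score.
times = [5, 8, 12, 20, 30, 36]
events = [1, 1, 0, 1, 0, 1]
risks = [0.9, 0.7, 0.8, 0.4, 0.5, 0.1]
c_index = concordance_index(times, events, risks)  # 9 of 11 comparable pairs
low, high = bootstrap_ci(times, events, risks)
```

A C-index of 0.5 corresponds to random ranking and 1.0 to perfect ranking, which gives a scale for reading the reported values of 0.67 to 0.73.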

To evaluate the effect of the features, permutation importance was calculated by measuring the reduction in the test score after randomly shuffling each feature [Citation11]. Shapley additive explanations (SHAP) was also used to explain these models [Citation12,Citation13]. SHAP is a model interpretation package developed in Python that can interpret the output of any machine-learning model. For each predicted sample, a SHAP value was assigned to each feature. The larger the absolute SHAP value, the greater the influence of the feature. The sign of the value indicated whether the feature positively or negatively influenced the result, and the red and blue colors represented the value of the feature. The redder the dot, the higher the value of the feature relative to all the samples; the bluer the dot, the lower the value. Permutation importance was calculated using the ELI5 library (version 0.11.0), and SHAP values were calculated using the SHAP Python framework (version 0.40.0).
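The idea behind permutation importance can be shown with a short standard-library sketch: shuffle one feature column at a time and record how much the score drops. This is a generic stand-in for illustration, not the ELI5 API used in the study. The toy scoring function, data, and parameter names are hypothetical.

```python
import random


def permutation_importance(score_fn, rows, targets, n_repeats=20, seed=0):
    """Mean drop in score after shuffling each feature column in turn."""
    rng = random.Random(seed)
    baseline = score_fn(rows, targets)
    importances = []
    for col in range(len(rows[0])):
        drops = []
        for _ in range(n_repeats):
            shuffled = [row[col] for row in rows]
            rng.shuffle(shuffled)
            # Rebuild each row with only this one column permuted.
            permuted = [row[:col] + [value] + row[col + 1:]
                        for row, value in zip(rows, shuffled)]
            drops.append(baseline - score_fn(permuted, targets))
        importances.append(sum(drops) / n_repeats)
    return importances


def accuracy(rows, targets):
    # Toy "model": predicts the label from feature 0 only and ignores feature 1.
    return sum(row[0] == y for row, y in zip(rows, targets)) / len(rows)


rows = [[0, 1], [1, 0], [0, 0], [1, 1], [0, 1], [1, 0]]
targets = [0, 1, 0, 1, 0, 1]
importances = permutation_importance(accuracy, rows, targets)
```

In this toy case the ignored feature gets an importance of exactly zero, while shuffling the feature the model relies on lowers the score. The same reading applies to a ranking by permutation importance: larger drops mean the model depends more on that feature.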

Results

Baseline characteristics

Data of 4,398 eligible patients with iCCA from the SEER database and 504 eligible patients from the hospital data were extracted (Table 1). Among the 4,398 SEER patients, the mean age was 65 years and 50.16% (n = 2206) were male. White patients accounted for 77.44% (n = 3406), followed by Asian or Pacific Islander (12.61%), Black (9.11%), and American Indian/Alaska Native (0.84%) patients. The proportion of patients in stage IV was the highest at 42.43% (n = 1866), followed by stage I (20.05%), stage IIIB (16.60%), stage II (15.39%), and stage IIIA (5.53%). Systemic therapy was the most common treatment, received by 2429 patients (55.23%). A total of 1122 (25.51%) patients underwent surgical therapy, 687 (15.56%) received radiation, and only 114 (2.59%) received locoregional therapy; 25.74% of patients did not receive any of the above treatments, which could roughly be regarded as palliative care. Among the 504 patients from the hospital data, the mean age was 55 years, and 69.04% were male. The proportion of patients with stage II iCCA was the highest (44.44%; n = 224), followed by stages I (25.99%), IV (15.08%), IIIB (12.70%), and IIIA (1.79%). Locoregional therapy was the most common treatment modality (293 patients, 58.13%), followed by resection (231 patients, 45.83%), systemic therapy (64 patients, 12.7%), and radiation (56 patients, 11.11%); 37.9% of patients did not receive any of the above treatments.

Table 1. Demographic characteristics from SEER database and the Fifth Medical Center of PLA General Hospital dataset.

As shown in Table 2, univariate Cox regression analysis was performed to describe the possible prognostic factors for overall survival. Because only variables common to the two databases were considered, only the hazard ratios and the corresponding P values and CIs of age, sex, race, TNM stage, and the treatment methods are described. For the SEER data, the risk of death was greater in males than in females, the risk increased with age and TNM stage, and all treatments reduced the risk; however, race was not a significant prognostic factor for iCCA patients. For the hospital data, only surgery and TNM stage showed a similarly significant influence on prognosis, which might be due to the small number of patients in some variable categories.

Table 2. Univariate Cox regression analysis of the two databases.

Model performance

All eight variables in Table 2 were used to predict overall survival in the different models. The performance evaluation measurements of the models are summarized in Table 3. The test C-indexes for Models 1, 2, and 3 were 0.67, 0.69 and 0.73 (95% CI: 0.71–0.73), respectively. The C-index of internal validation for Model 2 was 0.7, obtained by bootstrap resampling 100 datasets of 400 to 500 records from the hospital data. The time-dependent AUCs at 15 time points, which were selected within the 5th and 95th percentiles of the survival time distribution of each model’s corresponding test set, are shown in Figure 2. Model 3 outperformed Models 1 and 2 at all time points. Model 1 was better at predicting long-term survival status (after 15 months), while Model 2 was better at predicting short-term survival status (within 15 months).

Figure 2. Time-dependent AUC for the three models (Dash-line: mean AUC).

Table 3. Model performance summary.

Feature importance of the three models

The feature importance rankings based on permutation importance for all three models are listed in Table 4. All of these were calculated using the test datasets described in the Methods section. SHAP was also used to explain the models, and the summary plots are shown in Figure 3. In all three models, ‘Surgical Therapy’, ‘Systemic Therapy’, and ‘TNM Stage’ were the three most important variables in predicting the survival time of iCCA patients. From the SHAP summary plots, the following patterns were observed:

Figure 3. SHAP summary plots of three models.

Note: Each point on the plot represents a particular feature of a particular patient. Its y-coordinate is determined by the feature that the point represents, and its x-coordinate is determined by its impact on the model’s output, which, in our case, is the risk score. The color of the point indicates its value from high to low, according to the color bar on the right. The features on the y-axis are ordered by their importance.

Table 4. Feature importance calculated based on permutation importance.

  • TNM Stage: An advanced stage yielded a higher risk score.

  • Surgical therapy: Patients with a history of surgical therapy had lower risk scores, while those without a history of surgical therapy had higher risk scores.

  • Locoregional therapy: The absence of previous locoregional therapy had no significant impact on the risk score, but a history of locoregional therapy lowered the risk score.

  • Systemic therapy: Similar to surgical therapy, patients with a history of systemic therapy had lower risk scores, while those without a history of systemic therapy had higher risk scores.

  • Radiation: The absence of previous radiotherapy did not significantly influence the risk scores, but a history of radiotherapy lowered the risk scores.

  • Age: Older patients tended to have higher risk scores.

  • Sex: Males were more likely to have higher risk scores than females.

  • Race: No significant influence.

Model demonstration on random patients

To visualize the prediction results of the models, six patients were randomly selected from the hospital data, and their features were input into the three models for prediction. The results are shown in Figure 4. The upper three subfigures show the survival functions of each model, which demonstrate the change in survival probability over time for different patients. The lower three subfigures show the cumulative hazard function of each model, which reflects the cumulative risk of death of the patients at different time points. The curves were consistent with the true survival times of the patients.
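For readers less familiar with the two quantities plotted, the sketch below estimates a survival function and a cumulative hazard function from a handful of (time, event) pairs using the classical Kaplan-Meier and Nelson-Aalen estimators. It is not the study's model and does not produce patient-specific curves. It only illustrates what the two kinds of curve represent, and the function name and toy data are hypothetical.

```python
def survival_and_cumulative_hazard(times, events):
    """Return [(time, survival, cumulative_hazard)] at each observed event time."""
    survival = 1.0
    cum_hazard = 0.0
    curve = []
    event_times = sorted({t for t, e in zip(times, events) if e})
    for t in event_times:
        at_risk = sum(1 for u in times if u >= t)
        deaths = sum(1 for u, e in zip(times, events) if u == t and e)
        survival *= 1.0 - deaths / at_risk  # Kaplan-Meier step
        cum_hazard += deaths / at_risk      # Nelson-Aalen step
        curve.append((t, survival, cum_hazard))
    return curve


# Toy follow-up in months; event flag 1 = died, 0 = censored.
times = [3, 5, 5, 8, 12]
events = [1, 1, 0, 1, 0]
curve = survival_and_cumulative_hazard(times, events)
```

The survival curve can only fall and the cumulative hazard can only rise, so a patient with a poorer predicted prognosis shows a faster-dropping survival function and a steeper cumulative hazard function.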

Figure 4. Six random patients’ survival functions and cumulative hazard functions were predicted by three different models.

Model output

Since the overall survival of patients with iCCA was relatively poor, short-term prediction was more helpful for tailoring clinical decision making. In addition, because Model 2 was trained using the hospital data and tested using Asian patients in the SEER database, it was more suitable for Chinese patients, considering the etiological differences between races and the treatment disparity between cohorts [Citation14]. Thus, we chose to develop a web application based on Model 2, which is available on a public website (http://icc-predict.dr-gobroad.com/). Overall survival and mortality risk can be quickly estimated by inputting clinical parameters, TNM stage, and treatment options (Supplementary Fig. 1).

Discussion

In this study, the machine learning-based model provided a convenient tool not only for predicting the outcome of iCCA but also for improving the outcome by helping physicians make the optimal clinical decision. Although many studies have focused on prognostic or predictive models of iCCA, few have incorporated the main treatment methods, and thus they could not be directly used to choose among multidisciplinary treatment options. Apart from the prognostic factors in the 8th AJCC staging system, other variables such as microvascular invasion, tumor budding and perineural invasion are also strong independent predictors of survival for iCCA patients after resection; however, these variables can only be accurately evaluated in patients who have undergone surgery [Citation14,Citation15]. To explore the common prognostic factors across the stages of iCCA, we also used the hospital data to generate a nomogram involving ALB, LDH, serum iron, FIB, Ca125, Ca199, tumor number, lymph node invasion and distant metastasis [Citation16]. Many other serum indicators have been reported to be related to the prognosis of iCCA, such as CRP [Citation17,Citation18], CEA [Citation19], and D-dimer [Citation20]. Ca199 is a frequently mentioned prognostic marker of iCCA, and it is also closely related to the diagnosis of iCCA [Citation21]. Because it is missing from the SEER database, Ca199 was not included in this study. Considering the main purpose of this study, the power of the prognostic factors and the accessibility of the data, we finally used age, sex, the factors in the TNM stage and the main treatment methods to build the model. In the absence of mature recommendations, models built with treatment options to predict survival can serve as a powerful tool for new subjects [Citation22,Citation23].

Machine learning has been increasingly applied in clinical settings to assist physicians with better recommendations [Citation24,Citation25]. Compared with traditional data analysis, machine learning can more easily process complex and multi-dimensional big data [Citation26]. In this study, machine learning was applied to big data with obvious heterogeneity, such as regional differences, ethnic differences, changes in treatment methods over the years, and different combinations of treatment methods. It was an innovative attempt that helped us build a relatively stable model from complex data.

Due to the different data sources, the inclusion/exclusion criteria of the SEER and hospital data were not completely consistent. Nevertheless, some criteria were shared, for example, the same version of the TNM stage. Therefore, we believed the two datasets could be used to validate each other. A limitation of this study was that the number of subjects from a single local hospital was relatively small. Although we used data from the SEER database for external validation, some important variables are still lacking because of the differences between the two data sources. For example, we could not directly obtain information regarding the underlying causes of liver disease from the SEER database. In addition, because of the sample size, the treatment methods could only be broadly divided into surgical therapy, locoregional therapy, radiation, and systemic therapy. Only a few patients received immunotherapy, so they could only be included under systemic therapy. To minimize data inconsistency, Model 2 was trained using the hospital data and tested on SEER patients selected by ethnicity and survival time, and it was then used to generate the online calculator. Thus, this prediction tool should perform better among Chinese and Asian populations.

Novel detection and treatment methods should be explored for iCCA patients with a predicted poor prognosis. Genomic and transcriptome profiling can be used to identify mutations or aberrations of targeted genes (such as TP53, KRAS, IDH1/2 and FGFR) for novel therapies (such as IDH1/2 inhibitors and FGFR inhibitors) [Citation27]. Liquid biopsies could play a major role as minimally invasive screening and diagnostic biomarkers, prognostic tools and therapeutic monitoring targets [Citation28]. Furthermore, these patients can also be referred to or encouraged to enter ongoing clinical trials and ultimately strive for the best outcomes.

With the emergence of new treatment methods, current prediction models need to be updated with more valuable data to meet clinical standards. We are also exploring a dynamic artificial intelligence model that can continuously integrate and learn from new medical records, thus achieving more accurate prediction. Our machine learning-based prediction model for iCCA is expected to serve as an effective and convenient tool that supplements high-evidence clinical practice and supports physicians’ clinical decisions.

Ethical approval

The study was reviewed and approved by the Ethics Review Committee at The Fifth Medical Center of Chinese PLA General Hospital.

Author contributions

Shuang-Nan Zhou, Da-Wei Jv and Xiang-Fei Meng drafted the manuscript. Ning Zhang, Da-Wei Jv, and Yin-Ying Lu designed the whole research. Xiang-Fei Meng participated in the design of the research from the perspective of hepatobiliary surgery. Ning Zhang and Yin-Ying Lu revised the manuscript. Chun Liu abstracted the data from the SEER database. Chun Liu and Ze-Yi Wu analyzed the data and generated the figures. Xiang-Fei Meng rechecked the TNM stage of the data. Na Hong double-checked the process of data-abstracting and the analysis. Shuang-Nan Zhou and Jing-Jing Zhang collected the data in The Fifth Medical Center of Chinese PLA General Hospital and completed the follow-up. All authors approved the final version of the manuscript.

Acknowledgements

None.

Disclosure statement

No potential conflict of interest was reported by the author(s).

Data availability statement

The data that support the findings of this study involve two parts: data from the SEER are available from the author Chun Liu, and the data from the hospital are regulated by Chinese law, so only authorized staff can access the confidential data and the details can be discussed with the corresponding author Dr. Ning Zhang.

Correction Statement

This article has been corrected with minor changes. These changes do not impact the academic content of the article.

Additional information

Funding

The study was supported by Capital’s Funds for Health Improvement and Research [NO. Z181100001718075].

References

  • Mejia JC, Pasko J. Primary liver cancers: intrahepatic cholangiocarcinoma and hepatocellular carcinoma. Surg Clin North Am. 2020;100(3):535–549.
  • Gruttadauria S, Barbera F, Pagano D, et al. Liver transplantation for unresectable intrahepatic cholangiocarcinoma: the role of sequencing genetic profiling. Cancers. 2021;13(23):6049.
  • Tashiro S, Tsuji T, Miyake H, et al. Strategy for improving the prognosis of patients with intrahepatic cholangiocarcinoma by surgical treatment: considerations based on experience and a literature review. J Med Invest. 2021;68(1.2):15–21.
  • Valle JW, Kelley RK, Nervi B, et al. Biliary tract cancer. Lancet. 2021;397(10272):428–444.
  • Kodali S, Shetty A, Shekhar S, et al. Management of intrahepatic cholangiocarcinoma. J Clin Med. 2021;10(11):2368.
  • Kelley RK, Bridgewater J, Gores GJ, et al. Systemic therapies for intrahepatic cholangiocarcinoma. J Hepatol. 2020;72(2):353–363.
  • Surveillance, Epidemiology, and End Results (SEER) Dataset; [cited 2021 Jun]. Available from: https://seer.cancer.gov/data-software/.
  • Lee AJ, Chun YS. Intrahepatic cholangiocarcinoma: the AJCC/UICC 8th edition updates. Chin Clin Oncol. 2018;7(5):52.
  • Pölsterl S. Scikit-survival: a library for time-to-event analysis built on top of scikit-learn. J Mach Learn Res. 2020;21:1–6.
  • Ishwaran H, Kogalur UB, Blackstone EH, et al. Random survival forests. Ann Appl Stat. 2008;2(3):841–860.
  • Breiman L. Random forests. Mach Learn. 2001;45(1):5–32.
  • Lundberg SM, Lee SI. A unified approach to interpreting model predictions. Conference on Neural Information Processing Systems. 2017; December: 4768–4777.
  • Lundberg SM, Nair B, Vavilala MS, et al. Explainable machine-learning predictions for the prevention of hypoxaemia during surgery. Nat Biomed Eng. 2018;2(10):749–760.
  • Tanaka M, Yamauchi N, Ushiku T, et al. Tumor budding in intrahepatic cholangiocarcinoma: a predictor of postsurgery outcomes. Am J Surg Pathol. 2019;43(9):1180–1190.
  • Wei T, Zhang XF, He J, et al. Prognostic impact of perineural invasion in intrahepatic cholangiocarcinoma: multicentre study. Br J Surg. 2022;109(7):610–616.
  • Zhou SN, Lu SS, Ju DW, et al. A new prognostic model covering all stages of intrahepatic cholangiocarcinoma. J Clin Transl Hepatol. 2022;10(2):254–262.
  • Yeh YC, Lei HJ, Chen MH, et al. C-Reactive protein (CRP) is a promising diagnostic immunohistochemical marker for intrahepatic cholangiocarcinoma and is associated with better prognosis. Am J Surg Pathol. 2017;41(12):1630–1641.
  • Zheng BH, Yang LX, Sun QM, et al. A new preoperative prognostic system combining CRP and CA199 for patients with intrahepatic cholangiocarcinoma. Clin Transl Gastroenterol. 2017;8(10):e118.
  • Moro A, Mehta R, Sahara K, et al. The impact of preoperative CA19-9 and CEA on outcomes of patients with intrahepatic cholangiocarcinoma. Ann Surg Oncol. 2020;27(8):2888–2901.
  • Chen Q, Zheng Y, Zhao H, et al. The combination of preoperative D-dimer and CA19-9 predicts lymph node metastasis and survival in intrahepatic cholangiocarcinoma patients after curative resection. Ann Transl Med. 2020;8(5):192.
  • Lee JW, Lee JH, Park Y, et al. Prognostic impact of perioperative CA19-9 levels in patients with resected perihilar cholangiocarcinoma. J Clin Med. 2021;10(7):1345.
  • Rafique R, Islam SMR, Kazi JU. Machine learning in the prediction of cancer therapy. Comput Struct Biotechnol J. 2021;19:4003–4017.
  • Kourou K, Exarchos TP, Exarchos KP, et al. Machine learning applications in cancer prognosis and prediction. Comput Struct Biotechnol J. 2015;13:8–17.
  • Mosele F, Remon J, Mateo J, et al. Recommendations for the use of next-generation sequencing (NGS) for patients with metastatic cancers: a report from the ESMO precision medicine working group. Ann Oncol. 2020;31(11):1491–1505.
  • Miao R, Chen HH, Dang Q, et al. Beyond the limitation of targeted therapy: improve the application of targeted drugs combining genomic data with machine learning. Pharmacol Res. 2020;159:104932.
  • Deo RC. Machine learning in medicine. Circulation. 2015;132(20):1920–1930.
  • Kennedy L, Zhang W, Ekser B, et al. Current advances in basic and translational research of cholangiocarcinoma. Cancers. 2021;13(13):3307.
  • Rompianesi G, Di Martino M, Gordon-Weeks A, et al. Liquid biopsy in cholangiocarcinoma: current status and future perspectives. World J Gastrointest Oncol. 2021;13(5):332–350.