Machine Learning vs. Survival Analysis Models: a study on right censored heart failure data

Pages 1899-1916 | Received 03 Sep 2021, Accepted 26 Mar 2022, Published online: 11 Apr 2022
 

Abstract

Machine Learning models are known to capture the intricacies of data well, but native ML models cannot be used in time-to-event analysis because of censoring. In this paper, we explore the use of Machine Learning models in Survival Analysis using the right-censored Heart Failure Clinical Records Dataset. We first identify the most important features responsible for death due to heart failure using Recursive Feature Elimination, and then examine how Machine Learning models can be adapted to improve time-to-event analysis outcomes. To deal with censoring, the Machine Learning models are modified using Inverse Probability of Censoring Weighting (IPCW) and IPCW Bagging, and are trained on the processed dataset alongside various Survival Analysis models. The area under the time-dependent ROC curve (AUC) is used as the performance metric. The results show that the average AUC for the Survival Analysis models is 0.51, while that of the Machine Learning models increased to 0.80 with IPCW and to 0.82 with IPCW Bagging. This indicates that Machine Learning models outperform Survival Analysis models in time-to-event analysis of a right-censored dataset and hence are better indicators of the risk of heart disease.
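To make the IPCW idea concrete, the following is a minimal sketch and not the authors' implementation. It assumes a fixed prediction horizon `tau` and estimates the censoring survival function G(t) with a simple Kaplan-Meier estimator. The function names `censoring_survival` and `ipcw_weights`, the horizon, and the toy data are all hypothetical. Subjects censored before `tau` receive weight 0; subjects with an observed event by `tau` receive weight 1/G(T-); subjects still under observation after `tau` receive weight 1/G(tau).

```python
import numpy as np


def censoring_survival(times, events):
    """Kaplan-Meier estimate of the censoring survival function G(t) = P(C > t).

    Censoring (events == 0) is treated as the outcome of interest.
    Ties between events and censorings are handled naively in this sketch.
    """
    times = np.asarray(times, dtype=float)
    censored = 1 - np.asarray(events, dtype=int)
    uniq = np.unique(times)
    surv = np.empty(len(uniq))
    s = 1.0
    for k, t in enumerate(uniq):
        at_risk = np.sum(times >= t)                  # still under observation at t
        d = np.sum((times == t) & (censored == 1))    # censorings exactly at t
        s *= 1.0 - d / at_risk
        surv[k] = s

    def G(t, left_limit=False):
        # Step-function lookup: G(t) by default, G(t-) when left_limit=True.
        side = "left" if left_limit else "right"
        idx = np.searchsorted(uniq, t, side=side) - 1
        return np.where(idx >= 0, surv[np.clip(idx, 0, None)], 1.0)

    return G


def ipcw_weights(times, events, tau):
    """Binary labels (event by tau) and IPCW weights for a fixed horizon tau."""
    times = np.asarray(times, dtype=float)
    events = np.asarray(events, dtype=int)
    G = censoring_survival(times, events)

    event_before = (times <= tau) & (events == 1)   # outcome known: event
    survived = times > tau                          # outcome known: no event by tau
    labels = event_before.astype(int)

    weights = np.zeros(len(times))                  # censored before tau -> weight 0
    weights[event_before] = 1.0 / G(times[event_before], left_limit=True)
    weights[survived] = 1.0 / G(tau)
    return labels, weights


# Toy data (hypothetical): follow-up times and event indicators (1 = death).
times = [2, 3, 5, 7, 8, 10]
events = [1, 0, 1, 0, 1, 1]
labels, weights = ipcw_weights(times, events, tau=6)
print(labels)   # binary outcome at the horizon
print(weights)  # per-subject weights for a weighted classifier
```

The resulting weights can then be supplied to any classifier that accepts per-sample weights, so that subjects with a known outcome stand in for comparable subjects who were censored early. The sketch assumes G is strictly positive at every time where it is evaluated; in practice the weights are often truncated to avoid very large values.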

Disclosure statement

There is no conflict of interest.

Correction Statement

This article has been corrected with minor changes. These changes do not impact the academic content of the article.

Acknowledgements

The authors acknowledge the support provided by the Indian Institute of Technology Hyderabad, India.

Notes

1 Code is adapted from Gonzalez Ginestet et al. (2021); R and Python are used for coding.

2 Cure fractions are calculated by considering mean values for the features which are not varying, i.e., mean EF is 38.08% and mean Creatinine is 1.394 mg/dL.

3 milliequivalents per litre

4 milligrams per decilitre

5 microliter

6 micrograms per liter

7 The smooth functions considered are an adaptive smooth spline over the covariate EF and a thin plate regression spline over Creatinine multiplied by a factor of EF.
