Research Article

Learning from unbalanced catastrophic out-of-pocket health expenditure dataset: blending SMOTE-boosting with ensemble models

Received 14 Mar 2022, Accepted 27 Oct 2022, Published online: 11 Nov 2022
 

ABSTRACT

This study demonstrates the benefits of synthetic data generation with the Synthetic Minority Oversampling Technique (SMOTE) and combines this procedure with SMOTEBoosting, applying learning algorithms to model an unbalanced dataset of catastrophic out-of-pocket (OOP) health expenditure. Nationally representative household budget survey data for 2012 were obtained from the Turkish Statistical Institute. A total of 9987 households responded to the survey. The original dataset was highly unbalanced: 0.14% of households faced catastrophic health expenses. SMOTE was used to perform balanced oversampling, and 10 artificial datasets were generated, with sizes ranging from 10% to 100% of the majority group in the original training data. To predict catastrophic OOP health expenditures, SMOTEBoosting was embedded with learning algorithms such as C5.0, random forest (RF), naïve Bayes, and support vector machine. The results confirm the outstanding prediction performance of the blended strategy of SMOTEBoosting with RF (area under the curve > 0.85). A variable importance plot and a decision tree show that being at least 65 years of age is the most important predictor of catastrophic cases. The findings highlight that multistrategy ensemble learning techniques are useful for modelling highly unbalanced datasets.
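
The oversampling step described in the abstract can be illustrated with a minimal sketch of SMOTE-style interpolation. This is not the authors' implementation: the function name `smote_oversample`, the parameter `k`, the toy data, and the reading that each artificial dataset brings the minority class up to 10%, 20%, ..., 100% of the majority-class size are all assumptions made for illustration. It assumes numeric features and Euclidean distance and uses only the Python standard library.

```python
import math
import random


def smote_oversample(minority, n_new, k=5, seed=0):
    """Create n_new synthetic points by interpolating between a minority
    sample and one of its k nearest minority neighbours (SMOTE's core idea).
    Hypothetical helper for illustration only."""
    if len(minority) < 2:
        raise ValueError("need at least two minority samples")
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)
    synthetic = []
    for _ in range(n_new):
        i = rng.randrange(len(minority))
        base = minority[i]
        # k nearest minority neighbours of the base point (excluding itself)
        others = [p for j, p in enumerate(minority) if j != i]
        neighbours = sorted(others, key=lambda p: math.dist(base, p))[:k]
        neighbour = rng.choice(neighbours)
        gap = rng.random()  # position along the segment base -> neighbour
        synthetic.append(
            tuple(b + gap * (n - b) for b, n in zip(base, neighbour))
        )
    return synthetic


# Toy data (hypothetical): 4 minority rows and a majority class of 40 rows.
minority = [(1.0, 2.0), (1.5, 1.8), (1.2, 2.4), (0.9, 2.1)]
n_majority = 40

# Ten target sizes: minority class grown to 10% ... 100% of the majority size.
datasets = {}
for pct in range(10, 101, 10):
    target = n_majority * pct // 100
    n_new = max(target - len(minority), 0)
    datasets[pct] = minority + smote_oversample(minority, n_new, k=3, seed=pct)
```

Each synthetic row lies on a line segment between two real minority rows, so the new points stay within the region the minority class already occupies. In a SMOTEBoost-style procedure, a step like this would be repeated inside each boosting round before the base learner is fitted; that loop is omitted here.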

Disclosure statement

No potential conflict of interest was reported by the author(s).
