Research Article

Mortality forecasting using stacked regression ensembles

Pages 591-626 | Received 11 Apr 2021, Accepted 25 Oct 2021, Published online: 16 Nov 2021
 

Abstract

Many alternative approaches for selecting mortality models and forecasting mortality have been proposed. The usual practice is to base forecasts on a single mortality model selected using in-sample goodness-of-fit measures. However, cross-validation measures are increasingly being used in model selection, and model combination methods are becoming a common alternative to using a single mortality model. We propose and assess a stacked regression ensemble that optimally combines different mortality models to reduce out-of-sample mean squared errors and mitigate model selection risk. Stacked regression uses a meta-learner to approximate horizon-specific weights by minimizing a cross-validation criterion for each forecasting horizon. The horizon-specific weights determine a mortality model combination customized to each horizon. We use 44 populations from the Human Mortality Database to compare the stacked regression ensemble with alternative methods. We show that, using one-year-ahead to 15-year-ahead out-of-sample mean squared errors, the stacked regression ensemble improves mortality forecast accuracy by 13% to 49% for males and 19% to 90% for females over individual mortality models. The stacked regression ensembles also have better predictive accuracy than other model combination methods, including Simple Model Averaging, Bayesian Model Averaging, and Model Confidence Set. We provide an R package, CoMoMo, that combines forecasts for Generalized-Age-Period-Cohort models.

Disclosure statement

No potential conflict of interest was reported by the author(s).

Notes

1 Mortality rates on the log-scale reduce the impact of older age differences on the model comparison. Higher mortality variability at older ages can distort model comparison. MSE using the log transform treats small differences between small observed and predicted mortality rates approximately the same as big differences between large observed and predicted mortality rates. As a result, the log transform is appropriate for the relative ranking of the models using the MSE. A small difference in MSE may occur when we use mortality rates directly.
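As a rough illustration of this point, the sketch below compares squared errors on the raw scale and on the log scale. The rates are hypothetical values chosen for the example (one low-mortality age and one high-mortality age, each over-predicted by 10%), not data from the study.

```python
import math

# Hypothetical observed and predicted mortality rates (not from the study):
# a young age with low mortality and an old age with high mortality,
# each over-predicted by the same relative amount (10%).
observed = [0.001, 0.2]
predicted = [0.0011, 0.22]

# Squared errors on the raw mortality-rate scale.
raw_errors = [(o - p) ** 2 for o, p in zip(observed, predicted)]

# Squared errors on the log mortality-rate scale.
log_errors = [(math.log(o) - math.log(p)) ** 2
              for o, p in zip(observed, predicted)]

# Mean squared errors under each scale.
raw_mse = sum(raw_errors) / len(raw_errors)
log_mse = sum(log_errors) / len(log_errors)

print("raw squared errors:", raw_errors)
print("log squared errors:", log_errors)
```

On the raw scale the older-age error dominates the MSE almost entirely, whereas on the log scale the two ages contribute equally because their relative errors are the same.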

2 An alternative imputation technique is the Kalman Smoother. This also fits a random walk with drift but imputes missing values by forecasting from the left and the right, and then applying a smoothing algorithm (Moritz 2018). Given the simplistic nature of the random walk, this just linearly interpolates across the missing region. While a valid technique, we believe that it is inappropriate for our context as it provides an artificially good fit.

3 A slight difference in combined mortality rates may arise when we combine the mortality rates using $(\hat{\mu}(x, t_{n_y}+h))_{\text{comb}} = \sum_{m=1}^{M} w_m(h)\,\hat{\mu}_m(x, t_{n_y}+h)$ instead of Equation (10).
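A minimal sketch of the weighted combination in this note, applied directly to mortality rates for a single age and horizon. The weights and forecasts are hypothetical. For comparison only, the sketch also combines the same forecasts on the log scale and back-transforms; that this is what Equation (10) does is an assumption of the example, since the note does not restate that equation.

```python
import math

def combine_rates(weights, forecasts):
    """Weighted sum of model forecasts on the mortality-rate scale."""
    return sum(w * f for w, f in zip(weights, forecasts))

def combine_log_rates(weights, forecasts):
    """Weighted sum on the log scale, then back-transformed (assumed variant)."""
    return math.exp(sum(w * math.log(f) for w, f in zip(weights, forecasts)))

# Hypothetical horizon-specific weights w_m(h) for M = 3 models, summing to 1.
weights_h = [0.5, 0.3, 0.2]
# Hypothetical forecasts from the 3 models for one age at horizon h.
forecasts = [0.010, 0.012, 0.011]

direct = combine_rates(weights_h, forecasts)
via_logs = combine_log_rates(weights_h, forecasts)

print("combined on rate scale:", direct)
print("combined on log scale: ", via_logs)
```

With non-negative weights that sum to one, the rate-scale combination is a weighted arithmetic mean and the log-scale combination is a weighted geometric mean, so the two differ only slightly when the model forecasts are close to each other.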

4 An equal predictive ability implies that models $L_i$ and $L_j$ are equally good based on a given loss function.

5 We assume that the statistical properties of $\zeta_{ab,x,t}$ do not change over time. This implies that the first moment of $\zeta_{ab,x,t}$ is constant, that is, $\forall t$, $\eta_{ab} = E(\zeta_{ab,x,t})$, and that the second moment of $\zeta_{ab,x,t}$ is finite, that is, $\forall t$, $E(|\zeta_{ab,x,t}|) < \infty$. Therefore, $\zeta_{ab,x,t}$ being weakly stationary makes it possible to determine the best mortality model(s) from the initial collection of models (Hansen et al. 2011).

6 In this study, the short-term horizon corresponds to a period of one to five years, the medium-term horizon to a period of six to 10 years, and the long-term horizon to a period of 11 to 15 years.

7 For BMAB and MCSV, which require a validation set for estimating the weights and selecting the superior models, respectively, we use data from 1960 to 1977 for training and 1978 to 1990 for estimating the weights or selecting the superior models.

8 We view cross-validation as a particular case of stacked regression ensemble where a single model is selected. A stacked regression ensemble relaxes the assumption that one model must be chosen and used to predict mortality rates. We choose multiple mortality models and optimally combine them to maximize out-of-sample accuracy in one step by minimizing the cross-validation criterion (Sridhar et al. 1996). In other model combinations such as standard BMA, we do not have an optimization criterion that selects the models and optimally assigns their weights in one step. Instead, we have to fit each model to the data and measure the Akaike information criterion, which we then use to calculate the weight for each model independently. Thus, selecting the models and estimating the weights are not done in one step.

9 BMA methods assign weights to each model independently without accounting for how the models differ from each other. BMA methods do not incorporate diversity among the models in the weights. Conversely, stacked regression ensembles concurrently estimate the weights to the models using a meta-learner like a lasso regression. As a result, models that produce similar mortality rate forecasts or have the least forecasting accuracy at each forecasting horizon get small or zero weights due to the presence of alternative, more accurate, and diverse models.
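The idea of estimating all weights concurrently can be sketched with a small lasso-style meta-learner fitted by coordinate descent. This is an illustrative sketch, not the CoMoMo implementation: the non-negativity constraint, the penalty value `lam`, the function name `nonneg_lasso`, and the log-rate forecasts are all assumptions made for the example. In the setting described here, a fit like this would be repeated for each forecasting horizon using cross-validation forecasts.

```python
def nonneg_lasso(X, y, lam, n_iter=200):
    """Coordinate descent for min (1/2n)||y - Xw||^2 + lam * sum(w), with w >= 0."""
    n, m = len(y), len(X[0])
    w = [0.0] * m
    for _ in range(n_iter):
        for j in range(m):
            # Partial residual: what is left to explain without model j.
            partial = [y[i] - sum(X[i][k] * w[k] for k in range(m) if k != j)
                       for i in range(n)]
            rho = sum(X[i][j] * partial[i] for i in range(n)) / n
            z = sum(X[i][j] ** 2 for i in range(n)) / n
            # Soft-threshold and clip at zero (non-negativity is an assumption).
            w[j] = max(0.0, rho - lam) / z
    return w

# Hypothetical log mortality rates at five ages for one horizon.
observed = [-4.0, -3.5, -3.0, -2.5, -2.0]
model_a = [-3.95, -3.55, -2.95, -2.55, -1.95]   # accurate model
model_b = list(model_a)                          # duplicate of model A (no diversity)
model_c = [-3.4, -3.9, -2.5, -3.1, -1.6]         # least accurate model

# Rows are ages, columns are models.
X = [list(row) for row in zip(model_a, model_b, model_c)]
weights = nonneg_lasso(X, observed, lam=0.001)

combined = [sum(w * x for w, x in zip(weights, row)) for row in X]
mse_combined = sum((o - c) ** 2 for o, c in zip(observed, combined)) / len(observed)
mse_c = sum((o - c) ** 2 for o, c in zip(observed, model_c)) / len(observed)

print("weights:", weights)
print("combined MSE:", mse_combined, "model C MSE:", mse_c)
```

In this toy setup nearly all the weight goes to model A. The duplicate model B adds no new information and receives essentially zero weight (which of two identical models is kept depends on the update order), and the least accurate model C is dropped by the non-negativity constraint because its errors are positively correlated with those of model A.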

10 MCSC and MCSV select RH as the only superior model for males and hence it is not combined with other models. Therefore, MSEs of RH are similar to both MCSC and MCSV. For females, MCSV selects RH as the only superior model.

11 Uncertainty is the difference between the highest and lowest mortality rate forecasts at any given forecasting horizon (Graefe et al. 2014).
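As a small worked example of this definition, the sketch below computes the range of model forecasts at each horizon. The forecasts and the helper name `forecast_uncertainty` are hypothetical and serve only to illustrate the calculation.

```python
def forecast_uncertainty(forecasts):
    """Range of model forecasts: highest minus lowest mortality rate forecast."""
    return max(forecasts) - min(forecasts)

# Hypothetical mortality rate forecasts from three models for one age,
# keyed by forecasting horizon in years.
forecasts_by_horizon = {
    1: [0.0100, 0.0104, 0.0098],
    5: [0.0095, 0.0105, 0.0090],
    10: [0.0088, 0.0108, 0.0080],
}

uncertainty = {h: forecast_uncertainty(f) for h, f in forecasts_by_horizon.items()}

for horizon, spread in uncertainty.items():
    print(f"h={horizon}: uncertainty={spread:.4f}")
```

In these made-up numbers the model forecasts diverge as the horizon lengthens, so the uncertainty measure grows with the horizon.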

12 The confidence interval for each mortality model is $\bar{R}_m \pm \mathrm{CD}$, where $\bar{R}_m$ is the mean rank of each model and CD is the critical difference. We provide precise details in Section A.3 of Kessy et al. (2021).

Additional information

Funding

This research is funded by the Australian Research Council Centre of Excellence in Population Ageing Research (CEPAR) project number CE110001029.
