Abstract
Many alternative approaches for selecting mortality models and forecasting mortality have been proposed. The usual practice is to base forecasts on a single mortality model selected using in-sample goodness-of-fit measures. However, cross-validation measures are increasingly being used in model selection, and model combination methods are becoming a common alternative to using a single mortality model. We propose and assess a stacked regression ensemble that optimally combines different mortality models to reduce out-of-sample mean squared errors and mitigate model selection risk. Stacked regression uses a meta-learner to approximate horizon-specific weights by minimizing a cross-validation criterion for each forecasting horizon. The horizon-specific weights determine a mortality model combination customized to each horizon. We use 44 populations from the Human Mortality Database to compare the stacked regression ensemble with alternative methods. We show that, using one-year-ahead to 15-year-ahead out-of-sample mean squared errors, the stacked regression ensemble improves mortality forecast accuracy by 13% - 49% for males and 19% - 90% for females over individual mortality models. The stacked regression ensembles also have better predictive accuracy than other model combination methods, including Simple Model Averaging, Bayesian Model Averaging, and Model Confidence Set. We provide an R package, CoMoMo, that combines forecasts for Generalized-Age-Period-Cohort models.
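The weight estimation described in the abstract can be sketched as follows. This is a minimal illustration, not the CoMoMo implementation: the toy series and the two hypothetical component models are invented for the example, and non-negative least squares stands in for the paper's cross-validation-minimizing meta-learner at a single horizon.

```python
import numpy as np
from scipy.optimize import nnls

rng = np.random.default_rng(0)

# toy target: a log-mortality series observed over 40 years
y = np.linspace(-4.0, -3.0, 40) + rng.normal(0, 0.02, 40)

# cross-validated forecasts from two hypothetical mortality models
preds = np.column_stack([y + rng.normal(0, 0.05, 40),   # model 1
                         y + rng.normal(0, 0.03, 40)])  # model 2

# meta-learner: regress held-out outcomes on the models'
# cross-validated forecasts under non-negativity constraints
w, _ = nnls(preds, y)
w /= w.sum()           # normalize to a convex combination
combined = preds @ w   # stacked (horizon-specific) forecast
```

In the paper the weights are re-estimated for each forecasting horizon, so the combination can shift toward the models that forecast that horizon best.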
Disclosure statement
No potential conflict of interest was reported by the author(s).
Notes
1 Mortality rates on the log scale reduce the impact of differences at older ages on the model comparison, since the higher mortality variability at older ages can distort the comparison. The log transform treats small differences between small observed and predicted mortality rates approximately the same as large differences between large observed and predicted mortality rates. As a result, the log transform is appropriate for the relative ranking of the models by MSE; when mortality rates are used directly, the models may be separated by only a small difference in MSE.
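The effect of the log transform can be seen with two made-up rates: a 10% relative error contributes vastly more squared error at an older age on the raw scale, but identical squared error on the log scale.

```python
import numpy as np

obs  = np.array([0.0005, 0.05])   # young-age and old-age mortality rates
pred = obs * 1.10                 # both rates over-predicted by 10%

# raw-scale squared errors are dominated by the older age
raw_se = (obs - pred) ** 2

# log-scale squared errors are identical, because a 10% relative
# error is the same size everywhere on the log scale
log_se = (np.log(obs) - np.log(pred)) ** 2
```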
2 An alternative imputation technique is the Kalman smoother. This also fits a random walk with drift, but imputes missing values by forecasting from the left and from the right and then applying a smoothing algorithm (Moritz 2018). Given the simplicity of the random walk, this effectively interpolates linearly across the missing region. While valid, we believe this technique is inappropriate in our context because it provides an artificially good fit.
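The note's point, that smoothing a random walk with drift across a gap collapses to linear interpolation, can be demonstrated directly; the helper below is an illustrative stand-in, not the imputation code used in the paper.

```python
import numpy as np

def interpolate_gap(x):
    """Fill interior NaNs linearly, mimicking the net effect of the
    Kalman-smoothed random walk with drift described in the note."""
    x = np.asarray(x, dtype=float)
    idx = np.arange(x.size)
    obs = ~np.isnan(x)
    return np.interp(idx, idx[obs], x[obs])

# a two-year gap is filled on the straight line between its endpoints
filled = interpolate_gap([10.0, np.nan, np.nan, 16.0])
```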
3 A slight difference in combined mortality rates may arise when we combine the mortality rates directly instead of using Equation (10).
4 Equal predictive ability implies that two models are equally good under a given loss function.
5 We assume that the statistical properties of the loss differential d_t do not change over time. This implies that the first moment of d_t is constant, that is, E[d_t] = μ for all t, and that the second moment of d_t is finite, that is, E[d_t²] &lt; ∞. Therefore, d_t being weakly stationary makes it possible to determine the best mortality model(s) from the initial collection of models (Hansen et al. 2011).
6 In this study, the short-term horizon corresponds to a period of one to five years, the medium-term horizon to a period of six to 10 years, and the long-term horizon to a period of 11 to 15 years.
7 For the combination methods that require a validation set (to estimate the weights or, in the case of MCSV, to select the superior models), we use data from 1960 to 1977 for training and data from 1978 to 1990 for estimating the weights or selecting the superior models.
8 We view cross-validation as a particular case of the stacked regression ensemble in which a single model is selected. A stacked regression ensemble relaxes the assumption that one model must be chosen and used to predict mortality rates: we choose multiple mortality models and optimally combine them to maximize out-of-sample accuracy in one step by minimizing the cross-validation criterion (Sridhar et al. 1996). In other model combinations, such as standard AIC-based model averaging, there is no optimization criterion that selects the models and optimally assigns their weights in one step. Instead, we must fit each model to the data, measure its Akaike Information Criterion, and then use it to calculate the weight for each model independently. Thus, selecting the models and estimating the weights are not done in one step.
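The two-step nature of AIC-based averaging can be made concrete: each model's weight follows from its own AIC alone, with no joint optimization over the combination. The AIC values below are hypothetical.

```python
import numpy as np

# hypothetical AIC values for three fitted mortality models
aic = np.array([1000.0, 1002.0, 1010.0])

# standard Akaike weights: each weight depends only on that model's
# AIC difference from the best model, computed independently of how
# the models' forecast errors relate to each other
delta = aic - aic.min()
w_aic = np.exp(-0.5 * delta)
w_aic /= w_aic.sum()
```

Contrast this with the stacked ensemble, where the weights are the solution of a single cross-validation minimization over all models jointly.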
9 BMA methods assign weights to each model independently, without accounting for how the models differ from each other; they do not incorporate diversity among the models in the weights. Conversely, stacked regression ensembles estimate the weights for all models jointly using a meta-learner such as a lasso regression. As a result, models that produce similar mortality rate forecasts, or that have the least forecasting accuracy at a given forecasting horizon, receive small or zero weights in the presence of alternative, more accurate, and diverse models.
10 The MCS-based methods select a single model as the only superior model for males; hence, it is not combined with other models, and their combined forecasts coincide with that model's forecasts. For females, a single superior model is likewise selected.
11 Uncertainty is the difference between the highest and lowest mortality rate forecasts at any given forecasting horizon (Graefe et al. 2014).
12 The confidence interval for each mortality model is r̄ ± CD/2, where r̄ is the mean rank of the model and CD is the critical difference. We provide precise details in Section A.3 of Kessy et al. (2021).
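A short sketch of the critical difference used in such rank comparisons, following the standard Nemenyi construction; the critical value 2.569 (four models, 5% level) and the choice of k = 4 are illustrative assumptions, not values taken from the paper.

```python
import math

def critical_difference(k, n, q_alpha=2.569):
    """Nemenyi critical difference for k models compared on n
    datasets; q_alpha is the Studentized-range-based critical
    value (2.569 corresponds to k = 4 at the 5% level)."""
    return q_alpha * math.sqrt(k * (k + 1) / (12.0 * n))

# e.g. four models compared across the 44 HMD populations
cd = critical_difference(k=4, n=44)
```

Two models whose mean ranks differ by less than CD (equivalently, whose r̄ ± CD/2 intervals overlap) are not significantly different.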