Original Articles

Forecasting mortality rates with the penalized exponential smoothing state space model

Pages 955-968 | Received 20 Jul 2020, Accepted 13 Feb 2021, Published online: 19 Mar 2021
 

Abstract

Accurate forecasts of mortality rates are essential to demographic research such as population projection, and to the pricing of insurance products such as pensions and annuities. Recent studies suggest that mortality rates across multiple ages are usually not leading indicators in mortality forecasting. Therefore, multivariate stochastic mortality models, including the classic Lee–Carter model, may not necessarily produce more accurate forecasts than sophisticated univariate counterparts such as the exponential smoothing state space (ETS) model. Despite its improved forecasting accuracy, the original ETS model cannot ensure the age-coherence of forecast mortality rates. By introducing an effective penalty scheme, we propose a penalized ETS model that largely overcomes this problem, and we discuss related technical issues including the reduction of parameter dimensionality and the selection of the tuning parameter. Empirical results based on mortality rates of Australian males and females suggest that the proposed model consistently outperforms the Lee–Carter and original ETS models. The conclusions are robust across various forecasting scenarios. Long-term forecasts up to 2050 are further compared across the three models. To illustrate its usefulness in practice, we demonstrate an application of the penalized ETS model to pricing fixed-term annuities.

Acknowledgements

We are grateful to Macquarie University for its support. We would also like to thank Rob Hyndman, Jackie Li, Han Li, James Raymer and Chong It Tan for their helpful comments and suggestions. We particularly thank the Editor, Associate Editor and two anonymous referees for providing valuable and insightful comments on earlier drafts. The usual disclaimer applies.

Disclosure statement

No potential conflict of interest was reported by the authors.

Notes

1 Among them, 11 cases are unstable and not preferred in practice (Hyndman et al., 2008).

2 For the trend component, an additional “damped” type (Gardner & McKenzie, 1985) is also considered for the additive and multiplicative cases. Altogether, there are five scenarios for the trend: none, additive, additive damped, multiplicative and multiplicative damped.
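
As a minimal sketch of what damping does in the additive case, the snippet below computes point forecasts from the standard additive damped trend recursion, in which the h-step forecast is the last level plus (φ + φ² + … + φ^h) times the last slope. The function name, argument names and numerical values are illustrative assumptions and are not taken from the article; setting φ = 1 recovers the undamped additive trend.

```python
def damped_trend_forecast(level, slope, phi, horizon):
    """Point forecasts for an additive trend with damping parameter phi.

    level, slope: last estimated level and slope (hypothetical inputs).
    phi: damping parameter in (0, 1]; phi = 1 means no damping.
    horizon: number of steps ahead to forecast.
    """
    if not 0.0 < phi <= 1.0:
        raise ValueError("phi must lie in (0, 1]")
    forecasts = []
    damp_sum = 0.0
    for h in range(1, horizon + 1):
        damp_sum += phi ** h  # running sum phi + phi^2 + ... + phi^h
        forecasts.append(level + damp_sum * slope)
    return forecasts


# Illustrative values only (think of a log mortality rate declining over time).
undamped = damped_trend_forecast(level=-5.0, slope=-0.02, phi=1.0, horizon=3)
damped = damped_trend_forecast(level=-5.0, slope=-0.02, phi=0.9, horizon=3)
```

With φ below 1 the forecast path flattens as the horizon grows, which is why the damped variants are treated as separate trend scenarios.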

3 Unlike H. Li and Lu (2017), we do not penalize α_x and β_x. One reason is that those parameters will be smoothed after applying the procedure described in Section 3.1. The other reason is that out-of-sample forecasts of ln m_{x,T+h} do not directly depend on them. In other words, smoothed α_x and β_x will not necessarily enforce the smoothness of b_{x,T} across x.

4 Our full sample covers 1950–2016. As will be discussed, we use a 10-year test sample (2007–2016) to evaluate the forecasting performance, and the remainder (1950–2006) serves as the training sample. Hence, the in-sample fitted parameters discussed in this section are based on the 1950–2006 training sample.
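
The split described in this note can be sketched as follows. The series is synthetic placeholder data rather than the Australian mortality rates, and the helper names and the naive "last value carried forward" benchmark are assumptions made purely for illustration; they are not the models compared in the article.

```python
import math


def split_by_year(series, last_training_year):
    """Split a {year: value} mapping into training and test parts."""
    train = {y: v for y, v in sorted(series.items()) if y <= last_training_year}
    test = {y: v for y, v in sorted(series.items()) if y > last_training_year}
    return train, test


def rmse(actual, forecast):
    """Root mean squared error between two equal-length sequences."""
    errors = [(a - f) ** 2 for a, f in zip(actual, forecast)]
    return math.sqrt(sum(errors) / len(errors))


# Synthetic log mortality rates for 1950-2016 (placeholder, not real data).
series = {year: -5.0 - 0.02 * (year - 1950) for year in range(1950, 2017)}

train, test = split_by_year(series, last_training_year=2006)

# Naive benchmark: carry the last training value forward over the test years.
last_value = train[max(train)]
naive_forecast = [last_value] * len(test)
naive_rmse = rmse(list(test.values()), naive_forecast)
```

Here the training part holds the 57 years 1950–2006 and the test part the 10 years 2007–2016, matching the sample sizes stated above.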

5 The principle here is to balance parsimony and accuracy in describing the structures of α_x and β_x. However, the optimal structures may change significantly when a penalty term is imposed. Thus, high-level precision in replicating the patterns of α_x and β_x from the unpenalized ETS model is not the focus. Therefore, we use an R² of 50% as a basic criterion to perform the selection. Robust results (available upon request) are produced when other choices including 30%, 60% and 90% are used. In one of the robustness checks (available in the supplemental online material), we demonstrate that our selection yields results similar to those of the model without dimensionality reduction. A systematic study of the optimal criteria for such a reduction remains for future research.

6 The optimization can be performed with any standard numerical algorithm such as BFGS. In this paper, we adopt an effective and fast algorithm discussed in Ye (1987) to conduct the minimization, which is implemented in the Rsolnp package of the statistical software R.

7 The smoothed rates are produced using weighted penalized regression splines with a monotonicity constraint. This is a standard method implemented in the demography package of the statistical software R.

8 To check the sensitivity of the results to the chosen forecasting horizon, we also consider a longer horizon of 30 steps. The results are robust and can be found in section 2 of the online supplementary materials.

9 Comparing Figures 1 and 3, there are some marginal differences in α_x and β_x between males and females. The main trends, however, are similar across sexes. This is consistent with our selected values of n_α and n_β.
