Feature Articles

Short- and Long-Term Dynamics of Cause-Specific Mortality Rates Using Cointegration Analysis

Abstract

This article applies cointegration analysis and vector error correction models to model the short- and long-run relationships between cause-specific mortality rates. We work with the data from five developed countries (the United States, Japan, France, England and Wales, and Australia) and split the mortality rates into five main causes of death (infectious and parasitic, cancer, circulatory diseases, respiratory diseases, and external causes). We successively adopt short- and long-term perspectives, and analyze how each cause-specific mortality rate impacts and reacts to the shocks received from the rest of the causes. We observe that the cause-specific mortality rates are closely linked to each other, apart from the external causes that show an entirely independent behavior and hence could be considered as truly exogenous. We summarize our findings with the aim to help practitioners set more informed assumptions concerning the future development of mortality.

1. INTRODUCTION

It is commonly known that mortality rates have been decreasing for many decades. Although a welcome development in itself, these changes pose serious problems for insurance companies, pension funds, and social security schemes, because they need to know whether the observed decline will continue, slow down, or speed up. In this work, we will not venture to forecast the prospective evolution of mortality rates but instead provide new insights into past developments. We believe that once we better understand the past, we will be able to make better prognoses about the future.

Numerous parametric models have been developed in order to take into account characteristics of mortality rate development such as age, year of birth, and rate of improvement. For a review thereof we direct the interested reader to Booth and Tickle (2008), Cairns (2013), and Debón, Montes, and Sala (2006) and references therein. For our part, we want to gain additional insight into the past development of mortality rates by concentrating on a more detailed breakdown of mortality data, namely, by causes of death. Indeed, a simple visual inspection of the cause-specific mortality rates makes it clear that these rates showed strikingly divergent trends over the last 50 to 60 years. These phenomena have already been extensively studied and described (e.g., Himes 1994; Horiuchi and Wilmoth 1997; Costa 2005; Cutler, Deaton, and Lleras-Muney 2006).

However, it is much more difficult to integrate cause-specific mortality rates into a model, because they are dependent, and this dependence is, strictly speaking, not observable. Indeed, given a death event at a young age from an accident, for example, it is impossible to say what the chances of this person would be to die later from cancer or any other cause, had he or she remained alive. Among theories and methods trying to take into account the dependency structure between the cause-specific mortality rates one can cite models incorporating individual risk factors (e.g., Manton and Poss 1979; Manton, Stallard, and Trolley 1991), models employing multiple cause-of-death data (e.g., Manton and Myers 1987; Mackenbach et al. 1999), and, more recently, copulas (e.g., Lo and Wilke 2010; Dimitrova, Haberman, and Kaishev 2013).

Although possible theoretically, models that take into account the dependency between the causes of death are problematic to use in practice, because they require a significant amount of additional data that are not readily available. For this reason, the most widely used approach is still based on the assumption of independence between the causes of death that was developed more than 50 years ago (Chiang 1968). In this study, we want to look at the connections between the causes from a different angle. Without trying to describe exactly the dependency structure between the rates of death, we propose an approach based on cointegration analysis that complements the methods and practices mentioned above. In a nutshell, two nonstationary time series are said to be cointegrated if there exists a linear combination of them that is stationary. Consequently, these time series are linked to each other in the long run and are subject to common stochastic trends. Cointegration analysis thus provides new insights into how cause-specific mortality rates depend on each other and interact in the long run.

Cointegration analysis was first introduced in the seminal paper of Engle and Granger (1987) and received a lot of attention from researchers in the years that followed. Numerous tests allowing one to check for the existence of cointegrated relations between the time variables were developed, and those conceived by Søren Johansen (1988) are among the most widely used. Cointegration analysis and the vector error-correction models (VECMs) based on it quickly became popular in the field of econometrics because they permitted establishing the long-run relationships between variables such as interest rates, consumption, income, etc. (e.g., Baillie and Selover 1987; Clarida 1992; Johansen and Juselius 1992).

To the best of our knowledge, cointegration analysis was first applied to the cause-specific mortality rates in Arnold and Sherris (2013, 2015, 2016). We want to go further and extend the analysis by applying a wider range of cointegration and VECM tools to the cause-of-death mortality rates. We aim to identify new relationships and development patterns that were not covered by the abovementioned authors.

Namely, we want to understand the way the cause-specific mortality rates interact with each other. Using the additional tools offered by the VECMs, we study the short- and long-term impacts that a change in a particular death rate produces in other cause-specific mortality rates. Because we do not have prior knowledge about the precise way the cause-specific mortality rates interact, our study is exploratory in nature and provides new insight by observing the historical data from the perspective of cointegration analysis. Indeed, there are multiple ways in which the cause-specific mortality rates can impact each other. On the one hand, being subject to the same trends (e.g., improving health care systems, changes in nutrition and lifestyle), the cause-specific mortality rates can show similar responses and thus be positively correlated. On the other hand, because of the presence of competing risks, a reduction in one cause-specific mortality rate will necessarily be compensated, at least partially, by an increase in other causes, and thus the cause-specific mortality rates will show negative correlation. In the absence of a theoretical model for the relations between the causes, we think that the data analysis can reveal the net effect of such interactions.

At the same time, once a certain pattern is revealed in one country, it is impossible to say whether this pattern is a reflection of that country’s particularities or corresponds to some more fundamental processes and hence can be generalized to other countries and datasets. For this reason, we start with the gender-specific statistics of deaths by cause from five highly populated countries with similar socioeconomic characteristics and available observation periods (the United States, Japan, France, England and Wales, and Australia). Thanks to this approach, general common patterns are revealed in regard to the interaction existing between the causes of death. At a later point, our analysis could be extended to include other countries as well.

We see multiple ways in which our findings could be used in practice. First, the general patterns revealed by our approach can serve as a theoretical point of comparison for epidemiological studies on the joint development of cause-specific mortality rates due to particular factors, for example, air pollution impacting not only respiratory but also circulatory mortality rates (Zmirou et al. 1998); sedentary behavior impacting both circulatory and cancer mortality rates (Matthews et al. 2012); body mass index having contrasting effects on circulatory and respiratory mortality rates (Breeze et al. 2006); influenza vaccinations reducing all cause-specific mortality rates (Wang et al. 2007); and heat waves impacting several cause-specific mortality rates at once (Rey et al. 2007; Basagaña et al. 2011). In a similar way, the results of comprehensive assessments of cause-specific mortality rates, such as the Global Burden of Disease Study (GBD 2013 Mortality and Causes of Death Collaborators 2013), can be compared with those delivered by our model.

As previously mentioned, copula-based models are capable of taking into account the dependence between the cause-specific mortality rates. At the same time, copulas are, strictly speaking, not identifiable (Tsiatis 1975). For this reason, research articles usually present several copulas and experiment with different parameter values, because these choices can have a tremendous impact on the projection results (Dimitrova, Haberman, and Kaishev 2013; Li and Lu 2019). Efforts are made to narrow the set of possible parameters (Li and Lu 2019), and the question of how to estimate the correlations between the causes of death remains open (Dimitrova, Haberman, and Kaishev 2013). Our study provides a new basis that can be used to calibrate copula-based models because it shows explicitly the extent to which cause-specific mortality rates depend on each other.

Additionally, we contribute to the current discussion regarding whether a cause of death should be considered as endogenous or exogenous. In Arnold and Sherris (2016), the authors observed that the results of the cointegration analysis paralleled the classification used by biologists and demographers between the exogenous and endogenous causes of death. Although this classification is not unambiguous, most researchers understand the exogenous causes of death as the diverse external or environmental factors that cause death, whereas the endogenous causes of death correspond to biological forces that lead to death (Carnes et al. 2006; Arnold and Sherris 2016). Because different views exist on this topic (Carnes and Olshansky 1997), we bring the discussion forward by showing that only the external causes can be classified as entirely exogenous, whereas this is not the case for the infectious and parasitic diseases.

We summarize our findings in a comprehensive form with the objective to help practitioners set more informed assumptions when designing scenarios of the possible future evolution of mortality by cause.

The article is organized as follows: In Section 2 we briefly present the data preparation process along with some theoretical notions of the cointegration analysis. Results from the impulse response analysis and short- and long-term dynamics of the cause-specific mortality rates are then presented in Section 3. Section 4 concludes.

2. DATA AND THE COINTEGRATION FRAMEWORK

2.1. Preparing the Data

We obtained the data for the present study from the WHO Mortality Database (World Health Organization [WHO] 2016), which contains the midyear population and the death numbers by country, year, sex, age group, and cause of death as far back as 1950. Five developed countries were chosen for the analysis: the United States, Japan, France, England and Wales, and Australia (US, JP, FR, E&W, and AU, respectively).

To ensure consistency between the countries, the WHO defines the causes of death according to the International Classification of Diseases (ICD). This classification has changed three times since the inception of the database, moving from ICD-7 to ICD-10 to account for advances in medical science and to refine the classification. We split the causes of death under each classification into five main groups: infectious and parasitic diseases (I&P), cancer, diseases of the circulatory system, diseases of the respiratory system, and external causes (Table 1). These groups account for approximately 70% to 80% of deaths in recent years and made up approximately 50% to 70% of deaths at the onset of the observations.

TABLE 1 Five Main Groups of Causes of Death According to Different Versions of the International Classification of Diseases

The data are divided into the following age groups: deaths at 0, 1, 2, 3, and 4 years and then five-year age groups: 5 to 9 years, …, 90 to 94 years, and 95 years and above. Having created two new age groups by grouping together the ages from 1 to 4 as well as 85 and above, we obtained the cause-specific mortality rates by the following transformations:

  1. Grouping the death numbers according to the five causal categories.

  2. Distributing the number of deaths at unspecified ages proportionally among known age groups.

  3. Calculating simple mortality rates as the number of deaths by age, sex, and cause of death divided by the midyear population by age and sex: $m_{x,t,d,s,c} = d_{x,t,d,s,c} / l_{x,t,s,c}$, where $d_{x,t,d,s,c}$ is the number of deaths at age $x$, in year $t$, for cause of death $d$, gender $s$, and country $c$; $l_{x,t,s,c}$ is the midyear population at age $x$, in year $t$, for gender $s$ and country $c$; and $m_{x,t,d,s,c}$ is the central death rate at age $x$, in year $t$, for cause of death $d$, gender $s$, and country $c$.

  4. Applying the comparability ratios to ensure that the observations under the different versions of the ICD are comparable. A comparability ratio is defined in such a way that the average of the mortality rates over the last 2 years of a classification coincides with the average of the mortality rates over the first 2 years of the next classification. For the whole period under observation, the mortality rates in a new classification are divided by the comparability ratios linking this classification with the previous one(s). In this way, we can smooth the mortality rates across the classifications and remove the discontinuities.

  5. Calculating the age-standardized central death rates, the standard population being equal to (1) the U.S. male population in 2007 and (2) the Japanese female population in 2009. In this manner, we ensure that the age structure of the population is the same for all countries and does not change over time. By using one relatively young (United States) and one relatively old (Japan) reference population, we can analyze whether the population age structure has an impact on the behavior of the cause-specific mortality rates. In total, we obtain 20 datasets: five countries, two genders, and two population structures.

    The age-standardized death rate $m^{US}_{t,d,s,c}$ in year $t$ for cause $d$, gender $s$, and country $c$, assuming that the population age structure is constant over the whole observation period and equal to the age structure of the U.S. male population in 2007, is calculated as follows: $m^{US}_{t,d,s,c} = d^{US}_{t,d,s,c} / l_{2007,\text{males},\text{USA}}$, where $d^{US}_{t,d,s,c} = \sum_x m_{x,t,d,s,c} \times l_{x,2007,\text{males},\text{USA}}$.

    The age-standardized death rate $m^{JP}_{t,d,s,c}$ in year $t$ for cause $d$, gender $s$, and country $c$, assuming that the population age structure is constant over the whole observation period and equal to the age structure of the Japanese female population in 2009, is calculated as follows: $m^{JP}_{t,d,s,c} = d^{JP}_{t,d,s,c} / l_{2009,\text{females},\text{JP}}$, where $d^{JP}_{t,d,s,c} = \sum_x m_{x,t,d,s,c} \times l_{x,2009,\text{females},\text{JP}}$.

    Age-standardized death rates for selected years using the U.S. male population base are shown in the Appendix.

  6. Taking the natural logarithm of the death rates. Hereafter we will work with the vector of time series $y_t$ for each gender $s$, country $c$, and population age structure $p \in \{US, JP\}$: $y^{p}_{t,s,c} = \left(\log m^{p}_{t,\text{I\&P},s,c},\ \log m^{p}_{t,\text{Cancer},s,c},\ \log m^{p}_{t,\text{Circulatory},s,c},\ \log m^{p}_{t,\text{Respiratory},s,c},\ \log m^{p}_{t,\text{External},s,c}\right)^T$.

    To ease notation, we will sometimes omit the indexes $c$, $s$, and $p$ and work with a vector of mortality rates $y_t = (y_{1t}, y_{2t}, y_{3t}, y_{4t}, y_{5t})^T$, keeping in mind that a separate VECM equation is formulated for each country, sex, and population age structure.
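The transformations in steps 3 to 6 can be sketched in a few lines of Python. This is only an illustrative sketch under stated assumptions: all numbers are hypothetical toy values (three age groups instead of the full age range), the function names are ours, and the denominator of the standardized rate is read as the total standard population.

```python
import math

def central_death_rate(deaths, population):
    """Step 3: m = d / l for one age group, year, cause, sex, and country."""
    return deaths / population

def comparability_ratio(old_rates_last2, new_rates_first2):
    """Step 4: mean rate over the first two years of the new ICD version
    divided by the mean rate over the last two years of the old version."""
    old_mean = sum(old_rates_last2) / len(old_rates_last2)
    new_mean = sum(new_rates_first2) / len(new_rates_first2)
    return new_mean / old_mean

def age_standardized_rate(rates_by_age, standard_population):
    """Step 5: deaths expected in a fixed standard population, divided by
    the total size of that standard population (our reading of the text)."""
    expected_deaths = sum(m * l for m, l in zip(rates_by_age, standard_population))
    return expected_deaths / sum(standard_population)

# Hypothetical toy inputs: three age groups, one cause, one year.
deaths = [10.0, 40.0, 300.0]
population = [100_000.0, 80_000.0, 20_000.0]
standard_population = [50_000.0, 30_000.0, 20_000.0]

rates = [central_death_rate(d, l) for d, l in zip(deaths, population)]
standardized = age_standardized_rate(rates, standard_population)
log_rate = math.log(standardized)  # Step 6: natural logarithm

# Hypothetical rates around a change of ICD version.
ratio = comparability_ratio([0.0040, 0.0042], [0.0050, 0.0052])
adjusted_new_rate = 0.0051 / ratio  # rates under the new version are divided by the ratio
```

Dividing by the ratio brings the level of the new classification back to that of the old one, so that the two-year averages on both sides of the change coincide.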

We thus use the same database as in Arnold and Sherris (2016) except for the additional years of observations that we added whenever this was possible (Table 2).

When we started the current study, the WHO database provided the information on the mid-year population for the United States only until 2007, and for unknown reasons the data on Australian numbers of deaths for 2005 were also missing. As a consequence, we were obliged to limit the time series for these two countries to years 2007 and 2004, respectively.

As we will see in the following sections, the conclusions stated in Arnold and Sherris (2016) were reconfirmed using the longer time series for Japan, France, and England and Wales.

TABLE 2 Observation Periods by Country

2.2. Cointegration Analysis in Application to the Cause-Specific Mortality Rates

As already mentioned, the causes of death are not independent. Cointegration analysis is a tool that can help to better understand and model the dependence between the cause-specific mortality rates. As introduced in Engle and Granger (1987), the time series $y_t$ consisting of the $n$ nonstationary elements $\{y_{it}\}$, for $i = 1, \ldots, n$, is said to be cointegrated with a cointegrating vector $\beta$ if the linear combination $\beta' y_t$ is stationary:

$$\beta_1 y_{1t} + \beta_2 y_{2t} + \cdots + \beta_n y_{nt} = z_t, \tag{1}$$

where $z_t$ is a stationary variable of stochastic deviations. In other words, while being nonstationary themselves, the cointegrated time series do not drift too far away from each other; that is, there exists a long-run equilibrium relationship between them. In addition, there may be more than one cointegrating vector, so that $\beta$ becomes a matrix. The variables are then linked to each other by several cointegration relations (CRs), and each relation is linearly independent from the others.
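The definition can be illustrated with a small simulation. The sketch below is not based on mortality data: it builds two hypothetical series that share one random-walk trend, so that each series wanders without bound while a suitable linear combination of them stays close to zero. The loading of 2 on the second series and the noise levels are arbitrary choices made for illustration.

```python
import random

random.seed(0)

T = 2000
trend = 0.0  # common stochastic trend (a random walk)
y1, y2 = [], []
for _ in range(T):
    trend += random.gauss(0.0, 1.0)
    y1.append(trend + random.gauss(0.0, 0.5))        # loads on the trend with weight 1
    y2.append(2.0 * trend + random.gauss(0.0, 0.5))  # loads on the trend with weight 2

def variance(xs):
    """Population variance of a list of numbers."""
    mean = sum(xs) / len(xs)
    return sum((x - mean) ** 2 for x in xs) / len(xs)

# Candidate cointegrating vector beta = (2, -1): the trend cancels out,
# leaving z_t = 2*y1_t - y2_t as a combination of the two noise terms only.
z = [2.0 * a - b for a, b in zip(y1, y2)]

print("variance of y1:", variance(y1))
print("variance of y2:", variance(y2))
print("variance of z :", variance(z))
```

The variance of the two levels grows with the length of the sample, whereas the variance of the combination remains of the order of the noise variance, which is the behavior Equation (1) describes.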

In Arnold and Sherris (2015, 2016) the time series of all cause-specific mortality rates were found to be nonstationary and to have stochastic trends. It was also shown that at least one cointegrating relation existed between the causes of death in each country.

Multivariate dynamic systems of nonstationary but cointegrated variables can then be modeled using a VECM, an extension of the vector autoregression (VAR) models, which includes not only the time dependency between the variables up to a lag $p - 1$ but also the long-run equilibrium relations between them:

$$\Delta y_t = c + dt + \Gamma_1 \Delta y_{t-1} + \Gamma_2 \Delta y_{t-2} + \cdots + \Gamma_{p-1} \Delta y_{t-p+1} + \Pi y_{t-1} + \epsilon_t, \tag{2}$$

where $\Delta y_t = y_t - y_{t-1}$ denotes the first differences of the time series, $c$ and $d$ are $(n \times 1)$ vectors of constants, $\Gamma_i$ is an $(n \times n)$ matrix of autoregressive coefficients for $i = 1, 2, \ldots, p-1$, and $\Pi y_{t-1}$ represents the cointegration term. The latter provides the model with the information on the long-run equilibrium between the variables that would otherwise be lost if a VAR model were applied to the differenced variables. The rank of the matrix $\Pi$ corresponds to the number of cointegration relations. [Footnote 1]

Because the first differences of the cause-specific mortality rates are stationary [Footnote 2] (as verified in Arnold and Sherris 2016), Equation (2) will only hold if the term $\Pi y_{t-1}$ is also stationary; that is, if the variables are cointegrated. Then the $(n \times 1)$ vector $\epsilon_t$ is a vector of white noise terms [Footnote 3], with

$$E(\epsilon_t) = 0, \tag{3}$$

$$E(\epsilon_t \epsilon_l') = \begin{cases} \Omega & \text{for } t = l, \\ 0 & \text{for } t \neq l, \end{cases} \tag{4}$$

where $\Omega$ is a symmetric positive definite matrix. More details on the VECM and VAR models can be found in Hamilton (1994) and Lütkepohl (2005).

The number of cointegrating relations, if any, can then be found using the trace and the maximum eigenvalue tests developed by Johansen (1995). The Johansen approach also allows finding the matrix $\Pi$ as

$$\Pi = \alpha \beta', \tag{5}$$

where $\beta$ is an $(n \times r)$ matrix containing $r$ vectors, each representing a CR, and $\alpha$ is an $(n \times r)$ loading matrix that indicates how a particular variable is impacted by the CR. Under the Johansen approach, we can also test for the form of the deterministic elements. Let $\mu_t = c + dt$ denote the deterministic part of the model and let $d = \alpha\rho + \alpha_{\perp}\gamma$, where $\alpha'\alpha_{\perp} = 0$. Because the mortality rates are known to have a trend, we will consider the following forms of the deterministic elements (Johansen 1995):

  • NT: no trend in the VECM but a linear trend in the levels of the variables: $c \neq 0$, $\rho = 0$, $\gamma = 0$, hence $d = 0$;

  • TC: linear trend in the CR combined with a linear trend in the levels of the variables (i.e., no linear trend in the differenced variables): $c \neq 0$, $\rho \neq 0$, $\gamma = 0$, hence $d = \alpha\rho$;

  • QT: linear trend in the differenced variables, thus a quadratic trend in the levels of the variables (i.e., the VAR model): $c \neq 0$, $\rho \neq 0$, $\gamma \neq 0$, hence $d = \alpha\rho + \alpha_{\perp}\gamma$.

In the tables that follow we will refer to the abbreviations NT, TC, and QT when describing the form of the deterministic elements chosen for a particular dataset.

Once the coefficients of the VECM (Equation 2) are estimated, they allow us to assess the short- and long-term dynamics of the system. Indeed, the coefficients of the $\Gamma_i$ matrices indicate whether and to what extent the cause-specific mortality rates interact in the short run. On the other hand, the analysis of the coefficients of the matrices $\alpha$ and $\beta$ provides us with the information on the long-term relationships in the system.

In particular, the Johansen approach can be used to test whether every coefficient in the CR (i.e., in the matrix $\beta$) is significantly different from zero. If this is not the case, we can conclude that a particular variable does not participate in the long-run equilibrium. In Arnold and Sherris (2016) it was found that in all countries and at least for one of the sexes the pair of mortality rates corresponding to I&P diseases and external causes did not appear significantly in the long-run equilibrium. The cointegration analysis hence showed that the long-term equilibrium relationship existed only between the mortality rates that could be classified as endogenous causes of death (cancer, circulatory, and respiratory diseases), and exogenous causes (I&P diseases, external causes) were excluded. Interestingly, this result coincides with the distinction used by biologists and demographers between the exogenous and endogenous causes of death. In this article, we will complement this study by analyzing first the short-term component and, second, the matrix $\alpha$; that is, the impact that the CR has on a particular mortality rate.
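To make the respective roles of α and β in Equation (5) concrete, it may help to write out the cointegration term for a hypothetical system with only two variables (n = 2) and a single CR (r = 1). The two-variable setting is chosen purely for illustration and is not one of the models estimated in this article.

```latex
\Pi y_{t-1} = \alpha \beta' y_{t-1}
= \begin{pmatrix} \alpha_1 \\ \alpha_2 \end{pmatrix}
  \begin{pmatrix} \beta_1 & \beta_2 \end{pmatrix}
  \begin{pmatrix} y_{1,t-1} \\ y_{2,t-1} \end{pmatrix}
= \begin{pmatrix} \alpha_1 z_{t-1} \\ \alpha_2 z_{t-1} \end{pmatrix},
\qquad
z_{t-1} = \beta_1 y_{1,t-1} + \beta_2 y_{2,t-1}.
```

Here β defines the deviation from the long-run equilibrium, and each α_i scales how strongly variable i adjusts to that deviation. A zero coefficient in β means that the corresponding variable does not enter the equilibrium relation, whereas a zero α_i means that variable i does not react to deviations from it.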

2.3. Introducing the Lag of 2 to the VECM Setup

A usual step when working with a VECM setup is to define the lag order to be used in the VECM or the corresponding VAR model. Although in Arnold and Sherris (2016) the VAR models with a lag order of one were indicated as optimal for some of the datasets using Akaike's information criterion, the Hannan-Quinn criterion, the Schwarz criterion, and the final prediction error [Footnote 4], the corresponding model cannot be used to answer our research question. Indeed, in this case the VECM equation has no lagged differences; it consists only of the CR, the errors, and any deterministic terms, and it implies that there is no connection between the first differences of the cause-specific mortality rates in the short run:

$$\Delta y_t = c + dt + \Pi y_{t-1} + \epsilon_t. \tag{6}$$

Hence, in our case we need a VAR model with a lag order of at least two in order to have a full range of parameters in the VECM. Therefore, as a preliminary step, we decide to allow for the presence of the $\Gamma_1 \Delta y_{t-1}$ term on the right-hand side of the equation. This gives us the possibility to study the relative importance as well as the significance of the coefficients of the corresponding parameter matrix $\Gamma_1$. Should some of the coefficients of $\Gamma_1$ turn out to be significantly different from 0, we will be able to analyze the short-run adjustments of the cause-specific mortality rates.
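The link between the lag order of the VAR in levels and the presence of the short-run term can be checked numerically. For a VAR(2) written as y_t = A1 y_{t-1} + A2 y_{t-2} + ε_t, the standard algebra gives Π = A1 + A2 − I and Γ1 = −A2. The sketch below uses hypothetical two-variable coefficient matrices, omits the deterministic terms for brevity, and verifies that both representations give the same one-step prediction.

```python
def mat_vec(m, v):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(a * b for a, b in zip(row, v)) for row in m]

def var2_to_vecm(a1, a2):
    """Return (Pi, Gamma1) for y_t = A1 y_{t-1} + A2 y_{t-2} + e_t."""
    n = len(a1)
    pi = [[a1[i][j] + a2[i][j] - (1.0 if i == j else 0.0) for j in range(n)]
          for i in range(n)]
    gamma1 = [[-a2[i][j] for j in range(n)] for i in range(n)]
    return pi, gamma1

# Hypothetical coefficients for a two-variable system.
A1 = [[0.6, 0.2], [0.1, 0.7]]
A2 = [[0.3, -0.1], [0.0, 0.2]]
pi, gamma1 = var2_to_vecm(A1, A2)

y_lag2 = [1.0, 2.0]  # y_{t-2}
y_lag1 = [1.5, 1.8]  # y_{t-1}

# One-step prediction from the VAR(2) in levels.
level_form = [a + b for a, b in zip(mat_vec(A1, y_lag1), mat_vec(A2, y_lag2))]

# Same prediction from the VECM: y_t = y_{t-1} + Pi y_{t-1} + Gamma1 (y_{t-1} - y_{t-2}).
dy_lag1 = [a - b for a, b in zip(y_lag1, y_lag2)]
vecm_form = [y + p + g for y, p, g in
             zip(y_lag1, mat_vec(pi, y_lag1), mat_vec(gamma1, dy_lag1))]

# With a VAR(1), A2 is zero, so Gamma1 vanishes and only Pi y_{t-1} remains.
_, gamma1_var1 = var2_to_vecm(A1, [[0.0, 0.0], [0.0, 0.0]])
```

The last line reproduces the situation of Equation (6): with a lag order of one there is no Γ1 matrix left to analyze.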

The models that were chosen as best describing the datasets in Arnold and Sherris (2016) include VAR(2) models for some of the countries. To be able to carry out the full analysis of the short-run adjustments, we check whether, for every dataset, we can find models with a lag order of two that suitably describe the data.

First, we apply the Johansen approach to define the number of cointegration relations and the form of the deterministic elements, and then we test the residuals of the fitted VECM. The models suggested by the Johansen approach are shown in the Appendix (second column). These are the models that will be used in the subsequent analysis of the short- and long-term dynamics of the cause-specific mortality rates. Further columns contain the results of the tests on the residuals of the fitted VECM. The overall fit is similar to that of the models proposed in Arnold and Sherris (2016) except for the lower fit for the Japanese datasets. In addition, it was not possible to find any suitable VAR(2) model for the E&W females with the Japanese female population structure, so in the following sections we will use 19 datasets instead of 20.

TABLE 5 Impulse Response Analysis: Response of the Mortality Rate Y to the Shock Given to the Cause X, High-Level Summary across All Countries, Sexes, and Population Structures

For these new models, we also need to check the significance of the $\beta$ matrix coefficients. As we can see in Tables 3 and 4, for 15 out of 19 considered datasets the cause-specific mortality rates corresponding to the I&P and external causes do not appear significant (at a 1% significance level) in the long-term steady states, which confirms the conclusion made in Arnold and Sherris (2016).

TABLE 3 p Values for the Null Hypothesis That the I&P and the External Causes of Death Are Not Significantly Different from Zero, U.S. Males Population Base, VAR(2) Models

TABLE 4 p Values for the Null Hypothesis That the I&P and the External Causes of Death Are Not Significantly Different from Zero, Japanese Females Population Base, VAR(2) Models

As already mentioned, the VAR(2) models selected above will be used in the analysis that follows in Section 3.

3. DYNAMICS OF THE CAUSE-SPECIFIC MORTALITY RATES

In the following sections, we present a detailed analysis for two datasets, U.S. males and Japanese males, both using the U.S. male population structure. We summarize the most interesting findings and provide the details for the remaining 17 datasets in the Appendix.

3.1. Impulse-Response Analysis

First, to get a high-level overview of interactions between the cause-specific mortality rates (as described by the VECM equations in the preceding section), we apply the framework of impulse response analysis (see, e.g., Lütkepohl 2005). At this point, we do not differentiate between the short- and long-term elements of the VECM and analyze the system as a whole. More detailed analysis of the short- and long-term components including the statistical significance of the parameters will follow.

Basically, impulse response analysis means that we first give a single shock to one cause-specific mortality rate and then analyze and compare the responses to this shock from every other cause-specific mortality rate. In this way, we study the impact of an unexpected change in a particular mortality rate on the dynamics of the system of mortality rates as observed in the past. The initial value taken by the variable that receives the shock is equal to its own standard deviation.
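The mechanics of such a shock can be sketched as follows. The example is a minimal illustration under stated assumptions: the matrices Π and Γ1 are hypothetical two-variable values rather than estimates from this article, the deterministic terms are dropped, and the responses are plain (non-orthogonalized) ones, which may differ from the variant used for the figures. Only the shock size of 0.0235 is taken from the text.

```python
def mat_vec(m, v):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(a * b for a, b in zip(row, v)) for row in m]

def impulse_response(pi, gamma1, shock_index, shock_size, horizon):
    """Propagate a one-off shock through
    y_t = y_{t-1} + Pi y_{t-1} + Gamma1 (y_{t-1} - y_{t-2}),
    measuring all variables as deviations from the unshocked path."""
    n = len(pi)
    y_prev = [0.0] * n                  # state before the shock
    y_curr = [0.0] * n
    y_curr[shock_index] = shock_size    # the shock arrives at time 0
    path = [y_curr]
    for _ in range(horizon):
        dy_prev = [a - b for a, b in zip(y_curr, y_prev)]
        long_run = mat_vec(pi, y_curr)        # error-correction term
        short_run = mat_vec(gamma1, dy_prev)  # short-run adjustment
        y_next = [y + l + s for y, l, s in zip(y_curr, long_run, short_run)]
        y_prev, y_curr = y_curr, y_next
        path.append(y_curr)
    return path

# Hypothetical system with one CR: alpha = (-0.2, 0.1)', beta = (1, -1)'.
PI = [[-0.2, 0.2], [0.1, -0.1]]
GAMMA1 = [[0.1, 0.0], [0.0, 0.1]]

# Shock the first variable by one standard deviation (0.0235 in the text).
path = impulse_response(PI, GAMMA1, shock_index=0, shock_size=0.0235, horizon=20)
response_at_20 = path[-1]
```

In this toy system the shocked variable does not return to zero: both variables settle at a common new level, which mirrors the observation that a cointegrated system moves to a new equilibrium after a shock.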

When analyzing the results, we successively adopt two points of view. First, we compare the impacts that a particular cause induces on other cause-specific mortality rates. Then, we compare the responses of a particular cause to the individual shocks received from the rest of mortality rates. In this way, we are able to determine not only whether a particular cause influences the others and to what extent but also whether it is influenced by the rest of the causes and to what extent.

Once a shock is given to a particular cause-specific mortality rate, it propagates in the system and confers new values to the rest of the variables. This development can be suitably displayed on a chart. For example, Figure 1 shows the responses of every cause-specific mortality rate to the shock given to the circulatory mortality rate for U.S. males with the U.S. male population structure (standard deviation of the differenced circulatory mortality rate = 0.0235). Overall, the I&P and respiratory mortality rates show the strongest reactions to the shock given to the circulatory mortality rate. The cancer and external mortality rates are only marginally impacted by this shock. As for the circulatory mortality rate itself, after having received the initial shock, it maintains the increased value until the end of the simulation period.

FIGURE 1. Responses to the Shock Given to the Circulatory Cause, U.S. Males, U.S. Males Population Base.

The responses to the shock given to the circulatory cause observed in the dataset for Japanese males with the U.S. male population structure are shown in Figure 2 (standard deviation of the differenced circulatory mortality rate = 0.0501). Similar to the U.S. males dataset, the respiratory cause shows the strongest response to the shock given to the circulatory mortality rate. The response of the I&P mortality rate is slightly weaker than that of the respiratory rate. Interestingly, both responses have a negative sign, whereas in the U.S. males dataset they also share the same sign but a positive one. One further observation for the Japanese males dataset is that the external causes also show a non-negligible response to the shock from the circulatory cause.

FIGURE 2. Responses to the Shock Given to the Circulatory Cause, Japanese Males, U.S. Males Population Base.

We see that in both cases the system comes rather quickly to a new equilibrium. Because the same observation holds for the rest of the datasets, we will compare the responses following individual shocks at time t = 20 years.

In Figures 1 and 2 the responses are shown in absolute values. However, because the cause-specific mortality rates have different standard deviations, each system receives a shock of a different amplitude. As such, the responses are not comparable between the datasets; that is, a response that would be considered high in one dataset can be considered medium or low in another. To bring the results to a comparable basis, we divide the absolute responses by the standard deviation of the cause-specific mortality rate that receives the shock. The response of the respiratory cause to the shock from the circulatory cause (that is, the value of the cause-specific respiratory mortality rate at time t = 20) is then expressed as a proportion of the shock received by the system (that is, of the standard deviation of the cause-specific circulatory mortality rate).

The results for every dataset are shown in the Appendix. For the sake of readability, along with a numerical value we provide a label that indicates whether the response can be considered as low, medium, or high: low if $|\text{response}| < 3/8$, medium if $3/8 \le |\text{response}| < 7/8$, and high if $7/8 \le |\text{response}|$. These labels are indicative only and were chosen to provide a roughly equal number of medium and high responses (40% of all responses); the rest are attributed to the category low (60% of all responses). The tables are organized as follows: Each row contains responses of all causes to the shock given to the cause X, and each column contains responses of the cause Y to the shocks from all causes. In this way, we can judge simultaneously whether a particular cause impacts each of the remaining causes and whether it reacts to the shocks received from other causes and to what extent.
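The normalization and the labeling rule can be written down directly. In the sketch below the thresholds 3/8 and 7/8 and the standard deviation 0.0501 come from the text, while the function names and the response value of −0.03 are hypothetical and serve only as an illustration.

```python
def normalize(response, shock_std):
    """Express a response at t = 20 as a proportion of the shock size."""
    return response / shock_std

def label(normalized_response):
    """Classify a normalized response using the 3/8 and 7/8 thresholds."""
    size = abs(normalized_response)
    if size < 3 / 8:
        return "low"
    if size < 7 / 8:
        return "medium"
    return "high"

# Hypothetical response of -0.03 to a shock of one standard deviation (0.0501).
normalized = normalize(-0.03, 0.0501)
category = label(normalized)
```

Because the rule is applied to the absolute value, a strongly negative response is labeled in the same way as a strongly positive one; the sign is carried by the numerical value reported alongside the label.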

TABLE 6 p Values for the Null Hypothesis That αi Is Not Significantly Different from Zero, U.S. Males Population Structure

A synopsis of the observations across the 19 datasets is given below.

In a nutshell:

  • The I&P and respiratory causes have virtually no impact on all other causes, but show important responses to the shocks received from them.

  • The cancer and circulatory mortality rates have an important impact on other causes, especially on the I&P and respiratory mortality rates, but show little response to the shocks from other causes.

  • The external causes have an equivocal behavior. On the one hand, they have almost no impact on the cancer and circulatory causes but importantly impact the I&P and respiratory causes. On the other hand, they are not impacted by the I&P and respiratory causes but show important responses to the shocks from the cancer and circulatory causes.

This first analysis shows that the cause-specific mortality rates show different behaviors. At the same time, when a system is analyzed as a whole, many effects are necessarily blended. Therefore, we need to decompose our analysis by separately assessing the short- and long-term dynamics of the system of the mortality rates in order to understand better how the causes of death are related to each other. In the following subsections, we will see what drivers in particular lie behind the observed development of the cause-specific mortality rates.

3.2. Short-Term Dynamics

Once the VECM equations are estimated for each dataset, we can use them to separate the short-term adjustments from the long-term dynamics for each cause-specific mortality rate. Indeed, if a particular coefficient γij of the matrix Γ1 is significant, then the cause i is influenced by the cause j in the short run. We calculate the standard deviations and the corresponding t-ratios of the estimates as shown in Lütkepohl (Citation2005).
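As an illustration of the significance check, the sketch below computes a t-ratio from an estimate and its standard deviation and converts it to a two-sided p-value. It assumes a standard normal reference distribution, which is an assumption of this sketch rather than a statement of the article's exact procedure; the function names and the input values are hypothetical.

```python
import math


def t_ratio(estimate, std_error):
    """Ratio of a coefficient estimate to its standard deviation."""
    if std_error <= 0:
        raise ValueError("std_error must be positive")
    return estimate / std_error


def two_sided_p_value_normal(t):
    """Two-sided p-value under a standard normal approximation (assumed)."""
    # 2 * (1 - Phi(|t|)) equals erfc(|t| / sqrt(2)).
    return math.erfc(abs(t) / math.sqrt(2))


def is_significant(estimate, std_error, level=0.05):
    """True if the coefficient differs from zero at the given level."""
    return two_sided_p_value_normal(t_ratio(estimate, std_error)) < level


# Hypothetical coefficient and standard deviation, for illustration only.
print(is_significant(-0.38, 0.15))
```

With a 5% level, this rule flags a coefficient when its t-ratio exceeds roughly 1.96 in absolute value.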

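We start with the dataset for U.S. males with the U.S. males population structure. In the preceding section the following model was chosen as best describing this dataset:

$$
\begin{aligned}
\Delta y_t &= c + d\,t + \Gamma_1 \Delta y_{t-1} + \Pi y_{t-1} + \epsilon_t \\
&= \begin{bmatrix} 2.7716 \\ 0.1500 \\ 0.2428 \\ 2.0573 \\ 0.1558 \end{bmatrix}
+ \begin{bmatrix} 0.0096 \\ 0.0000 \\ 0.0000 \\ 0.0050 \\ 0.0000 \end{bmatrix} t
+ \begin{bmatrix}
0.1212 & 0.7074 & 0.1879 & 0.1743 & 0.3232 \\
0.0036 & 0.0150 & 0.1659 & 0.0082 & 0.1329 \\
0.0429 & 0.1211 & 0.0337 & 0.0892 & 0.1963 \\
0.1345 & 0.2944 & 0.0850 & 0.3810 & 1.1189 \\
0.0416 & 0.3232 & 0.1981 & 0.1455 & 0.2267
\end{bmatrix} \Delta y_{t-1} \\
&\quad + \begin{bmatrix} 0.0331 \\ 0.0020 \\ 0.0030 \\ 0.0257 \\ 0.0019 \end{bmatrix}
\begin{bmatrix} 1.7716 & 5.4985 & 18.6015 & 13.2167 & 14.1321 \end{bmatrix} y_{t-1} + \epsilon_t .
\end{aligned}
$$

The minus signs and the bold type of the numerical entries are not preserved in this version of the text, so the entries above are shown as magnitudes.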

The significant coefficients are in bold with the selected significance level of 5%. Though many of the Γ1 coefficients are not significant, the cause-specific mortality rates from cancer, respiratory, and external causes are influenced by the lagged values of circulatory and external, respiratory and external, and respiratory causes, respectively. We see that in this dataset three out of five cause-specific mortality rates experience the short-term adjustments from other causes. Hence, it was justified to use the VAR(2) setup and include the lagged values of Δyt into the model. Otherwise, an essential piece of information on the development of the cause-specific mortality rates would not have been accounted for. Another interesting observation is that only the respiratory mortality rate shows the autoregressive feature. In other words, the corresponding cause-specific mortality rate is dependent on the lagged value of itself.
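The link between the VAR(2) setup and the lagged difference term can be checked numerically. For a VAR(2) with coefficient matrices ξ1 and ξ2, the standard rearrangement gives Π = ξ1 + ξ2 − I and Γ1 = −ξ2, so Γ1 vanishes exactly when the second lag is absent. The sketch below verifies the identity on a two-variable system without deterministic terms or noise; all matrices and values are hypothetical and the helper names are introduced here for illustration.

```python
def mat_vec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(a * b for a, b in zip(row, vector)) for row in matrix]


def var2_to_vecm(xi1, xi2):
    """Map VAR(2) coefficients to the VECM matrices (Pi, Gamma1)."""
    n = len(xi1)
    pi = [[xi1[i][j] + xi2[i][j] - (1.0 if i == j else 0.0) for j in range(n)]
          for i in range(n)]
    gamma1 = [[-xi2[i][j] for j in range(n)] for i in range(n)]
    return pi, gamma1


# Hypothetical two-variable system, for illustration only.
xi1 = [[0.6, 0.1], [0.2, 0.5]]
xi2 = [[0.3, -0.1], [0.0, 0.2]]
y_tm2 = [1.0, 2.0]  # y at time t-2
y_tm1 = [1.5, 1.8]  # y at time t-1

# Change in y computed directly from the VAR(2) recursion.
y_t = [a + b for a, b in zip(mat_vec(xi1, y_tm1), mat_vec(xi2, y_tm2))]
delta_direct = [a - b for a, b in zip(y_t, y_tm1)]

# The same change computed from the error correction form.
pi, gamma1 = var2_to_vecm(xi1, xi2)
delta_prev = [a - b for a, b in zip(y_tm1, y_tm2)]
delta_vecm = [a + b for a, b in zip(mat_vec(pi, y_tm1), mat_vec(gamma1, delta_prev))]

print(delta_direct, delta_vecm)
```

Under a VAR(1) setup ξ2 would be zero, and the short-term adjustment term Γ1Δy_{t−1} would drop out of the model altogether.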

For the dataset of Japanese males with the U.S. males population structure, the chosen VECM has two cointegration relations with a constant and a trend, the latter being restricted to the cointegration term:

$$
\begin{aligned}
\Delta y_t &= c + \Gamma_1 \Delta y_{t-1} + \alpha\beta'\bigl(y_{t-1} + (t-1)\bigr) + \epsilon_t \\
&= \begin{bmatrix} 0.3960 \\ 0.8539 \\ 0.7600 \\ 1.8252 \\ 0.7695 \end{bmatrix}
+ \begin{bmatrix}
0.1679 & 0.5510 & 0.1982 & 0.0335 & 0.2440 \\
0.0143 & 0.3818 & 0.0268 & 0.0184 & 0.0406 \\
0.1343 & 0.7531 & 0.1711 & 0.1692 & 0.0055 \\
0.1477 & 2.1179 & 1.3108 & 0.5537 & 0.1487 \\
0.0307 & 0.3048 & 0.2898 & 0.1305 & 0.0906
\end{bmatrix} \Delta y_{t-1} \\
&\quad + \begin{bmatrix}
0.0261 & 0.3600 \\
0.0186 & 0.0425 \\
0.0126 & 0.0399 \\
0.0054 & 0.8417 \\
0.0154 & 0.0149
\end{bmatrix}
\left[
\begin{bmatrix}
1.5951 & 7.7055 & 0.9876 & 1.5822 & 1.8454 \\
1.0848 & 9.3172 & 10.1473 & 6.6630 & 3.7839
\end{bmatrix} y_{t-1}
+ \begin{bmatrix} 0.1851 \\ 0.3817 \end{bmatrix} (t-1)
\right] + \epsilon_t .
\end{aligned}
$$

The minus signs and the bold type of the numerical entries are not preserved in this version of the text, so the entries above are shown as magnitudes.

Also for this dataset, many of the Γ1 coefficients are not significant. On the other hand, the cause-specific mortality rates for the cancer, circulatory, and respiratory causes are influenced by the lagged values of the cancer, respiratory, and circulatory and respiratory mortality rates, respectively. Again, three out of five cause-specific mortality rates experience short-term adjustments. Therefore, it would not be justified to use the VAR(1) setup for Japanese males with the U.S. males population structure. Similar to the U.S. males dataset, the respiratory cause shows the autoregressive feature, as does the cancer cause.

After the analysis was repeated for the rest of the datasets using both the U.S. males and Japanese females population structures, the results can be summarized as follows:

  • In every dataset, there is at least one cause-specific mortality rate that is significantly impacted by other causes in the short run. For this reason, it would not be optimal to use VAR models with the lag order one instead of two.

  • Though in the short run the I&P and cancer causes are rarely impacted by other causes, they also infrequently impact the rest of the causes; that is, they show a development mostly independent of other causes in the short run.

  • On the other hand, the circulatory, respiratory, and external causes are frequently impacted by one or more causes in the short run and also occasionally impact other causes. Hence, these cause-specific mortality rates are more linked in their development to other causes than the I&P and cancer mortality rates are.

  • The respiratory cause consistently shows the autoregressive feature. In other words, in many datasets the corresponding cause-specific mortality rate is dependent on the lagged value of itself.

  • For all datasets, the larger part of the significant coefficients are negative; that is, more often than not the change in the cause-specific mortality rate goes in the opposite direction to the short-term variation of this and/or other cause-specific mortality rates at the previous point in time. More specifically, this means that if the mortality rate of a particular cause of death increases (decreases), the other causes will tend to decrease (increase) in the short run.

A detailed overview of the significant coefficients in the Γ1 matrix for each dataset is presented in Tables A.8 to A.11 in the Appendix.

3.3. Long-Term Dynamics

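The α matrix allows us to estimate how deviations from the steady states impact the cause-specific mortality rates. For r = 1 (which is the case for the majority of the datasets) we can write the long-term component as follows:

$$
\Pi y_{t-1} = \alpha\beta' y_{t-1}
= \begin{bmatrix} \alpha_1 \\ \alpha_2 \\ \alpha_3 \\ \alpha_4 \\ \alpha_5 \end{bmatrix}
\begin{bmatrix} \beta_1 & \beta_2 & \beta_3 & \beta_4 & \beta_5 \end{bmatrix} y_{t-1}
= \begin{bmatrix} \alpha_1 \\ \alpha_2 \\ \alpha_3 \\ \alpha_4 \\ \alpha_5 \end{bmatrix}
\left( \beta_1 y_{1,t-1} + \beta_2 y_{2,t-1} + \beta_3 y_{3,t-1} + \beta_4 y_{4,t-1} + \beta_5 y_{5,t-1} \right).
$$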

This way, if a particular coefficient αi is significant, the long-term component on the right-hand side of Equation (2) is important in explaining the past variations in the corresponding cause-specific mortality rate on the left-hand side. Moreover, the value of this coefficient shows the extent to which the long-term component contributes to the variation in the cause-specific mortality rate in question. As in the previous subsection, we calculated the standard deviations and the corresponding t-ratios of the estimates of α as shown in Lütkepohl (Citation2005). Table 6 shows the p-values for the U.S. males and Japanese males datasets using the U.S. males population structure.

On the one hand, this example shows that for the U.S. males dataset the long-term component enters the equations for the I&P and respiratory mortality rates with a significant coefficient (at a 5% significance level). On the other hand, the equations for the cancer, circulatory, and external mortality rates are not significantly impacted by the long-term equilibrium. For Japanese males, each of the two long-term components enters the equations for the I&P and cancer mortality rates with a significant coefficient; in addition, the second component enters the equation for the respiratory mortality rate with a significant coefficient.
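For r = 1, the long-term component described above reduces to a single scalar deviation β′y_{t−1} that is loaded into each equation through the corresponding αi. A minimal sketch of this computation follows; the function name `long_term_component` and all numerical values are hypothetical and are not the article's estimates.

```python
def long_term_component(alpha, beta, y_prev):
    """Contribution of one cointegration relation to each equation.

    alpha:  loading coefficients, one per cause.
    beta:   cointegration vector, one coefficient per cause.
    y_prev: cause-specific mortality rates at time t-1.
    """
    # Scalar deviation from the long-run relation: beta' y_{t-1}.
    deviation = sum(b * y for b, y in zip(beta, y_prev))
    # Each equation receives the deviation scaled by its own loading.
    return [a * deviation for a in alpha]


# Hypothetical values for five causes, for illustration only.
alpha = [-0.03, 0.0, 0.0, -0.02, 0.0]
beta = [1.0, -0.5, 0.2, 0.3, -0.1]
y_prev = [0.4, 2.1, 4.0, 0.9, 0.7]

print(long_term_component(alpha, beta, y_prev))
```

An equation whose loading αi is zero receives no contribution from the long-term component, which is why the significance of αi is tested equation by equation.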

We repeated the analysis for the rest of the datasets, and the results can be summarized as follows:

  • The I&P and respiratory causes seem to be the most impacted by the long-term component: The corresponding αi coefficients are significant in 15 out of 19 datasets. A similar observation was made using the framework of the impulse response analysis, because there the I&P and respiratory mortality rates showed an important reaction to the shocks from other causes. Hence, these shocks propagate in the system via the CR(s).

  • The external causes seem to be the least impacted: The corresponding αi coefficients are significant in only 5 out of 19 datasets. Interestingly, the results of the impulse response analysis for the external causes were equivocal in that there was an important reaction to the shocks from the cancer and circulatory causes, but a low response to the shocks from the I&P and respiratory causes. The impact from the cancer and circulatory causes may hence come from the short-term adjustments.

  • Results for the cancer and circulatory causes are more difficult to interpret: The corresponding αi coefficients are significant in respectively 9 and 11 out of 19 datasets. The results of the impulse response analysis also showed low reactions of the cancer and circulatory mortality rates to the shocks from other causes.

Interestingly, though the I&P and external causes do not participate conjointly in the long-term equilibrium, they show different behaviors when it comes to the impact they experience from this long-term steady state. Indeed, the cointegration relations often enter the equation for the I&P mortality rate with a significant coefficient but seldom have an effect on external causes. Therefore, only the external causes show behavior that is entirely independent from the long-term equilibrium state and, possibly, aging.

An overview of the results for the remaining datasets is shown in Tables A.12 and A.13 in the Appendix.

3.4. Long-Term versus Short-Term Dynamics

In the previous sections, we have analyzed the short- and long-term elements separately. Now we want to assess the relative importance of the long- and short-run forces. For this purpose, we break down the expected cause-specific mortality rates at time t, based on the information available up to time t–1, in two elements: the short-term (ST) and long-term (LT) components. By comparing the behavior of each of these elements with the realized change in the mortality rates, we assess the relative importance of the long- and short-run forces in terms of their contribution to the variation of the cause-specific mortality rates.

For illustrative purposes, we present the results for the respiratory equation for U.S. males (Figure 3) and the I&P equation for Japanese males (Figure 4), with both datasets using the U.S. males population structure. In the first case, the actual mortality changes fluctuate primarily with the short-term component (the correlation coefficient between ΔRespt and LT: 0.268, between ΔRespt and ST: 0.336).

FIGURE 3. Respiratory Cause: Actual Mortality Changes, Long- and Short-Term Components (U.S. Males, U.S. Males Population Structure).

FIGURE 4. I&P Cause: Actual Mortality Changes, Long- and Short-Term Components (Japanese Males, U.S. Males Population Structure).

For Japanese males, the actual mortality changes fluctuate primarily with the long-term component (the correlation coefficient between ΔIPt and LT: 0.570, between ΔIPt and ST: –0.177).
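The comparison used here can be sketched as follows: compute the correlation of the actual changes with each component and call the component with the higher correlation dominant. The two pairs of correlation coefficients passed to `dominant_component` are the values reported in the text; the function names and the short series used to exercise `pearson` are hypothetical.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient between two equally long series."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def dominant_component(corr_lt, corr_st):
    """Return 'LT' if the long-term correlation is higher, else 'ST'."""
    return "LT" if corr_lt > corr_st else "ST"


# Correlations reported in the text.
print("U.S. males, respiratory:", dominant_component(0.268, 0.336))
print("Japanese males, I&P:", dominant_component(0.570, -0.177))

# Hypothetical series, only to show how the inputs would be obtained.
actual = [0.10, -0.05, 0.02, -0.08, 0.04]
lt = [0.08, -0.03, 0.01, -0.06, 0.05]
st = [0.01, 0.02, -0.03, 0.00, -0.01]
print(dominant_component(pearson(actual, lt), pearson(actual, st)))
```

The comparison uses the signed correlations, following the criterion stated in the text, so a negative correlation with one component counts against it.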

Because not every equation contains the long-term component, the results for the rest of the datasets are formulated for those cases where the long-term component is present with a significant coefficient αi.

  • Out of 15 datasets for which the I&P mortality rate equation contains the long-term component, in 13 cases the data fluctuate primarily with the long-term component (i.e., the correlation coefficient between the data points and the long-term component is higher than that between the data points and the short-term component).

  • Out of 9 datasets for which the cancer mortality rate equation contains the long-term component, in 8 cases the data fluctuate primarily with the long-term component.

  • Out of 11 datasets for which the circulatory mortality rate equation contains the long-term component, in 4 cases the data fluctuate primarily with the long-term component.

  • Out of 15 datasets for which the respiratory mortality rate equation contains the long-term component, in 7 cases the data fluctuate primarily with the long-term component.

  • Out of 5 datasets for which the external mortality rate equation contains the long-term component, only in 1 case do the data fluctuate primarily with the long-term component.

Summarizing the results stated above, we can say that every time the equation contains the long-term component, the cause-specific mortality rate resembles in its behavior the long-term component rather than the short-term one for the causes I&P and cancer. The opposite is true for the circulatory and external causes. The respiratory mortality rate resembles in its behavior the long-term component as often as it resembles the short-term component.
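The counts listed above can be turned into shares to make the comparison explicit. The counts in the sketch below are those reported in the list; the variable names are introduced here for illustration.

```python
# (datasets where the data follow the long-term component,
#  datasets where the equation contains the long-term component)
counts = {
    "I&P": (13, 15),
    "cancer": (8, 9),
    "circulatory": (4, 11),
    "respiratory": (7, 15),
    "external": (1, 5),
}

# Share of cases in which the long-term component dominates.
shares = {cause: lt / total for cause, (lt, total) in counts.items()}

for cause, share in shares.items():
    print(f"{cause}: {share:.0%} long-term dominated")
```

The shares are well above one half for the I&P and cancer causes, well below one half for the circulatory and external causes, and close to one half for the respiratory cause.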

This observation is not surprising for the I&P and respiratory causes. As we have seen in previous sections, only these cause-specific mortality rates are often impacted by the long-term equilibrium state. At the same time, the I&P mortality rate is rarely impacted by the short-term component, whereas the respiratory mortality rate is frequently impacted by it. It could therefore have been expected that the I&P mortality rate data would fluctuate with the long-term component in the majority of cases where the I&P equation contains the CR(s). In turn, the respiratory mortality rate fluctuates with the short-term component about as often as with the long-term component. Therefore, the correlation analysis reinforces the conclusions of the previous sections for these two rates.

A similar conclusion holds for the circulatory mortality rate: Because it is frequently impacted in the short run and only occasionally in the long run, the short-term components play a more important role in the correlation analysis.

Regarding the cancer mortality rate, we have seen that it was infrequently impacted by both short- and long-term components. Because none of the effects can be called dominant, the correlation analysis helps to identify the component that plays a more important role in the development of this cause-specific mortality rate, in this case, the long-term.

As for the external causes, the corresponding cause-specific mortality rate is often impacted by the short-term component and virtually never by the long-term one. Even in those rare cases in which the CR enters the equation with a significant coefficient, the data fluctuate more with a short-term component.

The detailed results for each dataset are shown in in the Appendix.

4. CONCLUSIONS

The analysis of the dynamics of the cause-specific mortality rates shows that they are dependent on each other in both the short and the long run. Although the observed experience will never exactly repeat itself in the future, the following observations can help practitioners set more informed assumptions on the future development of mortality rates:

  • The common long-run trend shared by the cause-specific mortality rates is contingent on the evolution of the cancer, circulatory, and respiratory mortality rates, because these are the causes that significantly contribute to the CR between the mortality rates.

  • Once the common long-run trend is defined, it more heavily impacts the development of the I&P and respiratory mortality rates and to a lesser extent the development of the cancer and circulatory mortality rates. The external causes are exempt from the influence of the common long-term relationship between the causes.

  • In the short run, the respiratory mortality rate consistently shows the autoregressive feature.

  • Although the short-run dependencies are more challenging to model, they are significantly pronounced in the development pattern of the circulatory, respiratory, and the external mortality rates. In other words, these rates are dependent on each other in the short run.

Coming back to the conclusion made in Arnold and Sherris (Citation2016) that the I&P and external causes do not participate conjointly in the long-term steady state, we see that these causes differ in the way they are impacted by the long-term equilibrium. Though the I&P mortality rate is often impacted by the CR(s) and, when it is, fluctuates more with the long-term component, the external causes show the opposite behavior: The corresponding rate is almost never impacted by the CR(s), and, when it is, it fluctuates more with the short-term component.

We see that the development of the external causes mortality rate is completely independent of the long-term equilibrium in terms of both its contribution to the steady state and the influence it experiences from it. This is the behavior of what could be called a genuinely exogenous cause of mortality, because we observe no long-term impact to or from this cause. It develops in a way that is entirely independent of the observed equilibrium between the rest of the cause-specific mortality rates and is subject to only short-term shocks from other causes. This observation is not surprising, because the external causes group together causes such as transport and other accidents (falls, poisoning, accidental fire, drowning), suicides, homicides, and war injuries. Therefore, it is rather difficult to imagine a link connecting these mortality rates to the rest of the mortality causes that could be observed over a long time. On the contrary, these causes of mortality can be characterized by randomness and “bad luck” rather than by a steady long-term development.

In turn, the I&P mortality rate does not influence the long-term equilibrium observed between the cause-specific mortality rates but is rather sensitive to the impacts received from this equilibrium. Occasionally, it is also subject to short-term shocks from other causes. We can conclude that though the evolution of the I&P mortality rate does not influence the development of other cause-specific mortality rates (that is, a sudden increase or drop in the I&P mortality rate will not affect the rest of the cause-specific mortality rates), its own development depends to a great extent on other causes of death, especially in the long run. Such behavior cannot be described as fully independent, so the I&P cause cannot be classified as a truly exogenous cause in the same way the external causes can.

These observations are consistent with the intuition that the biological processes of aging are reflected in the common stochastic trend shared by the cause-specific mortality rates. Indeed, though it can seem that the infectious or parasitic diseases are similar to the external causes in that the origin of the force affecting the human body lies outside, the underlying biological processes are more complicated, because human beings are not equal when they face an infection. Even during severe epidemics, the probability of getting sick and dying depends to a large extent on the internal immune forces of the individual, which, in turn, depend on, among other factors, age. A well-known example is influenza, which is most dangerous for the elderly. When advancing in age, we are increasingly confronted with competing risks such that a decrease in mortality from circulatory diseases, for example, would leave more vulnerable persons alive who could then die from an infectious disease during an epidemic. It is then understandable that the I&P mortality rate, though not a part of the long-term equilibrium, is substantially affected by it. Our results then confirm and reinforce the link between the cointegration relations observed within a set of cause-specific mortality rates and the biological processes of aging.

One further possible application of the present study is the calibration of copula-based models for the cause-specific mortality rates, which remains an open question. Indeed, because of the identifiability issue raised by Tsiatis (Citation1975), one usually has to assume that the dependence is represented by a known copula with known parameters. At the same time, copula-based models are highly sensitive with respect to these choices (Dimitrova, Haberman, and Kaishev Citation2013). As pointed out in Kaishev, Dimitrova, and Haberman (Citation2007), the free parameters could be set according to a priori available (medical) information about the degree of pairwise dependence between the two competing risks, expressed through Kendall’s τ and/or Spearman’s ρ. In the absence of additional information and to demonstrate the sensitivity of the results, the authors use four different copulas and five different values of Kendall’s τ, ranging from –0.91 (extreme negative dependence) to 0.91 (extreme positive dependence). In Li and Lu (Citation2019), the authors go further and, by introducing hierarchical Archimedean copulas, succeed in building a model that allows for different levels of association between the causes of death. For this, they group the causes in different clusters based on the (assumed) level of dependence between the causes, but also admit that the introduced hierarchical structure is not unique. Although our study cannot provide an exact value of parameters to be used in copula-based models, a certain degree of pairwise association (correlation) between the causes of death can be inferred from the results of the impulse response analysis (section 3.1). This could help researchers working with copula-based models further reduce the possible range of free parameters that otherwise have to be chosen arbitrarily. In addition, the revealed differences in the long- and short-term development of the cause-specific mortality rates can serve as the basis for building clusters of the causes.

In the current study, we limited our analysis to the total cause-specific mortality rates and did not differentiate by age. Yet, it is intuitively clear that when analyzed by age the mortality rates will present different development patterns. Because the cause-specific death numbers are available in the WHO database by 5-year age groups, integrating the age specifics of the mortality rates into the modeling process seems to be a promising path. However, this remains a challenging task because, on the one hand, the cointegration tests have been developed for systems with a maximum of 12 variables (Osterwald-Lenum Citation1992) and, on the other, the observation horizon, which goes back only as far as 1950, is rather brief. In our opinion, an analysis that overcomes these difficulties while preserving the information on the age profile has the potential to deliver additional insights on the interaction of the cause-specific mortality rates. Moreover, the biological processes of aging may be easier to measure once the data on young ages are excluded from the analysis, because by definition the aging risk factor becomes more important the longer we live. For this reason, the analysis of the cause-specific mortality rates excluding young ages may provide a better measure of the aging process.

Discussions on this article can be submitted until January 1, 2023. The authors reserve the right to reply to any discussion. Please see the Instructions for Authors found online at http://www.tandfonline.com/uaaj for submission instructions.

ACKNOWLEDGMENT

The authors thank an anonymous referee for the useful comments that helped to improve the article. All errors are our responsibility.

FUNDING

Viktoriya Glushko and Séverine Arnold gratefully acknowledge the financial support received from the Swiss National Science Foundation for the project “Cause-Specific Mortality Interactions” (Project Number 100018_162898).

Notes

1 The corresponding VAR model has p lags: $y_t = c + d\,t + \xi_1 y_{t-1} + \xi_2 y_{t-2} + \dots + \xi_p y_{t-p} + \epsilon_t$.

2 This observation applies to the studied time series only and should be verified for other countries and time periods.

3 We use the Gaussian setting because this allows us to use the cointegration analysis to model the dependence between the cause-specific mortality rates. Other models such as Poisson autoregression are naturally conceivable and would bring a different and complementary perspective to the analysis when modeling death counts.

4 These criteria are based on the natural logarithm of the determinant of the estimate of the residual covariance matrix $\hat{\Sigma}_{\epsilon} = \frac{1}{T}\sum_{t=1}^{T} \hat{\epsilon}_t \hat{\epsilon}_t'$, where T is the number of observations in the time series, to which a penalty for the number of parameters is added. Though in the general case it is important to pay attention to the parsimony of the model, the weight of the penalty is to some extent arbitrary and can vary depending on the objectives of the study.

REFERENCES

APPENDIX

TABLE A.1 Age-Standardized Central Death Rates for Selected Years, U.S. Males Population Base, ×10³

TABLE A.2 Tests on Residuals of the Fitted VECM, Males, U.S. Males Population Base, VAR(2) Models

TABLE A.3 Tests on Residuals of the Fitted VECM, Females, U.S. Males Population Base, VAR(2) Models

TABLE A.4 Tests on Residuals of the Fitted VECM, Males, Japanese Females Population Base, VAR(2) Models

TABLE A.5 Tests on Residuals of the Fitted VECM, Females, Japanese Females Population Base, VAR(2) Models

TABLE A.6 Impulse Response Analysis: Response of the Mortality Rate Y from the Shock Given to the Cause X, U.S. Males Population Base

TABLE A.7 Impulse Response Analysis: Response of the Mortality Rate Y from the Shock Given to the Cause X, Japanese Females Population Base

TABLE A.8 Γ1 Coefficients That Are Significantly Different from Zero, Significance Level of .05, Males, U.S. Males Population Base

TABLE A.9 Γ1 Coefficients That Are Significantly Different from Zero, Significance Level of .05, Females, U.S. Males Population Base

TABLE A.10 Γ1 Coefficients That Are Significantly Different from Zero, Significance Level of .05, Males, Japanese Females Population Base

TABLE A.11 Γ1 Coefficients That Are Significantly Different from Zero, Significance Level of .05, Females, Japanese Females Population Base

TABLE A.12 Equations to Which the Long-Term Component Enters with a Statistically Significant Coefficient, αi, Significance Level of .05, U.S. Males Population Base

TABLE A.13 Equations to Which the Long-Term Component Enters with a Statistically Significant Coefficient, αi, Significance Level of .05, Japanese Females Population Base

TABLE A.14 Correlation Coefficients between the Actual Changes in Mortality Rates and the Long- and Short-Term Components, Males, U.S. Males Population Base

TABLE A.15 Correlation Coefficients between the Actual Changes in Mortality Rates and the Long- and Short-Term Components, Females, U.S. Males Population Base

TABLE A.16 Correlation Coefficients between the Actual Changes in Mortality Rates and the Long- and Short-Term Components, Males, Japanese Females Population Base

TABLE A.17 Correlation Coefficients between the Actual Changes in Mortality Rates and the Long- and Short-Term Components, Females, Japanese Females Population Base