Original Research

Describing the association between socioeconomic inequalities and cancer survival: methodological guidelines and illustration with population-based data

Pages 561-573 | Published online: 17 May 2018

Abstract

Background

Describing the relationship between socioeconomic inequalities and cancer survival is important but methodologically challenging. We propose guidelines for addressing these challenges and illustrate their implementation on French population-based data.

Methods

We analyzed 17 cancers. Socioeconomic deprivation was measured by an ecological measure, the European Deprivation Index (EDI). The Excess Mortality Hazard (EMH), ie, the mortality hazard among cancer patients after accounting for other causes of death, was modeled using a flexible parametric model, allowing for nonlinear and/or time-dependent association between the EDI and the EMH. The model included a cluster-specific random effect to deal with the hierarchical structure of the data.

Results

We reported the conventional age-standardized net survival (ASNS) and described the changes of the EMH over the time since diagnosis at different levels of deprivation. We illustrated nonlinear and/or time-dependent associations between the EDI and the EMH by plotting the excess hazard ratio according to EDI values at different times after diagnosis. The median excess hazard ratio quantified the general contextual effect. Lip–oral cavity–pharynx cancer in men showed the widest deprivation gap, with 5-year ASNS at 41% and 29% for deprivation quintiles 1 and 5, respectively, and we found a nonlinear association between the EDI and the EMH. The EDI accounted for a substantial part of the general contextual effect on the EMH. The association between the EDI and the EMH was time dependent in stomach and pancreas cancers in men and in cervix cancer.

Conclusion

The methodological guidelines proved efficient in describing the way socioeconomic inequalities influence cancer survival. Their use would allow comparisons between different health care systems.

Introduction

Assessing the relationship between socioeconomic deprivation and cancer survival is important as socioeconomic differences in cancer survival are still observed even in countries with universal health care coverage.Citation1Citation6 Describing this relationship at the population level calls for population-based cancer registry data, but performing the analysis is challenging. Indeed, several methodological conditions should be met: 1) the use of a relevant measure of deprivation, usually ecological (ie, defined at an area level) as the individual level of deprivation is not routinely collected in population-based data; 2) the use of a relevant mortality indicator such as the excess disease-specific mortality among cancer patients vs noncancer subjects; 3) the use of a regression model able to deal first with nonlinear functional forms of the association between continuous prognostic factors and the excess mortality (eg, between the deprivation index and the excess mortality throughout the range of the deprivation index values) and then deal with time-dependent associations (eg, nonconstant association between the deprivation index and the excess mortality over the time elapsed since diagnosis); 4) the use of an appropriate method for statistical inference that accounts for the statistical dependency between patients who share similar characteristics because they live in the same area across which the ecological deprivation index is defined (this is especially important when the interest lies in estimating regression parameters associated with the ecological deprivation variable);Citation7Citation9 and 5) the use of a measure that summarizes the “importance” of the cluster level on the Excess Mortality Hazard (EMH).

Several recent ecological population-based studies addressing the question of the association between social deprivation and cancer survival were found in the literature.Citation4Citation6,Citation10Citation12 Disease-specific mortality was generally the outcome of interest, usually estimated using an excess mortality approach, although a cause-specific mortality approach (relying on the cause of death) was also used. Statistical methods relied on either nonparametric or parametric regression modeling approaches. When modeling approaches were used, the adopted deprivation index was considered as a categorical variable (quintiles). When regression models included the age at diagnosis as a prognostic factor, it was most of the time considered as a categorical variable. Finally, some studies explored the time-dependent associations between the variables and the hazard, but this was not the rule, and none had taken into account the hierarchical structure of the data.

The present work proposes methodological guidelines for addressing the above-mentioned challenges (1–5) and illustrates their use through an investigation of the association between socioeconomic deprivation and cancer survival from solid tumor cancers up to 10 years after diagnosis. It is worth noting that, although the article uses the term “effect” in a few places, no causal association is implied.

Materials and methods

Data

We used population-based cancer registry data that cover two contiguous Départements [French administrative areas] of West France (Calvados and Manche, nearly 1.1 million inhabitants). The quality and exhaustiveness of the included registries are certified every 4 years by an audit of the National Institute of Health and Medical Research (INSERM), the “Santé Publique France” agency, and the French National Cancer Institute. The incidence data from those registries are regularly included in the “Cancer Incidence in Five Continents” monograph series of the International Agency for Research on Cancer, where their quality and exhaustiveness are also assessed.

We analyzed cancer cases diagnosed between 1997 and 2010 in people aged >15 years at diagnosis. The follow-up of all cases ended on June 30, 2013. The 17 cancers under study are displayed in Tables S1 and S2.

The data from these registries are not publicly available. We analyzed these data under the ethical approval obtained by each registry from the French institute “Commission Nationale de l’Informatique et des Libertés” (“998018” for the Calvados digestive cancer registry, “981001 V1” for the Calvados general cancer registry, and “912669” for the Manche cancer registry).

The measure of social deprivation

Because individual levels of deprivation are not routinely collected, ecological measures defined at area levels have been proposed.Citation13,Citation14 These measures are considered as good proxies of individual deprivation in relatively small areasCitation15 and measure additionally the patients’ social and economic environment (“contextual variables”).Citation7,Citation16,Citation17

The European Deprivation Index (EDI) was developed using information from the European Union Statistics on Income and Living Conditions (EU-SILC) survey as well as other country-specific information.Citation18 The ultimate goal of this index is to have in each European country an ecological deprivation index based on (country-specific) census variables using the same methodological approach for its construction while accounting for cultural and social specificities of each European country. The approach relies on the concept of relative deprivation, first proposed by the sociologist Peter Townsend.Citation19 Deprivation refers to unmet fundamental needs caused by the lack of resources of all kinds (not only financial), those fundamental needs differing between societies (thus “relative” as it refers to deprivation specifically for a given society). Individuals can be said to be deprived when they lack the resources to meet those needs (diet, type of living conditions, amenities, or services) that are met by the majority of people in the society to which they belong.

The EU-SILC is organized every year in every country of the EU-28. Based on a representative panel of European households, individuals answer detailed questions on their living conditions in each country. The construction of the EDI can be summarized as follows: First, fundamental needs are identified for each European country using the EU-SILC data. Among them, those associated with both objective poverty and subjective poverty are used to build a deprivation indicator at the individual level. Then, after identifying which variables are available at both the individual level (EU-SILC) and the area level (census), the area-level variables that are best correlated with the deprivation indicator built in the previous step are used to finally construct the area-based deprivation index. Details of concepts and construction methods are available in the previous methodological papers.Citation18,Citation20

In France, this EDI is assigned to each IRIS (Îlot Regroupé pour l’Information Statistique, a geographical area of nearly 2000 individuals); it was then assigned to each patient from a given IRIS. The correspondence between a patient and an IRIS was determined according to the patient’s address at the time of diagnosis, using Geographic Information System software (ArcGIS 10.2) and a street map database (BD TOPO premium). In this work, we used the EU-SILC from 2006 to derive the EDI, which ranges for France from −17.3 to 51.1 (quintile 1 [Q1]: [−17.3;−2.9], Q2: [−2.9;−1.4], Q3: [−1.4;0], Q4: [0;2.1], and Q5: [2.1;51.1]).
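As a small illustration of how the quintile bounds quoted above could be applied, the following Python sketch maps an EDI value to its national quintile. The function name `edi_quintile` and the handling of values that fall exactly on a cut-point are assumptions made here for illustration; the article does not state which side of each interval is closed.

```python
from bisect import bisect_right

# Upper bounds of Q1..Q4 for France, as quoted in the text (EU-SILC 2006).
EDI_CUTS = [-2.9, -1.4, 0.0, 2.1]


def edi_quintile(edi: float) -> int:
    """Return the deprivation quintile (1 = least deprived, 5 = most deprived).

    Assumption: a value exactly equal to a cut-point is placed in the
    upper quintile; the source does not specify this convention.
    """
    return bisect_right(EDI_CUTS, edi) + 1


if __name__ == "__main__":
    for value in (-5.0, -2.0, -0.5, 1.0, 10.0):
        print(value, "-> Q%d" % edi_quintile(value))
```

In practice the EDI of a patient would first be obtained by geocoding the address to an IRIS; that step is not reproduced here.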

The Excess Mortality Hazard

A relevant disease-specific mortality indicator is needed. Cancer-specific mortality using the cause of death is very popular but hardly usable in our context. Indeed, the cause of death may be inaccurate or unreliable, especially for long-term studies, because it may be diversely coded over time and between regions. Besides, attributing a single cause of death to elderly people is debatable.Citation21 Alternative approaches called “EMH methods” have then been developed;Citation22Citation26 these do not require the knowledge of the cause of death.Citation27Citation30 The basic idea of EMH methods is to compare the mortality of cancer patients with that of noncancer subjects of the same sex, age, and other main characteristics. The mortality of cancer-free subjects, called “expected mortality”, is assumed to be correctly given by the general-population mortality, which is a known value. The EMH is then estimated by subtracting the expected mortality from the mortality of cancer patients; it provides the excess mortality due (directly or indirectly) to cancer at any time after diagnosis. For the expected mortality hazards, we used the French population mortality rates by sex, age (0–99 years), Département [French administrative area], and calendar year (1997–2013) as provided by the French Institut National de la Statistique et des Études Économiques.
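The subtraction described above can be sketched in a few lines of Python. This is only an illustration of the principle, not the estimation procedure used in the article (where the EMH is estimated through a regression model). The life-table excerpt, its rates, and the function names are hypothetical.

```python
# Hypothetical life-table excerpt: (sex, attained age, departement, year) -> rate.
# The rates below are invented for illustration only.
LIFE_TABLE = {
    ("M", 70, "14", 2005): 0.025,
    ("M", 71, "14", 2006): 0.027,
}


def expected_hazard(sex, age_at_dx, departement, year_at_dx, t):
    """General-population mortality rate t years after diagnosis."""
    attained_age = min(int(age_at_dx + t), 99)  # life tables cover ages 0-99
    attained_year = int(year_at_dx + t)
    return LIFE_TABLE[(sex, attained_age, departement, attained_year)]


def excess_hazard(observed_hazard, expected):
    """Excess mortality hazard = observed hazard minus expected hazard."""
    return observed_hazard - expected


if __name__ == "__main__":
    exp_rate = expected_hazard("M", 70.0, "14", 2005.0, 0.0)
    print(excess_hazard(0.125, exp_rate))
```

Note that the expected hazard is read at the attained age and calendar year, which move forward as follow-up time elapses.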

From the EMH, we derived directly the net survival, using the classical relationship between hazard and survival. Net survival is then the probability of survival of cancer patients if the cancer under study was the only cause of death. In population-based studies, this key indicator allows comparisons between countries or periods and is not affected by differences in mortalities from other causes.Citation31
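The classical relationship mentioned above is that net survival at time t equals the exponential of minus the cumulative excess hazard up to t. A minimal Python sketch follows, assuming the excess hazard is available as a function of time and using a simple trapezoidal rule; the function name and the number of integration steps are choices made here for illustration.

```python
import math


def net_survival(excess_hazard, t, n_steps=1000):
    """Net survival at time t: exp(-integral from 0 to t of the excess hazard).

    `excess_hazard` is any callable returning the hazard at a given time;
    the integral is approximated with the trapezoidal rule.
    """
    step = t / n_steps
    total = 0.5 * (excess_hazard(0.0) + excess_hazard(t))
    for k in range(1, n_steps):
        total += excess_hazard(k * step)
    return math.exp(-total * step)


if __name__ == "__main__":
    # Constant excess hazard of 0.2 per year over 5 years: exp(-1).
    print(net_survival(lambda u: 0.2, 5.0))
```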

Regression modeling of the EMH

In cancer patients, the relationship between a prognostic factor (such as EDI) and EMH may be complex.Citation27,Citation32,Citation33 A multivariable regression model has to consider these complex relationships using flexible functions.Citation22,Citation34 We defined a “full model” that modeled the EMH (on a log scale) as a function of time, age at diagnosis, year of diagnosis, and EDI, with these last three variables having time-dependent coefficients and nonlinear functional forms (thus leading to time-dependent and nonlinear log excess hazard ratios [EHRs], as denoted hereafter).

In addition, the EDI being an ecological variable and the individuals living in a given area sharing similar characteristics (including the EDI variable), a specific statistical method should allow dealing with the hierarchical structure of the data (ie, multilevel data with dependence between individuals at each level).Citation7Citation9 This was done by including a normally distributed random effect at the IRIS level.Citation34

Thus, in formula, the “full model” for the EMH λ+ is written as follows:

$$\lambda_{+}(t,a,y,i \mid w)=\lambda_{0}(t)\,\exp\bigl(g(a)+h(t)\,a+j(y)+k(t)\,y+m(i)+n(t)\,i+w\bigr)$$

where λ0(t) is the baseline hazard, a the age at diagnosis, y the year of diagnosis, i the EDI, and w the random effect defined at the IRIS level (with mean 0 and standard deviation σ). The logarithm of the baseline hazard and the functions h, k, n were modeled with quadratic B-splines with knots located at 1 and 5 years, and the nonlinear functional forms g, j, m were modeled using quadratic splines with one knot (located at 70 years for age at diagnosis, at 2000 for the year of diagnosis, and at 0 for the EDI).
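To see how this model produces nonlinear and time-dependent EHRs, note that, with all other variables and the random effect held fixed, the log EHR comparing EDI values i1 and i0 at time t is m(i1) − m(i0) + n(t)(i1 − i0). The Python sketch below illustrates this with invented functions: the shapes and coefficients of `m` and `n` are hypothetical and deliberately simpler than the splines used in the article.

```python
import math


def m(i):
    """Hypothetical nonlinear functional form of the EDI (knot at EDI = 0)."""
    return 0.05 * i - 0.004 * max(i, 0.0) ** 2


def n(t):
    """Hypothetical time-dependent coefficient of the EDI (t in years)."""
    return 0.01 * (t - 1.0)


def excess_hazard_ratio(i1, i0, t):
    """EHR comparing EDI = i1 with EDI = i0 at time t since diagnosis."""
    return math.exp(m(i1) - m(i0) + n(t) * (i1 - i0))


if __name__ == "__main__":
    for years in (0.5, 1.0, 5.0):
        print(years, round(excess_hazard_ratio(4.0, 0.0, years), 3))
```

When n(t) is identically zero the EHR no longer depends on time, and when m is linear the log EHR is proportional to the difference i1 − i0, which corresponds to the simplest (linear, time-constant) case.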

Finally, because the estimated standard deviation of the random effect per se is difficult to interpret, we summarized the “importance” of the cluster level on the EMH using the median excess hazard ratio (MEHR).Citation35 This value reflects the influence of the cluster context as a whole, thus measuring the “general contextual effect”.Citation17,Citation35 The MEHR corresponds to the median relative change in the EMH when comparing identical subjects from two randomly selected different clusters that are ordered by risk.Citation35
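For readers who want to relate the MEHR to the standard deviation σ of the random effect, the sketch below assumes that the MEHR has the same closed form as the median odds ratio and median hazard ratio for a normally distributed random effect, namely exp(√2 · σ · Φ⁻¹(0.75)). This formula is an assumption stated here for illustration and is not quoted from the article; the function names are likewise hypothetical.

```python
import math
from statistics import NormalDist

# 75th percentile of the standard normal distribution (about 0.6745).
Z75 = NormalDist().inv_cdf(0.75)


def mehr(sigma):
    """Median excess hazard ratio for a normal random effect with SD sigma.

    Assumed closed form, by analogy with the median odds ratio.
    """
    return math.exp(math.sqrt(2.0) * sigma * Z75)


def sigma_from_mehr(value):
    """Inverse transformation: SD of the random effect from an MEHR."""
    return math.log(value) / (math.sqrt(2.0) * Z75)


if __name__ == "__main__":
    print(round(mehr(0.25), 3))
    print(round(sigma_from_mehr(1.255), 3))
```

Under this assumed form, an MEHR of 1 corresponds to no between-cluster variability (σ = 0), and larger values indicate a stronger general contextual effect.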

The analysis was conducted separately in men and women and used the iterative model-building strategy recommended by Wynant and Abrahamowicz.Citation36 Starting with the “full model”, this strategy eliminates spurious time-dependent and nonlinear EHR functions of the three variables using the likelihood ratio test and 0.05 as significance threshold. This led us to retain a final model for each sex-cancer couple. However, unlike the original proposal,Citation36 we kept by default the simplest EHR (ie, linear and time-constant) for each of the three variables.

To implement the advocated statistical methods, we developed a specific package named mexhaz (version 1.1), which runs on R software (version 3.2.0). Both the software and the package may be freely downloaded from the CRAN repository (https://cran.r-project.org/).

Indicators produced

For each sex-cancer couple, we predicted from the final model the age-standardized net survival (ASNS) at 1, 5, and 10 years after diagnosis per deprivation quintile of the French population using the International Cancer Survival Standard weights.Citation37 We used the delta method to derive the 95% confidence intervals (CIs) for the ASNSs assuming the normality of the log of the cumulative excess hazard.
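The two ingredients mentioned above can be sketched as follows in Python. The first function builds a confidence interval for a net survival by assuming normality of the log cumulative excess hazard and back-transforming; the standard error of that log cumulative hazard is taken as an input here (in practice it would come from the fitted model through the delta method). The second function is a plain weighted average over age groups. Both function names and all numerical inputs are hypothetical, and the exact computations performed in the article may differ.

```python
import math


def net_survival_ci(cum_excess_hazard, se_log_cum_hazard, z=1.96):
    """Net survival with a CI built on the log cumulative excess hazard scale."""
    log_cum = math.log(cum_excess_hazard)
    point = math.exp(-cum_excess_hazard)
    # A larger cumulative hazard gives a lower survival, hence the swap.
    lower = math.exp(-math.exp(log_cum + z * se_log_cum_hazard))
    upper = math.exp(-math.exp(log_cum - z * se_log_cum_hazard))
    return point, lower, upper


def age_standardized(survival_by_age_group, weights):
    """Weighted average of age-specific net survivals (weights need not sum to 1)."""
    total = sum(weights)
    return sum(w * s for w, s in zip(weights, survival_by_age_group)) / total


if __name__ == "__main__":
    print(net_survival_ci(0.5, 0.1))
    print(age_standardized([0.8, 0.6], [1.0, 3.0]))
```

Working on the log cumulative hazard scale keeps the interval bounds within (0, 1), which a symmetric interval built directly on the survival scale would not guarantee.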

The change in the EMH over the time elapsed since diagnosis was illustrated for three values of age and three values of the EDI: the 10th, 50th, and 90th percentiles of each variable distribution observed in each sex-cancer couple.

When the EDI was retained in the final model with a time-constant coefficient and a linear functional form, we reported the EHR for 1-unit increase of the EDI with its 95% CI. When the EDI was retained in the final model with a time-constant coefficient and a nonlinear functional form, we plotted the EHR vs the EDI values. When the EDI was retained in the final model with a time-dependent coefficient, we plotted the EHR vs the EDI values at various times after diagnosis. Because the sample size was usually small in this work, we focused on the effect size and its pattern rather than on the statistical significance when interpreting differences as a function of the EDI.

For each sex-cancer couple, we calculated the MEHR with and without adjustment on the EDI from the final model to compare the general contextual effect on the EMH.

Results

Data description

Tables S1 and S2 display the number of cases and deaths over 10 years after diagnosis. The highest numbers of deaths were found in deprivation quintiles Q4 and Q5 that group the most deprived people. These deaths represent almost 50% of all events in most cancers (Tables S1 and S2). A few sex-cancer couples were not analyzed because of the low number of deaths (<300 in each of esophagus, liver, and larynx cancers in women; breast cancer in men; and thyroid cancer in men and women).

Deprivation (EDI)

A constant-in-time EHR of the EDI with a linear functional form was retained in most cancer sites, except lip–oral cavity–pharynx (LOCP; nonlinear EHR in both sexes), stomach (time-dependent EHR), pancreas (nonlinear and time-dependent EHR) in men, and cervix uteri (nonlinear and time-dependent EHR; Table S3).

Five-year ASNS according to EDI

In men, a substantial difference in 5-year ASNS was seen between deprivation quintiles Q1 (the least deprived) and Q5 (the most deprived) regarding LOCP cancers (41%; 95% CI: [38;43] vs 29% [27;31]). A similar difference was seen regarding skin melanoma (87% [84;89] vs 76% [72;80]; Table 1). The difference in 5-year ASNS between Q1 and Q5 was nearly 7% for colon–rectum and bladder cancers, 6% for kidney, 5% for prostate cancer, 4% for lung and liver cancers, 3% for stomach and larynx cancers, and ≤2% for esophagus and pancreas cancers. Tables S4 and S5 show the results of 1- and 10-year ASNS by deprivation quintile. For pancreas cancer, the absence of an impact of the EDI on the 5-year ASNS contrasts greatly with the substantial difference observed in 1-year ASNS (36% [33;40] in Q1 vs 25% [22;28] in Q5) (Table S4). This is due to a special time-dependent EHR of the EDI that we explain later.

Table 1 Age-standardized 5-year net survivals by cancer and deprivation quintiles (Q1–Q5, from the less to the more deprived) and EHRs for 1-unit increase of the EDI, in men and women, with their 95% confidence intervals

In women, as in men, a substantial difference in 5-year ASNS was observed between Q1 and Q5 regarding LOCP cancers, 55% [49;62] vs 43% [39;47], with a higher predicted ASNS for Q2 (60% [56;64]) than Q1. The difference in 5-year ASNS between Q1 and Q5 was around 6% for bladder cancer; 4% for breast and melanoma cancers; 3% for colon–rectum, lung, and ovary cancers; and ≤2% for CNS, pancreas, and stomach cancers. In contrast, for cervix uteri, the 5-year ASNS was lower in the less deprived women in comparison with the most deprived: 55% [51;59] for Q1 vs 64%–66% for the other quintiles. This pattern was also found for 1- and 10-year ASNS (Tables S4 and S5) for that cancer site.

Changes over time since diagnosis of the EMH according to EDI and age, and complementary illustrations of the relationship between EDI and EMH

The changes of the EMH over the time elapsed since diagnosis are given for three values of age and three values of EDI (the 10th, 50th, and 90th percentiles) for 1) LOCP in men, LOCP in women, and melanoma in men (Figure 1); 2) pancreas and stomach in men and cervix uteri (Figure 2); and 3) all other cancer sites (Figures S1–S4). Marked differences were seen by age at diagnosis; the EMHs were higher in old than in young patients, especially during the first year(s) after diagnosis. Changes of the EMH over time since diagnosis illustrate how and when the EDI impact takes place across the follow-up at specific ages at diagnosis and complement the previously given net survival results. For example, the graphs illustrate the strong association between the EDI and the EMH for LOCP cancer in both sexes: the curve is always higher in deprived people. A quick look at the graphs might give the false impression that the EHR of the EDI depends on time (see the middle box in Figure 1, where the curves are not parallel for LOCP in women aged 61.5 years). This is because proportional hazards yield parallel curves only on a logarithmic scale, whereas the graphs use an arithmetic scale. On an arithmetic scale, a time-constant EHR of 2, for example, will display a larger difference between hazards when the baseline hazard is high rather than low.
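The point about scales can be checked with a small numerical example in Python. With a time-constant EHR of 2, the gap between the two hazards on the arithmetic scale follows the baseline hazard, whereas the gap on the log scale is always log(2). The baseline values used below are invented for illustration.

```python
import math


def hazard_gap(baseline, ehr):
    """Difference between the two hazards on the arithmetic scale."""
    return baseline * ehr - baseline


def log_hazard_gap(baseline, ehr):
    """Difference between the two hazards on the logarithmic scale."""
    return math.log(baseline * ehr) - math.log(baseline)


if __name__ == "__main__":
    for baseline in (0.5, 0.05):  # high early hazard vs low late hazard
        print(baseline, hazard_gap(baseline, 2.0), log_hazard_gap(baseline, 2.0))
```

The curves therefore look closer together where the baseline hazard is low even though the ratio between them has not changed.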

Figure 1 Changes over time since diagnosis of the excess mortality hazard for the 10th, 50th, and 90th percentiles of the age distribution (left, middle, and right column, respectively) and for the 10th, 50th, and 90th percentiles of the EDI distribution (curves with black circles, red triangles, and green crosses, respectively) regarding LOCP in men and women, and melanoma in men; patients diagnosed in 2010.

Abbreviations: EDI, European Deprivation Index; LOCP, lip–oral cavity–pharynx.

Figure 2 Changes over time since diagnosis of the excess mortality hazard for the 10th, 50th, and 90th percentiles of the age distribution (left, middle, and right column, respectively) and for the 10th, 50th, and 90th percentiles of the EDI distribution (curves with black circles, red triangles, and green crosses, respectively) regarding stomach and pancreas cancers in men, and cervix uteri; patients diagnosed in 2010.

Abbreviation: EDI, European Deprivation Index.

For LOCP cancers in both sexes, the model-building strategy retained a nonlinear functional form (though time-constant) for the log-EHR of the EDI (Table S4). In men, the EHR increased according to EDI values but then plateaued in the more deprived people (Figure 3A); however, in women, a plateau is seen in both the least and the most deprived people (Figure 3B).

Figure 3 Excess hazard ratio of the EDI on lip–oral cavity–pharynx cancer in men (A) and women (B) with 95% confidence intervals (shaded area).

Notes: We limited the EDI values on the x-axis to the 5th and 95th percentiles of the observed EDI distribution in the sex-cancer couple. Rug plots indicate the locations of the observed EDI values.
Abbreviation: EDI, European Deprivation Index.

We also observed a substantial association between the EDI and the EMH in melanoma in men (bottom plots of Figure 1): the retained EHR of EDI was constant in time with a linear functional form (Table S3).

For stomach cancer in men, we observed higher EMHs in deprived patients starting from 5 years after diagnosis (Figure 2, upper plots), and thus, weak differences between Q1 and Q5 regarding 1- and 5-year ASNS (Table 1 and Table S4) but a substantial difference (11%) regarding 10-year ASNS (Table S5). Figure 4A shows the time-dependent EHR of the EDI in stomach cancer, especially a substantial impact in late follow-up (5 years), even if the EMHs are quite low after 5 years (Figure 2).

Figure 4 Excess hazard ratio of the EDI at different times after diagnosis for stomach (A) and pancreas (B) cancers in men and cervix uteri (C) cancer in women.

Notes: We limited the EDI values on the x-axis to the 5th and 95th percentiles of the observed EDI distribution in the sex-cancer couple. Rug plots indicate the locations of the observed EDI values.
Abbreviation: EDI, European Deprivation Index.

For pancreas cancer in men, we observed a very complex pattern associated with the EDI, especially a lower EMH within the first year in the less deprived patients vs other patients and, in contrast, a lower EMH between years 1 and 4 in deprived vs less deprived patients. Therefore, the impact of deprivation on net survival was high over the first year after diagnosis and resulted in a substantial difference in 1-year ASNS between the less deprived and the other patients (Table S4). This difference shrank at 5 years because of the reverse association observed after 1 year (Figure 2). At 6 months, the EHR is <1 at small EDI values (ie, in the less deprived patients) and ~1 at other values. At 3 years, the EHR is slightly >1 in the less deprived patients and slightly <1 in the more deprived (Figure 4B). At 5 years, the EHR should be interpreted with caution because the prognosis of pancreas cancer at 5 years is rather poor, and thus, the number of patients still at risk is rather low.

Finally, for cervix uteri cancer, the EMH was higher in the less deprived people than in people with a median EDI whatever the time since diagnosis and the age at diagnosis (Figure 2). Therefore, the 1-, 5-, and 10-year ASNS was lower in the less deprived people than in others (Tables 1, S4, and S5). This corresponds to very complex nonlinear and time-dependent EHRs of the EDI (Figure 4C), the main information being the U-shape of the curves (ie, EHRs >1 were observed in the least and the most deprived people).

General contextual effect

The MEHRs with and without adjustment on the EDI are given in Table S6. For LOCP cancers in men, the median increase in the EMH between similar patients from IRIS with a high vs a low excess mortality was 25.5% before adjustment on the EDI (MEHR=1.255) and 21.4% after adjustment (MEHR=1.214).Citation35 The figures in women were also substantial: the median increase in the EMH was 23.8% before adjustment vs 8.1% after adjustment. This reveals an important general contextual effect for LOCP cancers; the EDI seems to explain an important part of EMH variability between IRIS, especially in women. We also observed an important decrease (before vs after adjustment on the EDI) of the MEHR for prostate, melanoma, and pancreas cancers in men (Table S6).

Discussion

In an international context of increasing socioeconomic inequalities,Citation38 describing and quantifying the association between socioeconomic inequalities and the excess cancer-related mortality hazard is important. Here, we used a strategy able to deal with specific methodological requirements: the use of a relevant measure of deprivation and a relevant mortality indicator (the EMH) estimated using a flexible regression model able to deal with nonlinear and time-dependent associations. The approach should account for the fact that individuals within a cluster share similar characteristics and should also allow summarizing the “importance” of the cluster level on the EMH. We applied this approach to 17 solid tumors diagnosed in a specific area of France and followed up over 10 years after diagnosis to investigate the change over time of the excess mortality by age and socioeconomic level. We summarize the recommendations we believe important to describe the association between socioeconomic deprivation and the EMH (Table 2).

Table 2 Summary of the guidelines for describing the association between socioeconomic inequalities and cancer survival

Using population-based cancer registry data ensures depicting the full picture of cancer survival inequalities. For decades, the notifications of cancer cases have come from many different sources (public and private pathology laboratories and hospital discharge databases as well as databases of the National Health System). Even if the number of data sources has dramatically increased since 1997, it was to collect further information on cancer cases such as treatment, thus not affecting the core of the cancer registry data and their exhaustiveness. For these reasons, we do not suspect any differential ascertainment over the study time period or between areas of residence or individual and area-level socioeconomic determinants. We used the EDI to quantify the deprivation as this index was built to be reproducible in European countries.Citation18 We assumed that 1) the EDI assigned to each IRIS remains constant from 1997 to 2010, and 2) the patient’s deprivation corresponds to the EDI measured at the time of diagnosis (no misclassification). We considered these assumptions reasonable because 1) the crude level of the EDI has little significance per se: it is more the ranking of each IRIS across the overall distribution which is of interest and this ranking is less influenced by time, and 2) the number of patients moving after the diagnosis of cancer, which can be seen as a misclassification problem, should be low for different reasons (access to cancer treatment, preservation of social network, etc). Bryère et al showed that the bias of such misclassification on the association between deprivation and cancer incidence was minimal in their study context.Citation39 However, more research should be conducted in the context of deprivation and cancer survival.

We recommend using flexible parametric regression models and underline the importance of examining the changes of the EMH over time since diagnosis together with the net survival ( and ) and the EHRs ( and ); this ensures relevant and complementary clinical information.Citation40Citation43 Indeed, at a given time t, the probability of net survival is a cumulative measure up to time t, whereas the EMH gives an instantaneous picture of what happens specifically at time t. It quantifies the instantaneous rate at which subjects experience an excess death (given they survived up to t) and, being a rate, the EMH may be >1. When the EMH is low (say <0.1) and practically constant over the year, its value is very close to the annual probability of death from the disease. With higher values, a back-transformation on the probability scale (using the classical relationship between hazard and survival) may be advantageous for clinical interpretation because it provides a conditional probability.
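The back-transformation mentioned above can be illustrated with a short Python sketch. Assuming the EMH stays constant at a value h over one year, the conditional probability of dying from the disease during that year is 1 − exp(−h). The function name is chosen here for illustration.

```python
import math


def annual_death_probability(hazard):
    """Probability of an excess death within 1 year, given a constant hazard."""
    return 1.0 - math.exp(-hazard)


if __name__ == "__main__":
    for h in (0.05, 0.1, 0.5, 1.5):
        print(h, round(annual_death_probability(h), 4))
```

For a hazard of 0.1 the probability is about 0.095, very close to the hazard itself; for a hazard of 1.5 the probability is about 0.78, which shows why a rate above 1 is not a contradiction.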

Caution should be taken when interpreting the changes of the EMH over time because its decrease in a population (“marginal” EMH) could be due either to true decreases of individual EMHs or to a “selection effect” over time.Citation44 For example, when a population includes a mix of 1) patients with localized cancer stages and a low, constant-in-time EMH and 2) patients with advanced stages and a high, constant-in-time EMH, the analysis of this population as a whole (in the absence of information on stage) will estimate a “marginal” EMH that decreases with time. The more “frail” individuals (with the higher hazards) will die early, whereas the more “robust” individuals (with the lower hazards) will stay at risk (are “not selected to die”): the marginal EMH will then decrease and approach the EMH of the more robust subjects.Citation44 Nevertheless, the ability to estimate and depict those quantities (EMH and EHR) over time using flexible functional forms is an important advantage of our proposed methodology over a simpler model that either restricts the shape of the baseline hazard (eg, the monotonic hazard of the Weibull distribution) or assumes proportional hazards. As an illustration, we fitted a simple model without a random effect, assuming a Weibull distribution and a linear and proportional EHR for each prognostic factor. We applied this simple model to the LOCP cancer in men and in women, and to pancreas cancer in men. In LOCP cancers, this simple model would identify neither the plateau of the EHR for the most deprived men nor the two plateaus for the less deprived and the most deprived women (). From this simple model, the estimated EHR comparing women with EDI=4 to women with EDI=0 was 1.21 [1.09;1.34], compared to 1.50 [1.19;1.90] with our approach. Neglecting the time-dependent effect of the EDI for pancreas cancers with the simple model would also lead to a substantial oversimplification, showing no evidence of an association between the EDI and the EMH (EHR for a 1-unit increase of the EDI=1.00 [0.98;1.022]), compared to the complex time-varying association found with our approach ().
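The selection effect described above can be reproduced with a small numerical sketch. It assumes a population made of only two subgroups, each with a constant hazard; the proportions and hazard values are hypothetical and chosen only to make the pattern visible. The marginal hazard is the average of the two hazards weighted by each subgroup's share among the subjects still at risk.

```python
import math


def marginal_hazard(t: float, p_robust: float, h_robust: float, h_frail: float) -> float:
    """Marginal hazard at time t for a mixture of two constant-hazard subgroups.

    p_robust is the proportion of 'robust' subjects at diagnosis (t = 0);
    h_robust and h_frail are the constant individual hazards (per year).
    """
    at_risk_robust = p_robust * math.exp(-h_robust * t)
    at_risk_frail = (1.0 - p_robust) * math.exp(-h_frail * t)
    return (h_robust * at_risk_robust + h_frail * at_risk_frail) / (
        at_risk_robust + at_risk_frail
    )


if __name__ == "__main__":
    # Hypothetical mix: half localized (hazard 0.05), half advanced (hazard 1.0)
    for t in (0, 1, 2, 5, 10):
        print(f"t={t:>2} years  marginal hazard={marginal_hazard(t, 0.5, 0.05, 1.0):.3f}")
```

Although no individual hazard changes in this sketch, the marginal hazard falls from the average of the two hazards toward the hazard of the robust subgroup as the frail subjects die.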

We advocated the use of a model-building strategy to eliminate spurious time-dependent and nonlinear EHR functions from a flexible regression model. We used the one proposed by Wynant and Abrahamowicz,Citation36 but an alternative model-building strategy could be used, such as the one proposed by Royston and Sauerbrei.Citation45 However, the development of algorithms for model building is still an active area of statistical research, and studies comparing the ability of model-building strategies to eliminate spurious time-dependent and nonlinear EHR functions would be useful to guide analysts. Whatever the choice of the model-building strategy, fitting regression models requires observing enough events to provide reliable estimates, and this may be an issue in small-sample studies or when studying cancers with a very good prognosis. In our work, we did not analyze some sex-cancer couples because the number of observed events was insufficient for fitting the “full model” according to the “rule” of observing at least 10 events per parameter,Citation46 even though this “rule” is still debatable.Citation47
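The events-per-parameter check can be written as a one-line screening step. The sketch below is only an illustration: the function name, the default threshold argument, and the event and parameter counts are hypothetical, and the threshold of 10 is the debatable “rule” mentioned above, not a guarantee of reliable estimates.

```python
def has_enough_events(n_events: int, n_parameters: int, min_events_per_parameter: float = 10.0) -> bool:
    """Return True when the observed events reach the chosen events-per-parameter threshold."""
    if n_parameters <= 0:
        raise ValueError("n_parameters must be positive")
    return n_events / n_parameters >= min_events_per_parameter


if __name__ == "__main__":
    # Hypothetical full model with 25 parameters
    print(has_enough_events(n_events=310, n_parameters=25))  # 12.4 events per parameter
    print(has_enough_events(n_events=180, n_parameters=25))  # 7.2 events per parameter
```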

We found evidence of an association (linear and constant-in-time) between the EDI and the EMH in colon–rectum, lung, melanoma, and prostate cancers in men as well as in breast cancer in women, with lower survival in the most deprived. We also found a substantial deprivation gap in LOCP cancers in both sexes with >10% differences in 5-year ASNS between deprivation Q1 and Q5. The main drivers of LOCP cancer are alcohol and tobacco consumption, and both are associated with other comorbidities; this limits the treatment possibilities and leads to poor prognoses. In France, the prevalence of tobacco smoking in men or women is generally higher among deprived than among the most affluent people, though women with management responsibilities seem more prone to smoking than others.Citation48,Citation49 Regarding alcohol consumption, the picture is more complex and differs with sex: excessive alcohol consumption is more frequent among women with management responsibilities vs other women but affects both extremes of the deprivation scale in men.Citation48 In addition, the probability of alcohol avoidance is quite high among deprived people.Citation48 These observations are in line with the patterns of the EHR of the EDI ().

For stomach cancer in men, deprived patients were found to be exposed to a higher excess mortality at 5 years after diagnosis vs less deprived patients, whereas the EDI plays no role at 1 or 3 years after diagnosis (). This may be due to 1) more comorbidities among deprived patients that may preclude the recourse to the best treatment strategies and lead to higher risks of relapse in the long term and/or 2) lower patient adherence to cancer follow-up among deprived patients. For cervix uteri cancer, we showed a higher excess mortality among the less deprived patients (): it may be linked to a higher participation in cervical screening among the less deprived subjects,Citation50 which would eliminate a higher number of curable precancerous lesions in affluent than in deprived people.

The interpretation of such relationships would benefit from additional information on cancer stage at diagnosis and comorbidities. Such data were not available for the present study, but French registries have started the systematic collection of stage at diagnosis. Another limitation of the study is the lack of deprivation-specific expected mortality rates in France. Therefore, the use of the general-population mortality as expected mortality rate overestimates the excess hazard in the more deprived people (because their expected mortality is usually higher than the “average” mortality in the general population) and underestimates it in the less deprived ones. This may amplify the apparent impact of the EDI and highlights the urgent need to produce deprivation-specific life tables in France.
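The direction of this bias can be checked with simple arithmetic, because the excess hazard is the observed hazard minus the expected hazard. In the sketch below, all hazard values are hypothetical and serve only to show the mechanism; they are not estimates from the present data.

```python
def excess_hazard(observed: float, expected: float) -> float:
    """Excess hazard = observed (all-cause) hazard minus expected (other-cause) hazard."""
    return observed - expected


# Hypothetical yearly hazards (illustration only)
GENERAL_POPULATION_EXPECTED = 0.03
deprived = {"observed": 0.30, "true_expected": 0.04}   # expected mortality above average
affluent = {"observed": 0.20, "true_expected": 0.02}   # expected mortality below average

# Excess hazards with deprivation-specific life tables (unavailable in practice)
true_deprived = excess_hazard(deprived["observed"], deprived["true_expected"])
true_affluent = excess_hazard(affluent["observed"], affluent["true_expected"])

# Excess hazards with the general-population life table
est_deprived = excess_hazard(deprived["observed"], GENERAL_POPULATION_EXPECTED)
est_affluent = excess_hazard(affluent["observed"], GENERAL_POPULATION_EXPECTED)

if __name__ == "__main__":
    print(f"True excess hazard ratio:      {true_deprived / true_affluent:.2f}")
    print(f"Estimated excess hazard ratio: {est_deprived / est_affluent:.2f}")
```

In this hypothetical setting, the general-population life table inflates the excess hazard of the deprived group, deflates that of the affluent group, and therefore widens the ratio between them.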

In the present article, we predicted the ASNSs from the fitted regression model and obtained the ASNSs even with sparse data because model-predicted NSs can be obtained after the date of the last event in a specific stratum (which is another advantage of our proposed methodology over using only nonparametric estimates). However, these predictions rely on the assumption that the regression model is correctly specified. For each sex-cancer couple, we checked the goodness of fit of the model by comparing the model-based ASNS with the nonparametric ASNS as given by the Pohar-Perme estimatorCitation24 for each deprivation quintile and each period of diagnosis ([1997–2000], [2001–2005], [2006–2010], and all periods combined). Comparing the 5-year ASNSs showed good accuracy of the model-based NS predictions (Figure S5).
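As a reminder of how an age-standardized figure is assembled, the sketch below computes an ASNS as a weighted average of age-specific net survival probabilities. The weights and the age-specific values are hypothetical; they are not the standard weights nor the model predictions used in the study.

```python
from typing import Sequence


def age_standardized_net_survival(net_survival: Sequence[float], weights: Sequence[float]) -> float:
    """Weighted average of age-specific net survival, with weights summing to 1."""
    if len(net_survival) != len(weights):
        raise ValueError("net_survival and weights must have the same length")
    if abs(sum(weights) - 1.0) > 1e-9:
        raise ValueError("weights must sum to 1")
    return sum(ns * w for ns, w in zip(net_survival, weights))


if __name__ == "__main__":
    # Hypothetical 5-year net survival in five age groups (youngest to oldest)
    ns_by_age = [0.60, 0.55, 0.45, 0.35, 0.25]
    # Hypothetical standard-population weights (illustration only)
    std_weights = [0.10, 0.20, 0.30, 0.25, 0.15]
    print(f"ASNS = {age_standardized_net_survival(ns_by_age, std_weights):.3f}")
```

Because the same weights are applied to every deprivation quintile, differences in ASNS between quintiles are not driven by differences in their age structures.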

Quantities that measure between- and within-cluster variability may help in interpreting the results. We extended the median hazard ratio proposed by Austin and MerloCitation35 to our context of EMH to reach a better understanding of the impact of within-IRIS clustering on the EMH. According to these authors, one would additionally compare the MEHR to the estimated EHR of each prognostic factor. However, in our final explanatory model, we rarely retained a single parameter for each prognostic factor, which makes such a comparison impossible. So, though the MEHR has the merit of simplicity, an interesting perspective would be to extend the approach proposed by Oliveira et al.Citation51 These authors derived an intraclass correlation coefficient for time-to-event regression models with a random effect (frailty). As in a linear model, this coefficient is defined as a ratio of variance components, which allows interpreting the coefficient as the proportion of the total variance due to the between-IRIS variability.Citation52 However, the approach proposed by Oliveira et al suits their specific models that include closed forms of marginal variance, which leads to closed forms of intraclass correlation.Citation51 Future work should check whether their approach may be applied to our model.
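For readers unfamiliar with the median hazard ratio, the sketch below shows how it is commonly computed from the variance of the random effect. It assumes a normally distributed random effect on the log (excess) hazard scale with variance σ², in which case the measure is exp(√(2σ²) × Φ⁻¹(0.75)), where Φ⁻¹ is the standard normal quantile function. The function name and the variance values are hypothetical.

```python
import math
from statistics import NormalDist


def median_excess_hazard_ratio(random_effect_variance: float) -> float:
    """Median (excess) hazard ratio for a normal random effect on the log-hazard scale.

    Interpretation: the median ratio of excess hazards between two identical
    patients drawn from two randomly chosen clusters, the higher-risk cluster
    being placed in the numerator.
    """
    if random_effect_variance < 0:
        raise ValueError("variance must be non-negative")
    z75 = NormalDist().inv_cdf(0.75)  # about 0.6745
    return math.exp(math.sqrt(2.0 * random_effect_variance) * z75)


if __name__ == "__main__":
    # Hypothetical between-cluster variances
    for variance in (0.0, 0.05, 0.25):
        print(f"variance={variance:.2f} -> MEHR={median_excess_hazard_ratio(variance):.3f}")
```

A value of 1 corresponds to no between-cluster variability; the larger the variance of the random effect, the further the measure departs from 1.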

Evaluating the interactions between prognostic factors is a further important step when describing the association between deprivation and cancer survival. For example, the interactions allow checking whether the EHR of the EDI is the same whatever the age at diagnosis. In an exploratory analysis that used Royston and Sauerbrei’s methodologyCitation45 to study interactions, our preliminary results suggested that such interactions do exist for some cancers (results not shown). These results still need to be validated with a robust statistical approach for testing interactions. Another important research area would be to extend this analysis by including socioeconomic measures defined at both the individual level and the area level. Indeed, with the EDI being an ecological variable,Citation53 the estimated effect of deprivation actually combines individual and contextual effects. Adjusting for both subject- and area-specific measures would allow disentangling individual from contextual effects of deprivation.Citation16,Citation54–Citation56

International comparisons of the association between socioeconomic deprivation and cancer survival are useful to understand differences between health care systems. Several studies have already reported poorer prognoses in deprived vs less deprived cancer patients.Citation2,Citation4–Citation6,Citation57 However, comparing the results is difficult because of distinct study designs, statistical analysis methods, and deprivation indexes. We hope the proposed approach will provide a methodological basis for such explorations. The use of the present approach with the EDI in other European countriesCitation20 will ease comparisons between European health care systems.

Author contributions

AB, LR, BR, OD, GL, and NB developed the concept and the design of the study. AB analyzed the data and drafted the manuscript. All authors interpreted the data, drafted the manuscript, revised it critically, and read and approved its final version.

Acknowledgments

We thank the IRESP (Institut de Recherche en Santé Publique) for supporting the study (grant for the ANGEFLEX study, Convention AAR2013-13 “Soutien à la recherche statistique et mathématique appliquée à la cancérologie”). This work was also partly supported by Cancer Research UK grant number C7923/A18525. The authors thank Jean Iwaz (Hospices Civils de Lyon) for the revision of the final draft of this manuscript.

Disclosure

The authors report no conflicts of interest in this work.

References

  • Kogevinas M, Porta M. Socioeconomic differences in cancer survival: a review of the evidence. In: Kogevinas M, Pearce N, Susser M, Boffetta P, editors. Social Inequalities and Cancer. IARC Scientific Publications No. 138. Lyon: International Agency for Research on Cancer; 1997:177–206.
  • Coleman MP, Rachet B, Woods LM. Trends and socioeconomic inequalities in cancer survival in England and Wales up to 2001. Br J Cancer. 2004;90(7):1367–1373. PMID: 15054456.
  • Woods LM, Rachet B, Coleman MP. Origins of socio-economic inequalities in cancer survival: a review. Ann Oncol. 2006;17(1):5–19. PMID: 16143594.
  • Ito Y, Nakaya T, Nakayama T. Socioeconomic inequalities in cancer survival: a population-based study of adult patients diagnosed in Osaka, Japan, during the period 1993–2004. Acta Oncol. 2014;53(10):1423–1433. PMID: 24865119.
  • Jansen L, Eberle A, Emrich K; GEKID Cancer Survival Working Group. Socioeconomic deprivation and cancer survival in Germany: an ecological analysis in 200 districts in Germany. Int J Cancer. 2014;134(12):2951–2960. PMID: 24259308.
  • Stanbury JF, Baade PD, Yu Y, Yu XQ. Cancer survival in New South Wales, Australia: socioeconomic disparities remain despite overall improvements. BMC Cancer. 2016;16:48. PMID: 26832359.
  • Subramanian SV. The relevance of multilevel statistical methods for identifying causal neighborhood effects. Soc Sci Med. 2004;58(10):1961–1967. PMID: 15020011.
  • Duchateau L, Janssen P. The Frailty Model. New York, NY: Springer; 2008.
  • Wienke A. Frailty Models in Survival Analysis. Boca Raton, FL: CRC Press; 2011.
  • Antunes L, Mendonca D, Bento MJ, Rachet B. No inequalities in survival from colorectal cancer by education and socioeconomic deprivation – a population-based study in the North Region of Portugal, 2000–2002. BMC Cancer. 2016;16:608. PMID: 27495309.
  • Kish JK, Yu M, Percy-Laurry A, Altekruse SF. Racial and ethnic disparities in cancer survival by neighborhood socioeconomic status in Surveillance, Epidemiology, and End Results (SEER) Registries. J Natl Cancer Inst Monogr. 2014;2014(49):236–243. PMID: 25417237.
  • Dialla PO, Arveux P, Ouedraogo S. Age-related socio-economic and geographic disparities in breast cancer stage at diagnosis: a population-based study. Eur J Public Health. 2015;25(6):966–972. PMID: 25829506.
  • Townsend P, Phillimore P, Beattie A. Health and Deprivation: Inequality and the North. London: Croom Helm; 1988.
  • Carstairs V, Morris R. Deprivation: explaining differences in mortality between Scotland and England and Wales. BMJ. 1989;299(6704):886–889. PMID: 2510878.
  • Woods LM, Rachet B, Coleman MP. Choice of geographic unit influences socioeconomic inequalities in breast cancer survival. Br J Cancer. 2005;92(7):1279–1282. PMID: 15798765.
  • Diez Roux AV. Investigating neighborhood and area effects on health. Am J Public Health. 2001;91(11):1783–1789. PMID: 11684601.
  • Diez Roux AV. A glossary for multilevel analysis. J Epidemiol Community Health. 2002;56(8):588–594. PMID: 12118049.
  • Pornet C, Delpierre C, Dejardin O. Construction of an adaptable European transnational ecological deprivation index: the French version. J Epidemiol Community Health. 2012;66(11):982–989. PMID: 22544918.
  • Townsend P. Deprivation. J Soc Policy. 1987;16(2):125–146.
  • Guillaume E, Pornet C, Dejardin O. Development of a cross-cultural deprivation index in five European countries. J Epidemiol Community Health. 2016;70(5):493–499. PMID: 26659762.
  • Moreno-Betancur M, Sadaoui H, Piffaretti C, Rey G. Survival analysis with multiple causes of death: extending the competing risks model. Epidemiology. 2017;28(1):12–19. PMID: 27362647.
  • Remontet L, Bossard N, Belot A, Estève J; French network of cancer registries FRANCIM. An overall strategy based on regression models to estimate relative survival and model the effects of prognostic factors in cancer survival studies. Stat Med. 2007;26(10):2214–2228. PMID: 16900570.
  • Estève J, Benhamou E, Croasdale M, Raymond L. Relative survival and the estimation of net survival: elements for further discussion. Stat Med. 1990;9(5):529–538. PMID: 2349404.
  • Pohar MP, Stare J, Estève J. On estimation in relative survival. Biometrics. 2012;68(1):113–120. PMID: 21689081.
  • Danieli C, Remontet L, Bossard N, Roche L, Belot A. Estimating net survival: the importance of allowing for informative censoring. Stat Med. 2012;31(8):775–786. PMID: 22281942.
  • Roche L, Danieli C, Belot A. Cancer net survival on registry data: use of the new unbiased Pohar-Perme estimator and magnitude of the bias with the classical methods. Int J Cancer. 2013;132(10):2359–2369. PMID: 22961565.
  • Bossard N, Velten M, Remontet L. Survival of cancer patients in France: a population-based study from the Association of French Cancer Registries (FRANCIM). Eur J Cancer. 2007;43(1):149–160. PMID: 17084622.
  • Mariotto AB, Noone AM, Howlader N. Cancer survival: an overview of measures, uses, and interpretation. J Natl Cancer Inst Monogr. 2014;2014(49):145–186. PMID: 25417231.
  • De Angelis R, Sant M, Coleman MP. Cancer survival in Europe 1999–2007 by country and age: results of EUROCARE-5 – a population-based study. Lancet Oncol. 2014;15(1):23–34. PMID: 24314615.
  • Allemani C, Weir HK, Carreira H; CONCORD Working Group. Global surveillance of cancer survival 1995–2009: analysis of individual data for 25,676,887 patients from 279 population-based registries in 67 countries (CONCORD-2). Lancet. 2015;385(9972):977–1010. PMID: 25467588.
  • Pohar Perme M, Esteve J, Rachet B. Analysing population-based cancer survival – settling the controversies. BMC Cancer. 2016;16(1):933. PMID: 27912732.
  • Quantin C, Abrahamowicz M, Moreau T. Variation over time of the effects of prognostic factors in a population-based study of colon cancer: comparison of statistical models. Am J Epidemiol. 1999;150(11):1188–1200. PMID: 10588079.
  • Moller H, Sandin F, Robinson D. Colorectal cancer survival in socioeconomic groups in England: variation is mainly in the short term after diagnosis. Eur J Cancer. 2012;48(1):46–53. PMID: 21676610.
  • Charvat H, Remontet L, Bossard N; CENSUR Working Survival Group. A multilevel excess hazard model to estimate net survival on hierarchical data allowing for non-linear and non-proportional effects of covariates. Stat Med. 2016;35(18):3066–3084. PMID: 26924122.
  • Austin PC, Wagner P, Merlo J. The median hazard ratio: a useful measure of variance and general contextual effects in multilevel survival analysis. Stat Med. 2017;36(6):928–938. PMID: 27885709.
  • Wynant W, Abrahamowicz M. Impact of the model-building strategy on inference about nonlinear and time-dependent covariate effects in survival analysis. Stat Med. 2014;33(19):3318–3337. PMID: 24757068.
  • Corazziari I, Quinn M, Capocaccia R. Standard cancer patient population for age standardising survival ratios. Eur J Cancer. 2004;40(15):2307–2316. PMID: 15454257.
  • European Commission. Why Socio-Economic Inequalities Increase? Facts and Policy Responses in Europe. 1st ed. Luxembourg: Publications Office of the European Union; 2010.
  • Bryere J, Pornet C, Dejardin O, Launay L, Guittet L, Launoy G. Correction of misclassification bias induced by the residential mobility in studies examining the link between socioeconomic environment and cancer incidence. Cancer Epidemiol. 2015;39(2):256–264. PMID: 25579981.
  • Binquet C, Abrahamowicz M, Astruc K, Faivre J, Bonithon-Kopp C, Quantin C. Flexible statistical models provided new insights into the role of quantitative prognostic factors for mortality in gastric cancer. J Clin Epidemiol. 2009;62(3):232–240. PMID: 19070464.
  • Corm S, Roche L, Micol JB. Changes in the dynamics of the excess mortality rate in chronic phase-chronic myeloid leukemia over 1990–2007: a population study. Blood. 2011;118(16):4331–4337. PMID: 21849485.
  • Hess KR, Levin VA. Getting more out of survival data by using the hazard function. Clin Cancer Res. 2014;20(6):1404–1409. PMID: 24501392.
  • Mounier M, Bossard N, Remontet L; EUROCARE-5 Working Group; CENSUR Working Survival Group. Changes in dynamics of excess mortality rates and net survival after diagnosis of follicular lymphoma or diffuse large B-cell lymphoma: comparison between European population-based data (EUROCARE-5). Lancet Haematol. 2015;2(11):e481–e491. PMID: 26686258.
  • Vaupel JW, Yashin AI. Heterogeneity’s ruses: some surprising effects of selection on population dynamics. Am Stat. 1985;39(3):176–185. PMID: 12267300.
  • Sauerbrei W, Royston P, Look M. A new proposal for multivariable modelling of time-varying effects in survival data based on fractional polynomial time-transformation. Biom J. 2007;49(3):453–473. PMID: 17623349.
  • Vittinghoff E, McCulloch CE. Relaxing the rule of ten events per variable in logistic and Cox regression. Am J Epidemiol. 2007;165(6):710–718. PMID: 17182981.
  • Heinze G, Wallisch C, Dunkler D. Variable selection – a review and recommendations for the practicing statistician. Biom J. Epub 2018 Jan 2.
  • Com-Ruelle L, Dourgnon P, Jusot F, Lengagne P. Les problèmes d’alcool en France: quelles sont les populations à risque? [The problems of alcohol in France: what are at-risk populations?]. Institut De Recherche Et Documentation En Economie De La Santé. 2008;129:1–6. French.
  • Guignard R, Beck F, Richard JB, Lermenier A, Wilquin JL, Nguyen-Thanh V. La consommation de tabac en France en 2014: caractéristiques et évolutions récentes [The use of tobacco in France in 2014: characteristics and recent developments]. Évolutions. 2015;31:1–6. French.
  • Duport N, Serra D, Goulard H, Bloch J. Which factors influence screening practices for female cancer in France? Rev Epidemiol Sante Publique. 2008;56(5):303–313. French. PMID: 18951740.
  • Oliveira IR, Molenberghs G, Demetrio CG, Dias CT, Giolo SR, Andrade MC. Quantifying intraclass correlations for count and time-to-event data. Biom J. 2016;58(4):852–867. PMID: 26899931.
  • Goldstein H, Browne W, Rasbash J. Partitioning variation in multilevel models. Understand Stat. 2002;1(4):223–231.
  • Bryere J, Pornet C, Copin N. Assessment of the ecological bias of seven aggregate social deprivation indices. BMC Public Health. 2017;17(1):86. PMID: 28095815.
  • Diez-Roux AV. Multilevel analysis in public health research. Annu Rev Public Health. 2000;21:171–192. PMID: 10884951.
  • Sloggett A, Young H, Grundy E. The association of cancer survival with four socioeconomic indicators: a longitudinal study of the older population of England and Wales 1981–2000. BMC Cancer. 2007;7:20. PMID: 17254357.
  • Singer S, Bartels M, Briest S. Socio-economic disparities in long-term cancer survival-10 year follow-up with individual patient data. Support Care Cancer. 2017;25(5):1391–1399. PMID: 27942934.
  • Skyrud KD, Bray F, Eriksen MT, Nilssen Y, Moller B. Regional variations in cancer survival: impact of tumour stage, socioeconomic status, comorbidity and type of treatment in Norway. Int J Cancer. 2016;138(9):2190–2200. PMID: 26679150.