Original Articles

A method for differentiating cancer prevalence according to health status, exemplified using a population-based sample of Italian colorectal cancer cases

Pages 294-302 | Received 13 Aug 2012, Accepted 22 Oct 2012, Published online: 10 Dec 2012

Abstract

Cancer prevalence is the proportion of a population diagnosed with cancer. We present a method for differentiating prevalence into the proportions expected to survive without relapse, die of cancer within a year, and die of cancer within 10 years or survive with relapse at the end of the 10th year. Material and methods. The method was applied to samples of colorectal cancer cases, randomly extracted from four Italian cancer registries (CRs). The CRs collected data on treatments, local relapses, distant relapses, and causes of death: 1) over the entire follow-up to 31 December 2007 for 601 cases diagnosed in 2002 (cohort approach); 2) over a single year (2007) for five cohorts of cases defined by year of diagnosis (from 1997 to 2001), alive at 1 January 2007 (total 298 cases). The cohorts were combined into a fictitious cohort with 10 years survival experience. For each year j after diagnosis the health status of cases alive at the beginning of j was estimated at the end of the 10th year. From these estimates the 10-year colorectal cancer prevalence was differentiated. Results. We estimated: 74.7% alive without relapse or not undergoing treatment at the end of 10 years; 8.1% had died of colorectal cancer within a year; 11.4% had died of colorectal cancer 1–10 years after diagnosis or had relapsed or were undergoing treatment at the end of the 10th year; and 5.8% had died of other causes. Conclusions. We have introduced a new method for estimating the healthcare and rehabilitation demands of cancer survivors based on CR data plus treatment and relapse data specifically collected for samples of cases archived by CRs.

Total cancer prevalence – the proportion of a population diagnosed with cancer at any one time – is a measure of the cancer burden in the population, usually estimated directly from population-based cancer registry (CR) data. Cancer prevalence includes several categories of survivors: those recently diagnosed and undergoing primary treatment, those in remission who may or may not be cured and may be receiving rehabilitation interventions, those receiving treatment for relapse, and those receiving end of life care [Citation1]. According to CR-based estimates from RARECARE, on 1 January 2003 there were 3566 prevalent cancer cases per 100 000 in the European Union, for a total prevalence of about 17.8 million [Citation2]. Due to population ageing with attendant increase in incidence, and also better survival, cancer prevalence is projected to increase in the future [Citation3].

Prevalence data can be subdivided according to year of diagnosis, to estimate prevalence by time from diagnosis [Citation4]. The Italian Association of Cancer Registries (AIRTUM) estimated that there were 2.2 million prevalent cancer cases in Italy on 1 January 2006, 21% of which had been diagnosed in the preceding two years, 22% in the preceding 2–5 years, 23% in the preceding 5–10 years, and 34% over 10 years previously [Citation5]. These prevalence categories correspond to groups with differing healthcare needs. However, for persons diagnosed five years and more before the prevalence date, health status is likely to vary greatly between individuals, and prevalence classification simply by time from diagnosis is unlikely to provide sufficiently detailed information on health status for healthcare planning [Citation1]. It is also possible to apply cure models to CR survival data to estimate the proportion of cured patients. This is the fraction of the total prevalence with the same life expectancy as the general population of the same age and sex (statistically the cured fraction is the limiting value of the cumulative relative survival function) [Citation6–8]. EUROCARE estimated that for all European cancer patients diagnosed 1988–1999, the proportion cured of their cancer varied from 38% (Poland) to 59% (France) in women, and from 21% (Poland) to 47% (Iceland) in men [Citation9]. The EUROCHIP-3 project [Citation10] suggested indicators to describe the rehabilitation status of persons with a cancer diagnosis at the population level in Europe: it emphasized the importance of collecting indicators making it possible to differentiate prevalent cases by clinical status (cases expected to die of cancer within a year, cases expected to die of cancer after more than a year, and cases expected to survive without relapse).
To do this, CR data have to be combined with information on treatment (or use of health service facilities) and follow-up (to identify recurrences, and second cancers) to produce estimates of what EUROCHIP-3 called differentiated prevalence. This has been shown to be feasible in the UK by linking hospital discharge data with CR data [Citation1], and in France [Citation11] and Italy [Citation12] by collecting data on recurrences and treatments from clinical records of samples of cases archived by CRs. The present study arises out of the CAREMORE (Cancer Registry Model on Rehabilitation) project [Citation13]. Its aim is to present a new method for estimating the differentiated prevalence of cancer patients diagnosed up to 10 years previously. To illustrate the method we use a representative sample of colorectal cancer cases obtained from four Italian population-based CRs. We collected clinical and rehabilitation data on sampled cases using the cohort and the period approaches. Using the cohort approach we considered 601 cases diagnosed in 2002 and investigated for cancer-related treatment and events continuously up to 31 December 2007. For the period approach we considered 298 cases (five separate cohorts defined by year of diagnosis, 1997–2001), followed for cancer-related treatment and events during 2007 only. The essence of our method is that we combine the data collected by the two approaches to construct a fictitious cohort on which we estimate 10-year experience, even though complete 10-year follow-up is lacking.

The CAREMORE project was approved by the Ethics Committee of the National Cancer Institute (Istituto Nazionale dei Tumori) of Milan.

Material and methods

Data collection

CAREMORE [Citation13] collected data on health and rehabilitation status subsequent to diagnosis for CR-archived cases of breast cancer, colon cancer, rectal cancer and lymphoma. We here consider colon and rectal cancer cases randomly sampled [Citation14] from the databases of the Italian CRs of Genoa and Reggio Emilia (northern Italy), and Sassari (Sardinia) and Ragusa (southern Italy). Genoa collected data on rectal cancer only; Sassari on colon cancer only; Reggio Emilia and Ragusa on both sites. All cases were primary cancers in adults (> 19 years).

Cases were identified by codes C18 (colon cancer) and C20 (rectal cancer) of the 10th revision of the International Classification of Diseases [Citation15]. Cases evident only from death certificates or diagnosed only at autopsy were excluded.

CR personnel assessed hospital discharge record databases and clinical records for the selected cases, abstracting date of diagnosis, treatments (surgery, chemotherapy, radiotherapy), vital status (including cause of death) and cancer-related events (local and distant relapse, second cancer). This information is not routinely available to CRs and had to be specifically abstracted from patients’ records, and checked, for the purposes of this study. Two experienced CR personnel examined data on cases that had died to distinguish cancer deaths from those due to other causes. Deaths due to colon cancer in patients with rectal cancer were considered related to rectal cancer, and vice versa.

Cohort and period approaches to data

We employed two methodological approaches, as illustrated in the Lexis diagrams of Figure 1: 1) Using the cohort approach (ABCD in Figure 1a) we collected clinical information on a cohort of cases diagnosed over one year (2002) and followed up for 4–6 years; 2) Using the period approach (EDVW in Figure 1a) we collected information on five separate cohorts of patients diagnosed with cancer 6, 7, 8, 9 or 10 years previously, alive at the beginning of 2007; we collected information on events and treatments for one year only (2007).

Figure 1. Lexis diagrams illustrating the methods used to combine data from the cohort and period approaches to produce a fictitious cohort on which differentiated prevalence was estimated. (a) Illustration of cohort (ABCD) and period (EDWV) approaches; (b) Construction of fictitious cohort (ABXY) combining data from the cohort and period approaches; (c) Estimation of differentiated prevalence from fictitious cohort.

For the cohort approach, each participating CR collected data from diagnosis up to 31 December 2007 for about 100 incident cases sampled from all incident cases diagnosed in 2002. In fact a total of 601 cases were considered for the cohort approach. For rectal cancer cases, if there were insufficient cases in 2002 in any CR, all incident cases in 2002 were included, and the remaining cases were randomly sampled from cases incident in 2003.

For the period approach, starting from a list of cases diagnosed between 1 January 1997 and 31 December 2001, and alive at 1 January 2007 (5–10 year prevalent cases) each CR randomly extracted about 30 cases for each sub-site and for each year of diagnosis (1997–2001), for a total of 298 cases in all. For these cases, information on hospital admissions and treatments during the year of interest only (1 January – 31 December 2007) was collected.

Construction of fictitious cohort

The data collected by the two approaches were combined to estimate the disease experience of a fictitious cohort up to 10 years from diagnosis (ABXY, Figure 1b).

Let Nj be the number of cases alive at the beginning of the kth interval (the 365-day interval between j and j + 1), where k = j + 1, so that k = 1…10 and j = 0…9. We can divide Nj into four groups:

(a) Dcj,j + 1: patients alive in j who died of their cancer before j + 1;

(b) Docj,j + 1: patients alive in j who died of other causes before j + 1;

(c) Ij,j + 1: patients alive in j considered ‘ill’ (defined below) on at least one day before j + 1;

(d) Hj, j + 1: patients alive in j with no evidence of colorectal cancer at any point before j + 1.

In the cohort approach, patients are considered ‘ill’: 1) during the entire first year (365 days) after diagnosis (i.e. H0,1 = 0); 2) during the interval k (for k = 2…5) if they receive surgical treatment, chemotherapy or radiotherapy in that interval; 3) during the interval k (for k = 2…5) if they experience an event in interval k or k − 1; 4) during the interval k (for k = 4, 5) if they receive treatment in interval k − 1 but not in interval k − 2. An event is defined as a local or distant relapse or death from colorectal cancer; a second cancer at a different site is not an event. The ideas behind these rules are that: 1) in the first year all cases are ill with their cancer; 2) if a case receives chemotherapy, radiotherapy or surgery without a follow-up event (local or distant relapse, death) the case is considered ‘ill’ for the 365-day interval in question; 3) if the case has a follow-up event he/she is considered ‘ill’ for the year (interval) of the event and the successive year; 4) if (cancer-related) treatment is given more than one year after the previous treatment it is inferred that the person has had an event even though this may not be specified.

In the period approach, patients are considered ‘ill’ in the interval k = 6…10 if they had treatment or an event (as defined previously) during the year considered (2007).
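The rules for classifying a case as ‘ill’ can be sketched in Python. This is only an illustrative reading, not the code used in the study: it assumes that rules 3 and 4 of the cohort approach refer to the preceding intervals (k − 1 and k − 2), and the function names and input structures are hypothetical.

```python
def ill_intervals_cohort(treated, event, last_interval=5):
    """Return the set of intervals k (1-based) in which a case counts as 'ill'.

    treated, event: sets of interval numbers in which cancer treatment or an
    event (local/distant relapse, colorectal cancer death) was recorded.
    Hypothetical inputs; intervals after a death are not handled here.
    """
    ill = {1}  # rule 1: every case is ill during the first year
    for k in range(2, last_interval + 1):
        if k in treated:                    # rule 2: treatment in interval k
            ill.add(k)
        if k in event or (k - 1) in event:  # rule 3 (assumed reading)
            ill.add(k)
        # rule 4 (assumed reading): treatment resumed after a treatment-free
        # interval is taken to imply an unrecorded event
        if k in (4, 5) and (k - 1) in treated and (k - 2) not in treated:
            ill.add(k)
    return ill


def ill_period(treated_in_year, event_in_year):
    """Period approach: ill if treated or with an event in the observed year."""
    return bool(treated_in_year or event_in_year)
```

For example, under these assumptions a case treated in intervals 1 and 3 with no recorded event would be counted as ill in intervals 1, 3 and 4.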

Thus, if N0 is the number of cases investigated by the cohort approach, the total number of cases Nj alive in each interval between j and j + 1 is given by: Nj = Ij−1,j + Hj−1,j with j = 1…9.
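The bookkeeping implied by the four groups and by the relation between successive Nj can be sketched as follows. The counts are made-up illustrative numbers, not study data, and the class and function names are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class IntervalCounts:
    """Counts for the annual interval between j and j + 1 (hypothetical container)."""
    died_cancer: int  # Dc(j, j+1)
    died_other: int   # Doc(j, j+1)
    ill: int          # I(j, j+1)
    healthy: int      # H(j, j+1)

    @property
    def alive_at_start(self) -> int:
        # N_j is the sum of the four mutually exclusive groups
        return self.died_cancer + self.died_other + self.ill + self.healthy

    @property
    def alive_at_end(self) -> int:
        # Survivors entering the next interval: I(j, j+1) + H(j, j+1)
        return self.ill + self.healthy


def check_chain(intervals):
    """True if each interval starts with exactly the survivors of the previous one."""
    return all(prev.alive_at_end == nxt.alive_at_start
               for prev, nxt in zip(intervals, intervals[1:]))


# Illustrative numbers only (not taken from the study tables)
toy = [
    IntervalCounts(died_cancer=20, died_other=5, ill=75, healthy=0),   # j = 0, H(0,1) = 0
    IntervalCounts(died_cancer=10, died_other=3, ill=22, healthy=40),  # j = 1
    IntervalCounts(died_cancer=4, died_other=2, ill=10, healthy=46),   # j = 2
]
```

Calling `check_chain(toy)` confirms that the toy counts respect the survivor relation between consecutive intervals.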

For j = 1…5, the Nj s are obtained from the cohort approach; for j = 6…9, the Nj s are obtained applying the experience of the period approach to the Nj s obtained from the cohort approach (EDVW in Figure 1).

Thus the health experience of our fictitious cohort is equivalent to that of the cohort of cases diagnosed in 2002 and followed to 2007, while from 2008 to 2012 it is assumed that the fictitious cohort's health experience is that experienced by the five cohorts of prevalent cases diagnosed in the years from 1997 to 2001 and investigated for recurrence and treatment in 2007 (6–10 years after diagnosis). As illustrated in Figure 1b, information from EDFG was applied to ABCD and used to estimate CDHI, and information from GFJK was applied to ABHI and used to estimate HILM and so on, so as to arrive at an estimate for the entire fictitious cohort ABXY.
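A minimal sketch of this chaining step, assuming that the one-year experience of each period cohort is summarized as four proportions and applied in order to the expected number of survivors. The function name, the data structure and the numbers are hypothetical; they are not the values of the study tables.

```python
def extend_with_period(n_start, period_proportions):
    """Apply one-year period proportions, in order, to the survivors of a cohort.

    n_start: expected number alive at the start of the first period interval.
    period_proportions: one dict per annual interval with keys 'dc', 'doc',
    'ill' and 'healthy' whose values sum to 1 (hypothetical structure).
    Returns one dict of expected counts per interval.
    """
    rows = []
    n_alive = float(n_start)
    for props in period_proportions:
        if abs(sum(props.values()) - 1.0) > 1e-9:
            raise ValueError("proportions for an interval must sum to 1")
        row = {key: n_alive * props[key] for key in ("dc", "doc", "ill", "healthy")}
        row["n"] = n_alive
        rows.append(row)
        # Only ill and healthy cases survive into the next interval
        n_alive = row["ill"] + row["healthy"]
    return rows


# Illustrative proportions only
toy_period = [
    {"dc": 0.02, "doc": 0.03, "ill": 0.05, "healthy": 0.90},
    {"dc": 0.00, "doc": 0.04, "ill": 0.01, "healthy": 0.95},
]
toy_rows = extend_with_period(300, toy_period)
```

Expected counts obtained in this way are generally fractional; how they are rounded for tabulation is a separate choice not addressed by this sketch.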

Estimation of differentiated prevalence

For each Nj with j: 0…9 it is now possible to define the health status from j up to 10 years:

(a) Dcj→j + 1: patients alive in j who died of their cancer between j and j + 1;

(b) Docj→10: patients alive in j who died of other causes between j and 10;

(c) Dcj+ 1→10 + I10: patients alive in j who died of their cancer between j + 1 and 10 or who were ill in the 10th year after diagnosis;

(d) Hj→10: patients alive in j with no evidence of colorectal cancer in the 10th year after diagnosis.

If we consider each individual Nj (j: 0…9) to have a health experience independent of that of the other Nj s, then we can use each Nj's health experience up to 10 years (from j to 10) and apply it to other cohorts alive in the specific year j after diagnosis (i.e. 10-year prevalence). This is illustrated in Figure 1c (for 1 January 2006, as the available data on prevalence in Italy have this index date [Citation5]). In this way, the average of the previous measures (weighted by Nj) can be used to differentiate the 10-year prevalence into disease-free prevalence (derived from Hj→10), end of life prevalence (from Dcj→j + 1) and relapse prevalence (from Dcj + 1→10 + I10). Furthermore, patients expected to die of causes unrelated to their cancer can be estimated (from Docj→10).
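The estimation step can be sketched as follows. This is one plausible reading of the definitions above, not the authors' code: the status of each Nj up to the end of follow-up is derived from the annual interval counts of the fictitious cohort, and the differentiated prevalence is the Nj-weighted average across j. The function names, dictionary keys and toy counts (three intervals instead of ten) are hypothetical.

```python
CATEGORIES = ("end_of_life", "other_causes", "relapse", "disease_free")


def health_status_to_end(intervals):
    """intervals: list of dicts with keys 'dc', 'doc', 'ill', 'healthy',
    one per annual interval j of the fictitious cohort."""
    last = intervals[-1]
    rows = []
    for j, current in enumerate(intervals):
        row = {
            # (a) cancer deaths between j and j + 1
            "end_of_life": current["dc"],
            # (b) deaths from other causes between j and the end
            "other_causes": sum(iv["doc"] for iv in intervals[j:]),
            # (c) later cancer deaths plus cases ill in the final year
            "relapse": sum(iv["dc"] for iv in intervals[j + 1:]) + last["ill"],
            # (d) cases with no evidence of disease in the final year
            "disease_free": last["healthy"],
        }
        row["n"] = sum(row[c] for c in CATEGORIES)  # equals N_j
        rows.append(row)
    return rows


def differentiated_prevalence(rows):
    """Average of the per-j distributions, weighted by N_j."""
    total = sum(r["n"] for r in rows)
    return {c: sum(r[c] for r in rows) / total for c in CATEGORIES}


# Illustrative counts only (three intervals, not study data)
toy_intervals = [
    {"dc": 20, "doc": 5, "ill": 75, "healthy": 0},
    {"dc": 10, "doc": 3, "ill": 22, "healthy": 40},
    {"dc": 4, "doc": 2, "ill": 10, "healthy": 46},
]
toy_status = health_status_to_end(toy_intervals)
toy_prevalence = differentiated_prevalence(toy_status)
```

Weighting by Nj in this way implicitly treats each year-of-diagnosis cohort in the prevalent population as having the same initial size, which is an assumption of the sketch.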

Results

The CRs were able to access information on treatment for over 98% of cases, on vital status for 99.8% of cases, and on cause of death for 97.5% of deaths.

Table I summarizes the information collected by CRs for the period approach. There were no colorectal cancer deaths in any interval for colon cancer cases. The initial lines of Tables IIa and IIb show, for colon and rectal cancer cases, respectively, data collected for the cohort approach. At the end of interval (4,5), 45.7% of colon cases (137/300) and 41.2% of rectal cases (124/301) were estimated alive without evidence of disease.

Table I. Health status distribution of colon and rectal cancer cases collected for the period approach. All ages and both sexes.

Table IIa. Colon cancer: Health status distribution of fictitious cohort by annual interval. All ages, both sexes.

Table IIb. Rectal cancer: Health status distribution of fictitious cohort by annual interval. All ages, both sexes.

Tables IIa and IIb show, for colon cancer and rectal cancer respectively, estimates of health status distribution for the fictitious cohort for annual intervals. Estimates were obtained combining information from the cohort and period approaches.

Tables IIIa and IIIb show, for colon cancer and rectal cancer respectively, health experience over 10 years in the various groups of cancer patients alive at the beginning of each interval. The last row, derived from the previous rows, shows 10-year prevalence by health status (differentiated prevalence).

Table IIIa. Colon cancer: Health status distribution of fictitious cohort at 10th year after diagnosis for each interval (j,10). All ages, both sexes.

Table IIIb. Rectal cancer: Health status distribution of fictitious cohort at 10th year after diagnosis by each interval (j,10). All ages, both sexes.

For example, from Table IIIb, three years after diagnosis, 7.2% of the 166 prevalent rectal cancer cases were estimated to have died of their cancer within a year, 7.0% were estimated to have died of other causes before the 10th year, 15.9% were estimated to have died of their cancer between the 4th and the 10th years after diagnosis or to be ill 10 years after diagnosis, while 69.9% were estimated to not have evidence of disease 10 years after diagnosis.

Table IV shows estimates of 10-year differentiated prevalence for colon and rectal cancer, for men and women separately, for cases less than and over 75 years (at latest follow-up), and for all ages and both sexes. Disease-free prevalence was higher for women than men and for patients less than 75 years of age.

Table IV. Ten-year differentiated prevalence estimates for colon and rectal cancer, by sex and age at end of follow-up.

From previously published data, 10-year colorectal cancer prevalence estimated in 2006 was 397/100 000 for Reggio Emilia, Genoa, Ragusa and Sassari, and 393/100 000 for Italy as a whole [Citation5]. Since the estimates for the areas covered by the four CRs subject of the present study are closely similar to the national estimate, we may tentatively apply our 10-year differentiated prevalence estimates (Table IV) to the whole of Italy (not forgetting the limitations of this extension due to the small size of the case sample studied). Thus from the 142 051 cases of colon cancer and 66 754 cases of rectal cancer prevalent in Italy at 1 January 2006 [Citation5] we derive the following differentiated prevalence estimates for all Italy: 1) 156 005 (74.7%) prevalent cases free of cancer-related treatment and cancer recurrence at the end of the 10th year after diagnosis (disease-free prevalence); 2) 16 863 (8.1%) deaths due to colorectal cancer within a year (end of life prevalence); 3) 23 762 (11.4%) prevalent cases that will die of colorectal cancer more than a year but before the 10th year after diagnosis, or will be undergoing treatment for colorectal cancer at the end of the 10th year after diagnosis (relapse prevalence); 4) 12 175 (5.8%) deaths from causes other than colorectal cancer by the 10th year after diagnosis.
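The arithmetic behind these national figures can be checked with a few lines of Python. The counts are those quoted in the text; the variable names are illustrative only.

```python
# Prevalent cases in Italy at 1 January 2006, as quoted in the text
prevalent_cases = {"colon": 142_051, "rectum": 66_754}
total = sum(prevalent_cases.values())

# Differentiated prevalence counts quoted in the text
estimates = {
    "disease_free": 156_005,
    "end_of_life": 16_863,
    "relapse": 23_762,
    "other_causes": 12_175,
}

# Share of each category in the total 10-year prevalence, in percent
shares = {name: round(100 * count / total, 1) for name, count in estimates.items()}

# The four categories should account for every prevalent case
categories_cover_total = sum(estimates.values()) == total
```

The four counts add up to the 208 805 prevalent colorectal cancer cases, and the shares reproduce the percentages reported in the text.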

Discussion

Cancer prevalence is useful for health planners as it indicates total cancer burden in a population at a given time. However, prevalence consists of groups of patients making different demands of health resources.

Previous studies have developed methods to differentiate cancer prevalence into sub-groups with similar health status and healthcare needs. Using the period approach, Maddams et al. analyzed the prevalent cancer cases in 2006 extracted from the English National CR database, finding that 85% of colorectal cancer cases were not hospitalized in 2006 [Citation1]. Gatta et al. used a cohort approach to estimate the proportions of fatal and recurrence-free cases by analysis of follow-up data from a representative sample of 278 colon cancer cases archived in a northern Italian CR [Citation12].

The present paper combined the cohort and the period approaches to data collection, to differentiate prevalent cases, over the 10 years following diagnosis, according to health status and healthcare needs, and not simply in terms of hospitalization. The method was applied to colorectal cancers, as previously [Citation12], but we investigated more cases (from four Italian CRs) and collected information on all cancer treatments as well as cancer-related events during follow-up. The advantage of this combined approach is that it enabled us to estimate events and treatments over 10 years without complete 10-year follow-up data being available. If we had used only the cohort approach, we would not have had real follow-up data until at least two/three years after the conclusion of the 10-year period, because this time is required for data collection and processing.

The combined approach also enabled us to subdivide the 10-year prevalent cases into groups characterized by health status and healthcare needs: 1) Disease-free prevalent cases not undergoing cancer treatment and without distant or local relapse; 2) End of life prevalent cases expected to die of colorectal cancer within a year and probably requiring terminal care and home assistance; 3) Relapsed cases requiring healthcare or rehabilitation, expected either to die of colorectal cancer more than a year, but less than 10 years, after diagnosis, or to be receiving treatment for colorectal cancer in the 10th year.

To provide indications of the reliability of our estimates, we compared them with previously available data. Thus, we estimated that 81% of prevalent cases (75% disease-free prevalence, plus 6% expected to die of other causes) did not require cancer hospitalization, and this is similar to the 85% reported in a UK study [Citation1]. Using only data from the cohort approach, we estimated that 45.7% of colon cancer and 41.2% of rectal cancer cases had no evidence of disease after five years of follow-up. In comparison, using completely different models [Citation6–8] a EUROCARE study estimated the cured fraction in Italy to be 45.8% for colon cancer and rectal cancer combined [Citation9]. We also estimated that 50% of cases (300/601, Tables IIa and IIb) were alive five years after diagnosis, and this is similar to the 48.1% observed five-year survival reported by AIRTUM in 2000–2004 for colorectal cancer, both sexes combined [Citation16]. Finally, 241 cases in our fictitious cohort were alive after 10 years of follow-up, and 61 had died of non-colorectal cancer causes (302 cases, 50.2% of total). This figure is not greatly different from the 54.7% 10-year relative survival estimated by AIRTUM for Italian colorectal cancers, both sexes, diagnosed from 2002 to 2004 [Citation16].
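As a quick consistency check, the proportions quoted in this comparison can be recomputed from the counts given in the text. The labels are illustrative only.

```python
# (numerator, denominator) pairs quoted in the text
quoted_counts = {
    "colon, no evidence of disease at 5 years": (137, 300),
    "rectum, no evidence of disease at 5 years": (124, 301),
    "all cases, alive at 5 years": (300, 601),
    "all cases, alive at 10 years or dead of other causes": (241 + 61, 601),
}

# Percentages rounded to one decimal place
percentages = {
    label: round(100 * numerator / denominator, 1)
    for label, (numerator, denominator) in quoted_counts.items()
}
```

The recomputed values agree with the percentages reported in the text, with 300/601 corresponding to the rounded figure of 50%.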

An interesting finding of our study () was that by eight years after diagnosis, prevalent cases no longer died of colorectal cancer, and (as is evident from and ) the proportion of ill cases had decreased rapidly to zero. The implication is that all prevalent cases alive 10 years after diagnosis should be considered disease-free, or cured of their cancer.

The main aim of the CAREMORE project was to evaluate the feasibility of collecting population-based information on the rehabilitation services used by cancer patients. The personnel of the CRs involved in CAREMORE were therefore mainly engaged in finding new data sources for rehabilitation services, so there were insufficient resources available to finance the collection of treatment/relapse data on larger series of cases for the present study. Due to this resource limitation we asked CRs to collect treatment/relapse information on equal numbers of colon and rectal cancer cases; it is for this reason that we analyzed colon and rectal cancers separately (since incidence and prevalence are greater for colon cancer, we should have had a larger cohort for colon cancer).

The small absolute number of cases is the main limitation of our study. In fact, the health status of the fictitious cohort 5–10 years after diagnosis was estimated on about 30 cases per year. Nevertheless, the 5–10 year colorectal cancer prevalence group was highly homogeneous, mostly consisting of patients cured of their disease (with low probability of recurrence and survival probability similar to that of the general population). This is in line with findings on larger populations [Citation17–19] and suggests that findings on our small sample of cases are representative of the entire population over the interval studied.

We note that our random sample of cases was not stratified by sex and age, so there might be differences between the cohort and period samples with respect to these variables, and this might negatively influence the validity of our fictitious cohort. However, at the end of follow-up, the average age of the cohort sample was 72.4 years while the average ages for the period samples varied from 71.9 to 74.7 years. Furthermore, males made up 55–60% of the cohort sample (at the end of follow-up) and for most period samples the percentage of males was similar: exceptions were the second (GFJK, Figure 1b) and fifth (SRVW, Figure 1b) periods, where the percentage of males was lower. However, survival for colon and rectal cancer does not differ between men and women. In fact, for Italian patients diagnosed from 1995 to 1999 and followed to 2003, five-year relative survival was 58% for colon cancer in men and women, and 54% in men and 56% in women for rectal cancer [Citation20]. We therefore consider it acceptable to combine different cohort and period samples to form our fictitious cohort.

Due to the small number of cases, analysis according to CR was not possible; this is another limitation since the four CR areas we considered differ in terms of cancer incidence and mortality [Citation21].

With regard to information on treatment, this was mainly obtained from the national database of hospital discharges. These records contain data on hospital admissions for the whole of Italy, not just the area covered by the CR that archived the cases. Some patients receive chemotherapy and radiotherapy on an outpatient basis and may not be entered in hospital discharge databases. We also accessed outpatient databases to be sure of capturing these treatments [Citation22]. Since discharge records are national, completeness is high even for southern CRs (Ragusa and Sassari) whose patients often travel to hospitals in the north of the country for treatment.

To conclude, the present study has shown that CRs are able to collect data for samples of the cases they archive on treatment and clinical follow-up so as to provide information to permit the differentiation of prevalence into the categories illustrated in this study. The prevalence differentiation method we propose can produce up-to-date estimates, useful for health planners, regarding the proportion of prevalent cancer cases needing care, rehabilitation services and terminal care. Our method is likely to be applicable to other cancer sites and also beyond the 10th year after diagnosis. For breast cancer, for example, differentiation of prevalence up to 15 years after diagnosis would be advisable.

Acknowledgements

The authors thank Don Ward for help with the English.

The CAREMORE group

Liguria Region Cancer Registry: Enza Marani, Maria Antonietta Orengo; Reggio Emilia Cancer Registry: Lucia Mangone, Carlotta Pellegri, Enza Di Felice; Ragusa Cancer Registry: Giuseppe Cascone, Sonia Cilia, Gabriele Morana, Carmela Nicita, Concetta Rollo, Aurora Sigona, Eugenia Spata, Giovanna Spata; Sassari Cancer Registry: Mario Budroni, Rosaria Cesaraccio; Varese Cancer Registry: Paolo Contiero, Anna Maghini, Giovanna Tagliabue; Federazione Italiana delle Associazioni di Volontariato in Oncologia: Francesco De Lorenzo, Laura Del Campo, Flaminia Polacchi; Associazione Senza Limiti: Fulvio Aurora, Dario Vittone; Centro di Ricerca sulla Gestione dell’Assistenza Sanitaria e Sociale (CERGAS): Amelia Compagni, Giovanni Fattore; Fondazione IRCCS ‘Istituto Nazionale dei Tumori’, Descriptive Studies and Health Planning Unit, Milan: Ilaria Casella, Agata Cifalà, Milena Sant; Fondazione IRCCS ‘Istituto Nazionale dei Tumori’, Evaluation Epidemiology unit, Milan: Gemma Gatta, Annalisa Trama; Fondazione IRCCS ‘Istituto Nazionale dei Tumori’, Milan, Scientific Director's Office: Valeria Anselmi, Claudia Casoli.

Declaration of interest: The CAREMORE project has received funding from the Italian Ministry of Health (Integrated Program on Oncology No 7, PIO-7). The present work was also supported by the Italian Association for Cancer Research (AIRC). The authors report no conflicts of interest. The authors alone are responsible for the content and writing of the paper.

References
