Original Research

Population-based Aarhus Sarcoma Registry: validity, completeness of registration, and incidence of bone and soft tissue sarcomas in western Denmark

Pages 45-56 | Published online: 06 Mar 2013

Abstract

Background:

The aim of the present study was to validate the data in the Aarhus Sarcoma Registry (ASR), to determine if this registry is population-based for western Denmark, and to examine the incidence of sarcomas using validated, population-based registry data.

Methods:

This study was based on patients with bone and soft tissue sarcoma treated at the Sarcoma Centre of Aarhus University Hospital between January 1, 1979 and December 31, 2008. The validation process included a review of all medical files by two researchers using a standardized form. The Danish Cancer Registry was used as a reference to assess the completeness of registration of patients in the ASR. Crude and World Health Organization age-standardized incidence, as well as age-, gender-, and year-specific incidences were estimated.

Results:

The validation process added 385 to the 1442 patients who were registered in the ASR. Before validation, on average, 70.5% of the data for the variables was correct. Validation improved the average completeness of the registered variables from 83.7% to 99.3%. The 1827 patients in the ASR after validation include 85.3% of the patients registered in the Danish Cancer Registry. The overall World Health Organization age-standardized incidence of sarcoma in the trunk or extremities in western Denmark in the period 1979–2008 was 2.2 per 100,000, being 0.8 for bone sarcomas and 1.4 for soft tissue sarcomas.

Conclusion:

The validation process significantly improved the completeness of the variables and the quality of the ASR data. ASR is now a valuable population-based tool for epidemiological research and quality improvement in the treatment of sarcoma. It is our recommendation that documented validation of registries should be a prerequisite for publishing studies derived from them.

Introduction

Sarcomas comprise a heterogeneous group of rare tumors, and detailed data are important for predicting prognosis as well as monitoring and improving quality of treatment. These data can be obtained from appropriately designed clinical databases.

Clinical databases hold data on disease characteristics, possible prognostic factors, treatment, and follow-up, as well as the indicators used to monitor quality of treatment. Use of such databases ensures large sample sizes, long follow-up periods, and high external validity. Validation of data in the few existing sarcoma databases worldwide is either not reported or not done, although it is a crucial factor determining the quality and value of the results reported. Even when validation is reported, its level may be unclear and is suspected to be on a group level.Citation1–Citation3

In Denmark there is a unique opportunity to validate data on the individual level. Further, use of clinical databases can be extended by unambiguous linkage of data with other Danish population-based data sources in order to achieve complete follow-up of patients and detailed data on, eg, incidence of diseases and changes over time. A preliminary study from the Sarcoma Centre of Aarhus University Hospital reported an incidence of 16 soft tissue sarcomas per million; however, the incidence of sarcoma has not previously been determined systematically in Denmark.Citation4,Citation5

The aim of the present study was to validate data in the Aarhus Sarcoma Registry (ASR), to assess if the ASR is population-based for western Denmark, and to examine the incidence of sarcoma using validated population-based registry data.

Materials and methods

Danish health care system

The population of Denmark is approximately 5.5 million.Citation6 The health care system provides tax-supported health care for all residents, allowing free access to public and private general practitioners and hospitals. Since 1968, all citizens in Denmark have been assigned a unique 10-digit civil personal registration (CPR) number, which is used throughout all the Danish administrative registries and clinical databases, allowing for unambiguous linkage on an individual level and tracking of patients who have died, emigrated, or been admitted to different hospitals.Citation7

Data sources

Since 1979, the treatment of patients with sarcoma in western Denmark (population 2.5 million) has been carried out at the Sarcoma Centre of Aarhus University Hospital, resulting in initiation of the ASR. All patients treated for soft tissue sarcoma, bone sarcoma, some borderline (World Health Organization [WHO] classification: intermediate malignancy, locally aggressive and rarely metastasizing)Citation8 and benign tumors, eg, desmoid-type fibromatosis and aneurysmal bone cyst, have been registered in the ASR. The ASR collects basic patient data, including CPR number, gender, county of residence, date of diagnosis, specific data on tumor characteristics and treatment, including tumor size, localization, histological type, tumor grade, stage of disease, date and type of treatment, as well as data on follow-up examinations, local recurrence, distant metastases, and death. From 1993, data were registered prospectively.

The Danish Cancer Registry has recorded all incident cases of cancer in Denmark since 1943. The main variables are CPR number, date of diagnosis, clinical stage, initial treatment, topography codes according to the tenth version of the International Classification of Diseases (ICD-10), and morphology codes according to the third version of the International Classification of Diseases for Oncology (ICD-O-3). Until 1987, reporting to the Danish Cancer Registry was voluntary, but then became mandatory for doctors in hospital departments and private medical specialists.Citation9,Citation10 Since 2004, registration has been done electronically and the completeness of patient registration and validation of data has been ensured by cross-referencing data from the Danish Cancer Registry to data from three national registries: the National Patient Registry, which contains data on all patients admitted to any hospital department, including discharge diagnoses; the Danish Pathology Registry, which contains histological diagnosis, anatomical localization, and date at diagnosis for both benign and malignant pathology; and the Danish Cause of Death Registry, which contains date and immediate and underlying cause of death.

Data on the population size in western Denmark were obtained from StatBank Denmark, a database containing detailed statistical information on Danish society, including number of citizens per calendar year.Citation11

Study population

Patients were included in the study if they fulfilled the following inclusion criteria: residence in western Denmark, a diagnosis of sarcoma, treatment at the Sarcoma Centre of Aarhus University Hospital in the period between January 1, 1979 and December 31, 2008, and/or registration in the ASR. Patients with nonsarcoma pathology, or borderline or benign tumors, were excluded from our analyses (Appendix A). Furthermore, we excluded patients with tumors not located in the extremities or trunk (Appendix B). The remaining 1827 (47.3%) patients with bone sarcoma and soft tissue sarcoma in the trunk or extremities formed our study population (Figure 1).

Figure 1 Flow chart for patients registered in the ASR and DCR in the period 1979–2008, with number of patients (N), reasons for exclusion, and distribution across registries. Includes patients only registered in the ASR (ASR/DCR), patients only registered in the DCR (DCR/ASR), and patients registered in both registries (ASR∩DCR).

Notes: *Based on the WHO ICD-O-3 codes. Non-sarcoma pathology includes benign, borderline and some malignant tumors (Appendix A). This population is not complete.
Abbreviations: ASR, Aarhus Sarcoma Registry; DCR, Danish Cancer Registry; WHO, World Health Organization.

Validation of ASR

Initially, a revision of the variables registered and the registration forms used in the ASR was made. A random sample was selected using the first two digits of the CPR numbers, indicating the day of birth, and choosing the 100 patients with the lowest numbers. Based on a revision of these patients and a review of the literature regarding prognostic factors, standardized registration forms were designed in cooperation with a surgeon, an oncologist, and a pathologist with expertise in sarcoma.

Medical files, considered to be the gold standard for this type of research, were reviewed for all patients registered in the ASR by two independent researchers not involved in treatment using the standardized forms. Based on medical files from both the Department of Orthopedic Surgery and the Department of Oncology, data that were missing or incorrect in the ASR were added or corrected. All uncertain cases were discussed with the same team of sarcoma experts and a consensus was reached.

Finally, 385 patients with sarcoma were identified either through the Department of Pathology or in the Danish Cancer Registry (DCR)Citation10 as treated at the Sarcoma Centre of Aarhus University Hospital in the study period but not yet registered in the ASR. These medical records were reviewed in the same manner and entered in the ASR.

To assess whether the validation process resulted in substantial changes in the quality of the ASR, a data extract including important variables selected from the ASR before and after validation was used. The proportion of correctly registered variables before validation was assessed. Correctness of the data was defined as the number of registered data items not changed during validation divided by the number of data items registered before validation. Completeness of data in each selected variable before and after validation was assessed and compared. Completeness of data was defined as the number of registered data items divided by the number of registered patients.
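The two definitions above can be illustrated with a minimal sketch. The function names and the five-patient example are hypothetical and are not taken from the ASR; a missing registration is represented by `None`.

```python
from typing import Optional, Sequence


def completeness(values: Sequence[Optional[str]]) -> float:
    """Registered (non-missing) values divided by the number of registered patients."""
    return sum(v is not None for v in values) / len(values)


def correctness(before: Sequence[Optional[str]],
                after: Sequence[Optional[str]]) -> float:
    """Values registered before validation and left unchanged by it,
    divided by the number of values registered before validation."""
    registered = [(b, a) for b, a in zip(before, after) if b is not None]
    unchanged = sum(b == a for b, a in registered)
    return unchanged / len(registered)


# Hypothetical tumor-grade values for five patients (illustration only).
grade_before = ["2", None, "3", "1", None]
grade_after = ["2", "3", "2", "1", "1"]

print(f"Completeness before: {completeness(grade_before):.1%}")
print(f"Completeness after:  {completeness(grade_after):.1%}")
print(f"Correctness before:  {correctness(grade_before, grade_after):.1%}")
```

In this toy example three of five values were registered before validation, and two of those three were left unchanged, so a variable can be fairly complete and still have limited correctness.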

Completeness > 90% was considered satisfactory for the important variables selected as well as patient registration (described below), with ≤3% missing values in the demographic data, in agreement with the requirements for clinical databases approved by the Danish National Board of Health and the North American Association of Central Cancer Registries.Citation12,Citation13

Completeness of patient registration in ASR

The completeness of patient registration in the ASR was assessed using the DCR as a reference. Data were retrieved from the DCR, including 2260 patients living in western Denmark at the time of diagnosis with a tumor located in soft tissue or bone, based on the following ICD-10 codes: C40, C41, C47, and C49. These data were reviewed by one pathologist with expertise in sarcoma, and 1100 patients with nonsarcoma pathology (Appendix A) were excluded from the DCR.Citation10,Citation14 Furthermore, based on the Danish Pathology Registry, patients with tumors not located in the extremities or trunk were excluded, as in the ASR (Appendix B). Finally, patients in whom sarcoma was found incidentally at autopsy and patients treated only after the study period were excluded from the DCR (Figure 1). Data from the DCR were linked on an individual level with data from the ASR. The completeness of patient registration in the ASR (P) was estimated as the number of patients registered in both the ASR and DCR (ASR ∩ DCR) divided by the total number of patients registered in the DCR (DCR):

$$P = \frac{\mathrm{ASR} \cap \mathrm{DCR}}{\mathrm{DCR}}$$

Differences in completeness of patient registration according to gender, age group (<20, 20–39, 40–59, and ≥60 years), region (south, central, and north), and 5-year periods (1979–1983, 1984–1988, 1989–1993, 1994–1998, 1999–2003, and 2004–2008) were examined using the Chi-squared test, with a P value less than 0.05 considered to be statistically significant.

The number of expected patients with bone and soft tissue sarcoma not registered in either the ASR or the DCR (N) was determined using the capture-recapture method,Citation13,Citation14 and defined as the number of patients only registered in the ASR (ASR/DCR) multiplied by the number of patients only registered in the DCR (DCR/ASR) divided by the number of patients registered in both the ASR and DCR (ASR ∩ DCR):

$$N = \frac{(\mathrm{ASR/DCR}) \times (\mathrm{DCR/ASR})}{\mathrm{ASR} \cap \mathrm{DCR}}$$
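As a rough check of the two estimators, the sketch below applies them to counts taken or derived from the Results section. The function names are hypothetical. The count in both registries (1458) follows from the 1827 patients in the ASR minus the 369 registered in the ASR only. The DCR-only count (251) is an assumption: it is derived by reading the reported 2078 patients as the total across the two registries.

```python
def registration_completeness(both: int, dcr_only: int) -> float:
    """P: patients in both registries divided by all patients in the DCR."""
    return both / (both + dcr_only)


def expected_unregistered(asr_only: int, dcr_only: int, both: int) -> float:
    """N: two-source capture-recapture estimate of patients missed by both registries."""
    return asr_only * dcr_only / both


asr_total = 1827               # patients in the ASR after validation
asr_only = 369                 # registered in the ASR but not in the DCR
both = asr_total - asr_only    # 1458, registered in both registries
dcr_only = 2078 - asr_total    # 251, ASSUMES 2078 is the total across both registries

p = registration_completeness(both, dcr_only)
n = expected_unregistered(asr_only, dcr_only, both)

print(f"Completeness of registration: {p:.1%}")
print(f"Expected patients in neither registry: {n:.1f}")
```

Under these assumptions the sketch gives a completeness of about 85.3% and roughly 64 unregistered patients, in line with the reported figures.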

Incidence

Incidence was calculated using data from the ASR and StatBank Denmark, expressed per 100,000 inhabitants per year with 95% confidence intervals (CI). Crude and age-standardized incidence rates using the WHO standard population were calculated as overall, and separately for bone sarcoma and soft tissue sarcoma.Citation15
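The difference between a crude and a directly age-standardized rate can be sketched as follows. The three age bands, their case counts, person-years, and standard weights are hypothetical and are not the WHO standard population; the function names are likewise illustrative. The final line is only a back-of-envelope check that assumes a constant population of 2.5 million over 30 years, whereas the study used yearly population counts from StatBank Denmark.

```python
from typing import Sequence


def crude_rate(cases: float, person_years: float, per: int = 100_000) -> float:
    """Cases divided by person-years at risk, scaled per `per` inhabitants."""
    return cases / person_years * per


def direct_standardized_rate(cases: Sequence[float],
                             person_years: Sequence[float],
                             weights: Sequence[float],
                             per: int = 100_000) -> float:
    """Weighted sum of age-specific rates using standard-population weights."""
    total_weight = sum(weights)
    rate = 0.0
    for c, py, w in zip(cases, person_years, weights):
        rate += (w / total_weight) * (c / py)
    return rate * per


# Hypothetical data for three age bands (young, middle, old).
cases = [10, 30, 60]
person_years = [1_000_000, 1_000_000, 1_000_000]
weights = [0.5, 0.3, 0.2]  # hypothetical standard population, weighted towards the young

crude = crude_rate(sum(cases), sum(person_years))
standardized = direct_standardized_rate(cases, person_years, weights)
print(f"Crude rate:        {crude:.2f} per 100,000")
print(f"Standardized rate: {standardized:.2f} per 100,000")

# Back-of-envelope check, ASSUMING a constant population of 2.5 million for 30 years.
rough_crude = crude_rate(1827, 30 * 2_500_000)
print(f"Rough crude rate for the study population: {rough_crude:.2f} per 100,000")
```

Because incidence rises with age in this toy example and the standard population is younger than the observed one, the standardized rate is lower than the crude rate.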

The changes in age-standardized 5-year incidence rate for gender and age were assessed as an incidence rate ratio with 95% CI using Poisson regression, and a P value less than 0.05 was considered to be statistically significant. In addition, the age-specific and gender-specific incidence rate was calculated as overall and sarcoma-specific.Citation11
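For a single two-group comparison, an incidence rate ratio and its Wald 95% CI can be computed directly on the log scale, which matches what an unadjusted Poisson model with one binary covariate would give. The sketch below is a simplification under that assumption and is not the authors' model specification; the function name and the counts are hypothetical.

```python
import math
from typing import Tuple


def incidence_rate_ratio(cases_a: int, person_years_a: float,
                         cases_b: int, person_years_b: float,
                         z: float = 1.96) -> Tuple[float, float, float]:
    """Rate ratio of group A versus reference group B with a Wald CI on the log scale."""
    irr = (cases_a / person_years_a) / (cases_b / person_years_b)
    # Standard error of log(IRR) when both counts are Poisson distributed.
    se_log = math.sqrt(1 / cases_a + 1 / cases_b)
    lower = math.exp(math.log(irr) - z * se_log)
    upper = math.exp(math.log(irr) + z * se_log)
    return irr, lower, upper


# Hypothetical counts: a late period compared with an early reference period.
irr, lower, upper = incidence_rate_ratio(180, 1_000_000, 100, 1_000_000)
print(f"IRR {irr:.2f} (95% CI {lower:.2f}-{upper:.2f})")
```

A CI that excludes 1.0, as in this toy example, corresponds to a statistically significant difference at the 0.05 level.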

Ethical approval

This study was approved by the Danish Data Protection Agency (2007-58-0010) and the Danish Health and Medicines Authority (7-604-04-2/262/KWH).

Results

ASR patient demographics

In total, 1827 patients with soft tissue sarcoma or bone sarcoma were treated at the Sarcoma Centre of Aarhus University Hospital in the period 1979–2008. The largest proportion of patients (824, 45.0%) was diagnosed in the 1999–2008 period, compared with 446 (24.4%) and 557 (30.5%) in 1979–1988 and 1989–1998, respectively. The median age at diagnosis was 53.4 years (range 0.1–95.5), with 1013 (55.5%) males compared with 814 (44.5%) females. There were 1275 patients (70.0%) with soft tissue sarcoma and 552 (30.0%) with bone sarcoma.

Validity of ASR

Validation increased the number of patients with bone sarcoma and soft tissue sarcoma in the ASR from 1442 to 1827 (an increase of 26.7%), and completeness of the registered variables increased from 83.7% (95% CI 83.1–84.2) to 99.3% (95% CI 99.2–99.4), as shown in Table 1. Before validation, the proportion of correctly registered data for the variables varied from 25.3% (95% CI 22.4–28.2) to 97.7% (95% CI 96.7–98.5), with an average of 70.5% (95% CI 69.8–71.2).

Table 1 Completeness and correctness of data by selected variables in the Aarhus Sarcoma Registry

Completeness of patient registration in ASR

We identified 2078 patients with bone sarcoma or soft tissue sarcoma in western Denmark in the period 1979–2008 who were registered in the ASR, the DCR, or both (Figure 1), giving an overall completeness of registration in the ASR of 85.3% (95% CI 83.5–86.9). A total of 369 patients were registered in the ASR but not in the DCR. There was no difference in completeness according to gender, but completeness was significantly greater for those younger than 60 years than for those older than 60 years, in patients from the central region compared with the north and south regions, and in patients diagnosed after 1994 compared with those diagnosed before 1994 (Table 2).

Table 2 Completeness of patient registration in the Aarhus Sarcoma Registry by gender, age, region, and time period

The expected number of patients with bone sarcoma or soft tissue sarcoma in the trunk or extremities in western Denmark in 1979–2008 not registered in either the ASR or DCR was 64. Of these, 40 patients would be expected in 1979–1988, while 18 and nine would be expected in 1989–1998 and 1999–2008, respectively.

Overall incidence

The crude overall incidence rate of sarcoma in the ASR for the period 1979–2008 was 2.5 (95% CI 2.4–2.6) per 100,000 inhabitants per year. The WHO age-standardized overall incidence rate was 2.2 (95% CI 2.1–2.3). The crude and WHO age-standardized incidence rate was 0.8 (95% CI 0.7–0.9) and 0.8 (95% CI 0.7–0.8), respectively, for bone sarcoma and 1.8 (95% CI 1.7–1.9) and 1.4 (95% CI 1.3–1.5) for soft tissue sarcoma, respectively.

The incidence rate for bone sarcoma in both genders was highest at age 10–24 years and at age 75–79 years. The highest incidence rate for soft tissue sarcoma was found at age 70–74 years for females and age 85–90 years for males (Figure 2).

Figure 2 Crude IR per 100,000 inhabitants and World Health Organization age-standardized IRR, with 95% CI and P values, for bone and soft tissue sarcoma in the Aarhus Sarcoma Registry from 1979 to 2008 according to age (years) and gender (n = 1827).

Abbreviations: CI, confidence intervals; IR, incidence rate; IRR, incidence rate ratios.

The WHO age-standardized incidence rate for bone sarcoma decreased with increasing age, with an incidence rate ratio of 0.30 (95% CI 0.31–0.62, P < 0.001) for females and 0.59 (95% CI 0.42–0.82, P = 0.002) for males older than 60 years versus younger than 20 years. The opposite was found for soft tissue sarcoma, for which the WHO age-standardized incidence rate increased with increasing age, with an incidence rate ratio of 4.46 (95% CI 3.09–6.45, P < 0.001) for females and 4.55 (95% CI 3.47–5.97, P < 0.001) for males (Figure 2). The overall incidence rate ratio was 1.3 (95% CI 1.2–1.5, P < 0.001) for males compared with females.

Change in incidence rate over time

The WHO age-standardized overall incidence rate increased over the entire study period, with an incidence rate ratio of 1.80 (95% CI 1.51–2.15, P < 0.001) in 2004–2008 compared with 1.00 (reference) in 1979–1983 (Figure 3). A steady increase in incidence rate was seen in soft tissue sarcoma, whereas the incidence rate of bone sarcoma only increased significantly in the last time period compared with the first time period (Figure 3). The increase in incidence rate over time was highest in the age group 40–59 years in males (incidence rate ratio 2.9; 95% CI 1.8–4.6, P < 0.001) and 20–39 years in females (incidence rate ratio 3.5; 95% CI 1.8–6.8, P < 0.001) in 2004–2008 compared with 1979–1983 (Table 3).

Figure 3 World Health Organization age-standardized IR per 100,000 inhabitants and IRR, with 95% CI and P values for overall, bone, and soft tissue sarcoma in the Aarhus Sarcoma Registry from 1979 to 2008 (n = 1827).

Abbreviations: CI, confidence intervals; IR, incidence rate; IRR, incidence rate ratios.

Table 3 Age-specific and gender-specific incidence rates per 100,000 inhabitants per year by time period

Discussion

Validity of ASR

The validation process eliminated incorrect registration and values missing from the ASR. This emphasizes the importance of validation, as well as ensuring the completeness of patient registration by continuous cross-referencing to other data sources. The overall completeness of the variables registered in the ASR is now considered to be very satisfactory.

Completeness of patient registration in ASR

Validation increased the number of patients registered in the ASR by 26.7%. The overall completeness of patient registration in the ASR of 85.3% appears to be in agreement with the completeness reported for the few other sarcoma registries. For example, the Scandinavian Sarcoma Group Register, a registry including sarcoma patients from Sweden, Norway, and Finland, reported a completeness of at least 90%.Citation1–Citation3 However, it was not possible to determine whether the analyses of patient completeness for other sarcoma registries were based on individual or group levels. A comparison on the group level may result in misleading estimates, because it would be unclear whether an individual patient was registered in both data sources.

Completeness was improved after 1993, which is probably attributable to retrospective registration of patients before 1993 and prospective registration afterwards. Some of the old medical files were lost, owing to the legal obligation in Denmark to store medical files for 10 years only, making them difficult to register and validate in the ASR. Furthermore, it is only recently that a tradition of centralizing the treatment of rare diseases, such as sarcomas, has been established in Denmark.

The lower completeness in patients aged ≥60 years may be explained by the fact that comorbidity and/or metastasis in this age group could mean that no treatment was planned, and thus that the patient was not referred to the Sarcoma Centre of Aarhus University Hospital. The lower completeness among patients 60 years and older could be a source of selection bias in this study in relation to estimates of incidence as well as in future survival studies based on ASR patients. However, the size of the possible selection bias would be expected to be low.

The lesser completeness of data in the southern region is most likely explained by the fact that some patients in this region were referred incorrectly to a nearer sarcoma centre in Odense. The number of patients not registered in either the ASR or DCR was estimated to be approximately two patients per year. Although ideally this number should be 0, we consider it acceptable and conclude that ASR is a population-based registry for western Denmark.

Incidence

The estimated crude and age-standardized incidence rates of 2.5 and 2.2 per 100,000 were in general agreement with results from other studies; however, comparability is difficult because of differences in the methods of reporting data and inclusion criteria, such as age, histological subtypes, and anatomical localization.Citation16 Two European studies that also included visceral sarcomas reported an overall incidence rate of 2.9 and 6.2 sarcomas per 100,000 per year in 1982–1984 and 2005–2007, respectively.Citation17,Citation18 A Swedish study of soft tissue sarcoma excluding visceral sites reported an incidence of 1.8 per 100,000 per year in the period 1964–1989, consistent with our findings.Citation19

The increase in incidence found with higher age and male gender was also in accordance with results from other research.Citation1,Citation17,Citation18,Citation20,Citation21

Other studiesCitation1,Citation20 have shown an increase in incidence in soft tissue sarcoma over time, consistent with our findings. However, it can be questioned whether our results represent a true change in incidence or rather reflect an increase in completeness of patient registration in the ASR. The latter is supported by the fact that the expected number of patients not registered in either the ASR or the DCR dropped more than four-fold from the first to the last time period. However, even when accounting for the expected number of nonregistered patients in the three 10-year time periods (1979–1988, 1989–1998, and 1999–2008), the total number of patients increased over time (579 + 40 versus 642 + 18 versus 858 + 9), which could indicate a real change in incidence. The apparent decrease in incidence in the last 2 years of the study period could be explained by random variation or possibly a small diagnostic delay, ie, patients diagnosed with sarcoma retrospectively at a later revision of histopathology.

Methodological considerations

The risk of information bias in this study was low, given that the medical files were systematically reviewed using standardized forms and by only two persons in close collaboration. A manual with definitions of variables was prepared, all uncertain cases were discussed by a multidisciplinary team, and consensus was reached. Medical files were used to validate data in the ASR. Even though there is a possibility that correct data in the ASR were replaced by incorrect data from the medical files, this is considered very unlikely given that medical files are generally dictated immediately after the consultation, whereas registration in the ASR is more likely to be done later on.

Although the DCR was considered to be the gold standard for assessing completeness of patient registration in the ASR, we found 369 cases in the ASR which were not registered in the DCR. Differences in patient registration in the ASR and DCR may be attributable to various factors: firstly, incorrect diagnoses, where a patient initially diagnosed with sarcoma is registered but not deleted when a later revision shows nonsarcoma histology; secondly, differences in inclusion criteria, where some borderline or low-grade sarcoma types are included and registered in, eg, the ASR but not in the DCR; thirdly, missing registrations, where it is especially relevant that reporting to the DCR was voluntary until 1987, resulting in lower patient completeness in the DCR for that period. However, even though numerous studies have shown high quality of the data in the DCR, our results show that the DCR is not a perfect reference for sarcoma, and validation of the sarcoma data in the DCR is warranted.Citation22–Citation25

Conclusion

The incidence of bone sarcoma and particularly soft tissue sarcoma increased substantially during the period 1979–2008. The increase in incidence rate over time is gender-dependent and age-dependent. The validation process significantly improved the completeness of variables and the quality of the ASR data. The ASR is now a population-based, validated, and valuable tool for epidemiological research and improvement in quality of treatment for sarcoma. It is our recommendation that documented validation of registries should be a prerequisite for publishing studies derived from them.

Acknowledgements

The study was supported by grants from the “Frits, Georg and Marie Cecilie Gluds legat”, the “Max and Inge Wørzners mindelegat”, the Danish Council for Independent Research/Medical Sciences, and Aarhus University. We extend our sincere thanks to the orthopedic surgeons and oncologists from the Aarhus Sarcoma Centre who register data in the Aarhus Sarcoma Registry, as well as the staff from the archives for locating the medical files reviewed in this study.

Disclosure

The authors report no conflicts of interest in this work.

References

Appendices

Appendix A Histology subtypes, based on ICD-O-3 World Health Organization codes, excluded from both the Aarhus Sarcoma Registry and the Danish Cancer Registry in this study

Appendix B Topography in the Aarhus Sarcoma Registry and the Danish Cancer Registry excluded from the study