ORIGINAL ARTICLE

Accuracy of Finnish Cancer Registry colorectal cancer data: a comparison between registry data and clinical records

Pages 247-251 | Received 24 Sep 2020, Accepted 17 Dec 2020, Published online: 06 Jan 2021

Abstract

Background

The population-based Finnish Cancer Registry (FCR) is an important resource for research and healthcare policy in Finland. The aim of this study was to validate the accuracy of the colorectal cancer (CRC) data within the FCR.

Material and methods

FCR data are based on independent cancer report forms (CRFs) from both clinicians and pathologists. Data from patients diagnosed with CRC during a randomized, population-based CRC screening program between 2004 and 2012 were extracted from the FCR and compared to data extracted from the original clinical patient records of these individuals by two gastrointestinal surgeons. The study focused on tumour characteristics and primary treatment. Accuracy was measured by calculating Cohen’s kappa coefficient (κ), which considers the possibility of agreement by chance.

Results

Altogether, 1475 patients were studied. κ was 0.74 for stage, 0.87 for tumour location (right/left), 0.78 for a more detailed location, 0.72 for tumour histology, 0.46 for surgical removal of the primary tumour, and 0.43 for chemotherapy. Among those who underwent surgery, the radicality of surgical treatment had a κ of 0.24. In total, 173 (12%) patients were lacking a CRF from a clinician.

Conclusion

The FCR data had good accuracy regarding tumour characteristics, but poor accuracy in treatment information. The main reason for this suboptimal accuracy was missing CRFs from treating clinicians. Awareness of these findings is crucial when research and decision making are based on FCR data. Measures have since been taken to improve the completeness of FCR recording.

Introduction

National cancer registries play a key role in cancer control programs worldwide, providing valuable information concerning cancer burden and grounds for statistics and research. The Finnish Cancer Registry (FCR) is a population-based national registry established in 1952. The FCR maintains data from cancers diagnosed and treated in Finland since 1953. Special legislation obliges healthcare organizations to report all new cancer diagnoses to the FCR [Citation1]. The FCR summarizes information from each cancer case, and registry data are widely used for statistics and cancer research.

The International Agency for Research on Cancer has defined five areas of consideration concerning cancer registration: completeness in coverage, completeness in detail, accuracy in detail, accuracy of reporting, and accuracy of interpretation [Citation2]. Completeness of data, data quality, and quality control in Finland have been described by earlier studies [Citation3,Citation4]. The registry captures nearly all new cancer diagnoses. The completeness of FCR data, compared to that in the Care Register for Health and Welfare (HILMO), was 95.9% for all solid malignant tumours and 97.4% for colorectal cancer (CRC) [Citation4], yet there have been no studies regarding the accuracy of more detailed information. According to recommendations for cancer registry quality control, the accuracy of a registry indicates its correspondence with the source documentation and is expressed as the proportion of correctly coded case attributes in the registry [Citation5].

The aim of this audit was to evaluate the accuracy of the FCR CRC data by comparing data from the FCR to data collected from original clinical patient records. This comparison focuses on the accuracy of tumour characteristics (stage, tumour location within the large bowel, and histopathology) and primary treatment information: whether the primary tumour was removed surgically, whether surgical treatment was radical, and whether the patient had received any chemotherapy.

Material and methods

Patients

Our study population comprised patients diagnosed with CRC during the randomized population-based health services program on CRC screening by faecal occult blood test (FOBT; Hemoccult®) in Finland [Citation6]. The target population for CRC screening was men and women aged 60–69 years. Details of the screening program arrangements have been previously published [Citation6–9]. Altogether, 1475 patients diagnosed with CRC from 2004–2012 were identified from the FCR, and copies of the clinical documents were requested from the hospitals treating these patients. The cohort included all CRC cases from the target population, both screen-detected and non-screen-detected. Data from clinical documents were manually extracted by two gastrointestinal surgeons [Citation10].

Variables

The FCR classifies the extent of disease into six categories (Table 1). Although the TNM classification has been requested since the early 1970s, most CRFs have provided only an approximate stage or parts of the TNM, such as T alone; therefore, TNM data are not available in the FCR [Citation11]. Cancer extent from the clinical records was extracted based on both clinical and pathological Union for International Cancer Control (UICC) TNM classification and stage (8th edition) [Citation12]. If both records were available, the pathological records were chosen, except among patients with neoadjuvant treatment before operation or distant metastases detected by computed tomography. TNM stage was converted into the six-category FCR classification for comparison (Table 1).

Table 1. The FCR’s cancer extent classification system and its relation to UICC TNM staging (8th edition).

The FCR codes the tumour location (topography) and histological diagnosis (morphology) according to the ICD-O-3 [Citation13]. We simplified the division of sub-sites to ‘right-sided’ (appendix, caecum, ascending colon, and transverse colon) and ‘left-sided’ (descending colon, sigmoid colon, rectosigmoid junction, rectum, anal canal, and anus) and compiled histological diagnoses into subgroups: 1) adenocarcinoma, 2) mucinous adenocarcinoma (including pseudomyxoma peritonei with unknown primary site), 3) neuroendocrine carcinoma, 4) anal epidermoid carcinoma, 5) other, and 6) unknown.
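As an illustration only, the right/left grouping described above could be expressed as a simple lookup. This is a minimal sketch and not the code used in the study: the function name `laterality`, the lower-case subsite labels, and the fallback value `"unknown"` for labels outside the two lists are assumptions made for the example.

```python
# Subsite groups as listed in the text; the label spellings are assumed for this sketch.
RIGHT_SIDED = {"appendix", "caecum", "ascending colon", "transverse colon"}
LEFT_SIDED = {
    "descending colon",
    "sigmoid colon",
    "rectosigmoid junction",
    "rectum",
    "anal canal",
    "anus",
}


def laterality(subsite: str) -> str:
    """Map a subsite label to 'right-sided', 'left-sided', or 'unknown'."""
    name = subsite.strip().lower()
    if name in RIGHT_SIDED:
        return "right-sided"
    if name in LEFT_SIDED:
        return "left-sided"
    # Assumption: anything not named in the text is treated as unknown.
    return "unknown"


for site in ["Caecum", "rectum", "colon, unspecified"]:
    print(site, "->", laterality(site))
```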

We defined surgical treatment as the removal of the primary tumour. Radical surgery in the manually extracted clinical records was defined as complete tumour removal with >1 mm tumour-free margins. In the FCR, the radicality of surgery was based on information given in the CRF (radical, palliative, not known if radical or palliative). Chemotherapy included both chemotherapy and chemoradiotherapy administered before or after surgical treatment, as well as any such therapy administered for metastases.

Statistical analysis

Inter-rater agreement was measured by calculating Cohen’s kappa coefficient (κ) for FCR data against the data based on clinical records. Cohen’s κ can in principle range from −1 to 1, with negative values indicating agreement worse than chance. According to Landis and Koch’s scale, 0 indicates agreement equivalent to chance; 0.01–0.20, slight agreement; 0.21–0.40, fair agreement; 0.41–0.60, moderate agreement; 0.61–0.80, substantial agreement; 0.81–0.99, near-perfect agreement; and 1, perfect agreement [Citation14]. Sensitivities for each subgroup were calculated. Sensitivity was defined as the ability of the FCR to classify a cancer case in the same subgroup as in the medical patient records. It was calculated by dividing the number of correctly classified cancers in the FCR by the total number of cancers in the subgroup based on medical records.
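The two measures described above can be sketched in a few lines. The example below is illustrative and is not the analysis code of the study: the function names and the ten-patient toy data set are hypothetical. It computes κ as (observed agreement − chance agreement) / (1 − chance agreement), and sensitivity per subgroup with the clinical records as the reference.

```python
from collections import Counter


def cohen_kappa(reference, registry):
    """Cohen's kappa for two equally long sequences of categorical labels."""
    if len(reference) != len(registry) or not reference:
        raise ValueError("need two non-empty sequences of equal length")
    n = len(reference)
    # Observed agreement: share of cases coded identically in both sources.
    observed = sum(r == f for r, f in zip(reference, registry)) / n
    # Chance agreement: product of the marginal proportions, summed over categories.
    ref_counts = Counter(reference)
    reg_counts = Counter(registry)
    expected = sum(ref_counts[c] * reg_counts[c] for c in ref_counts) / (n * n)
    if expected == 1.0:
        return 1.0  # both sources contain one identical category only
    return (observed - expected) / (1.0 - expected)


def sensitivity_by_subgroup(reference, registry):
    """Share of cases in each reference subgroup that the registry codes the same way."""
    totals = Counter(reference)
    correct = Counter(r for r, f in zip(reference, registry) if r == f)
    return {c: correct[c] / totals[c] for c in totals}


# Hypothetical toy data: tumour side for ten patients.
clinical = ["left"] * 6 + ["right"] * 4
fcr = ["left"] * 5 + ["right"] + ["right"] * 3 + ["left"]

kappa = cohen_kappa(clinical, fcr)
sens = sensitivity_by_subgroup(clinical, fcr)
print(f"kappa = {kappa:.2f}")
print({group: round(value, 2) for group, value in sens.items()})
```

In this toy data set, eight of ten cases agree, yet κ is clearly lower than 0.8 because part of that agreement would be expected by chance alone.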

Finally, we calculated a new κ based on data from patients with clinical CRFs available and compared them to the original κ coefficients.
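The restricted re-analysis amounts to filtering the paired records before recomputing κ. The sketch below illustrates the idea on a hypothetical twelve-patient data set; the record layout (clinical value, registry value, CRF flag) and the numbers are assumptions for the example, not study data.

```python
from collections import Counter


def cohen_kappa(pairs):
    """Cohen's kappa for a list of (clinical_value, registry_value) pairs."""
    n = len(pairs)
    observed = sum(a == b for a, b in pairs) / n
    clinical_counts = Counter(a for a, _ in pairs)
    registry_counts = Counter(b for _, b in pairs)
    expected = sum(
        clinical_counts[c] * registry_counts[c] for c in clinical_counts
    ) / (n * n)
    if expected == 1.0:
        return 1.0
    return (observed - expected) / (1.0 - expected)


# Hypothetical records: (chemotherapy per clinical records,
#                        chemotherapy per registry,
#                        clinical CRF received?)
records = (
    [("yes", "yes", True)] * 3
    + [("no", "no", True)] * 4
    + [("yes", "no", True)] * 1
    + [("yes", "no", False)] * 3
    + [("no", "no", False)] * 1
)

all_pairs = [(clin, reg) for clin, reg, _ in records]
crf_pairs = [(clin, reg) for clin, reg, has_crf in records if has_crf]

kappa_all = cohen_kappa(all_pairs)
kappa_crf = cohen_kappa(crf_pairs)
print(f"all patients:      kappa = {kappa_all:.2f}")
print(f"clinical CRF only: kappa = {kappa_crf:.2f}")
```

In the toy data, the patients without a clinical CRF are mostly registered as not having received chemotherapy, so excluding them raises κ; this mirrors the direction, not the magnitude, of the effect examined in the study.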

Patients’ personal data were only used for linkage between FCR data and original medical records and were deleted before analysis. Statistical analyses were performed with R version 3.5.1 in RStudio.

This study was approved by the National Institute of Health and Welfare; THL/923/5.05.00/2014, updated on May 31, 2018.

Results

The study population comprised 1475 patients diagnosed with CRC between September 2004 and May 2012. There were 872 (59%) male and 603 (41%) female patients. Tumour stages were I, 309 (20%); II, 336 (23%); III, 399 (27%); and IV, 316 (21%). Of all tumours, 1051 (71%) were left-sided. Histology confirmed adenocarcinoma for 1261 (86%) patients; the other histological subgroups were mucinous adenocarcinoma, 120 (8.1%); neuroendocrine carcinoma, 26 (1.8%); squamous cell carcinoma, 19 (1.3%); and other, 15 (1.0%).

Agreement for stage was 0.74 (95% confidence interval [CI], 0.72–0.77). The sensitivity for stage in the FCR was 78% for localized, 80% for locally advanced, 75% for lymph node-involved, and 65% for metastatic cancers (Table 2). The stage was missing or unknown for 156 (11%) patients in the FCR and 48 (3.3%) patients in the medical records.

Table 2. Sensitivity of the Finnish Cancer Registry cancer stage at the time of diagnosis compared to the clinical records of the same individuals, 2004–2012.

The agreement for right- or left-sided tumour location was 0.87 (95% CI, 0.85–0.90). Sensitivity was 90% for right-sided tumours and 96% for left-sided tumours. The agreement for more detailed location was 0.78 (95% CI, 0.75–0.80). The sensitivity for detailed location was highest for the rectum (93%) and lowest for the rectosigmoid junction (55%) and descending colon (63%) compared to those of other subsites (70%–89%). The subsite was unknown or missing for 70 (4.7%) patients in the FCR and 5 (0.3%) patients in the medical records. The histological diagnosis of CRC had a κ of 0.72 (95% CI, 0.67–0.77). The sensitivity was 97% for adenocarcinoma and 68% for mucinous adenocarcinoma.

Concerning primary tumour removal by surgery, the κ was 0.46 (95% CI, 0.40–0.51). The information regarding completed primary tumour removal was correct for 91% of patients, but the information on those who had not undergone primary tumour removal was correct for only 54%, and 25% of patients without operations were classified as ‘operated on’ in the FCR (Table 3). Of 151 ‘non-operated on’ patients, 52 (34%) had undergone an abdominal operation without tumour removal according to medical records (e.g. stoma, stent, bypass, or exploratory laparotomy). The FCR had classified 30 (58%) of these patients as having undergone primary tumour removal. Surgical information was unknown or missing for 134 (9%) patients in the FCR and 8 (0.5%) patients in the medical records. The radicality of tumour resection had a κ of 0.24 (95% CI, 0.20–0.27). The sensitivity was 64% for radical and 50% for non-radical tumour resection. Radicality information was recorded as unknown or missing for 342 (31%) patients in the FCR and for 7 (0.5%) patients in the medical records.

Table 3. Sensitivity of the Finnish Cancer Registry data concerning primary tumour removal compared to clinical records of the same individuals, 2004–2012.

The agreement for chemotherapy was 0.47 (95% CI, 0.43–0.51). The sensitivity of reports of no chemotherapy was 83% and the sensitivity for administered chemotherapy was 60%. Of the 819 patients administered chemotherapy, 264 (36%) had been miscategorized in the FCR as ‘chemotherapy not administered’ (Table 4). Adjuvant therapy status was unknown or missing for 134 (9%) patients in the FCR and 87 (6%) patients in the medical records.

Table 4. Sensitivity of the Finnish Cancer Registry data concerning administered adjuvant therapy compared to clinical records of the same individuals, 2004–2012.

A clinical CRF was received for 1302 (88%) patients. When the analysis was restricted to patients with a clinical CRF available, the κ values improved in every category except histological type (Table 5).

Table 5. Cohen’s kappa coefficient of the Finnish Cancer Registry data on colorectal cancer patients diagnosed in 2004–2012.

Discussion

This study evaluated the accuracy of the Finnish Cancer Registry for colorectal cancer by comparing data from the FCR to data collected from original medical patient records. The accuracy of the FCR data varied considerably between variables, with κ coefficients ranging from 0.24 to 0.87. Overall, the FCR data were of good accuracy regarding tumour characteristics, but poor for treatment information.

We observed that a high proportion of the clinical CRFs were missing (12%), which was the main reason for discrepancies between the datasets. After including only patients with a completed CRF, the agreement between datasets improved, with κ ranging from 0.31 to 0.90. This improvement was most prominent in variables dependent on clinical information, such as given treatment. However, completed CRFs did not improve the accuracy of histological diagnosis because the FCR receives independent histology information directly from pathology laboratories.

The FCR has recorded all primary malignancies from individuals in Finland since 1953. Clinical cancer report forms (CRFs) have been primarily paper-based but, since 2016, the aim has been a fully electronic process. Pathology CRFs have mainly been reported electronically since the mid-1980s. Statistics Finland supplies information regarding the causes of death and death certificates for all registered patients with cancer if cancer is mentioned as the cause of death.

The CRFs received are verified in the FCR, and the clinical and pathological information is combined and linked with data from the Population Register Centre on vital status and residence. Data are coded according to international cancer registry guidelines [Citation15]. From 1953 to 2017, coding was entirely manual and performed by FCR internal medical coders [Citation4]. Complete registration comprises diagnostic (stage, location of the tumour, histological diagnosis) and primary treatment information.

To our knowledge, this study is the first to evaluate FCR data on several key tumour characteristic and primary treatment variables against the original medical patient records. Additionally, we studied the effects of missing clinical notifications on data accuracy. Previously, FCR data were validated by reporting the proportion of morphologically verified tumours (MV%), death certificate only registrations (DCO%), and completeness [Citation4]. For colon, rectum/rectosigmoid, and anus tumour sites, the MV% were 95.8%, 97.1%, and 97.0%, respectively, and DCO% were 1.8%, 1.1%, and 0.0%, respectively [Citation4]. It has been previously reported that stage and treatment data were difficult to accurately record in the FCR [Citation3].

Overall, we found high sensitivities for staged colorectal cancers. The primary sources of inconsistency were the higher number of cancers categorized as ‘unknown’ stage in the FCR (10.6%) compared to clinical records (3.3%) and patients in the FCR’s categories of ‘metastatic or locally invasive’ and ‘advanced’, which included patients with incomplete TNM staging received by the FCR.

According to Leinonen et al. [Citation4], only 1.9% of new malignant tumours were registered as ‘primary site unknown’ in the FCR from 2009 to 2013. We studied detailed CRC tumour locations within the large bowel and found very high sensitivity for most sites, excluding the rectosigmoid (55%) and descending colon (63%). Of all rectosigmoid cancers, 19% were classified as ‘sigmoid’ and 25% as ‘rectal’. When these adjacent-site classifications were also counted as correct, the sensitivity for the rectosigmoid area improved to 99%. When we studied the FCR’s ability to differentiate tumour lateralization in the large bowel (right-sided versus left-sided), sensitivities were remarkably high (90% for right and 96% for left). This is important given recent data regarding putative differential mechanisms and prognoses of right- and left-sided tumours [Citation16].
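The rectosigmoid figures above can be checked with simple arithmetic. The sketch below assumes a hypothetical group of 100 rectosigmoid cancers distributed according to the reported percentages, with the remaining 1% assumed to fall in some other category; the names `fcr_site_counts` and `sensitivity` are illustrative.

```python
# Hypothetical: 100 cancers that are rectosigmoid according to clinical records,
# split by the site recorded in the registry (percentages taken from the text;
# the residual 1% is an assumption).
fcr_site_counts = {
    "rectosigmoid junction": 55,
    "sigmoid colon": 19,
    "rectum": 25,
    "other": 1,
}


def sensitivity(counts, accepted_sites):
    """Share of cases whose registry site is among the accepted sites."""
    total = sum(counts.values())
    hits = sum(n for site, n in counts.items() if site in accepted_sites)
    return hits / total


strict = sensitivity(fcr_site_counts, {"rectosigmoid junction"})
lenient = sensitivity(
    fcr_site_counts, {"rectosigmoid junction", "sigmoid colon", "rectum"}
)
print(f"exact site only:          {strict:.0%}")
print(f"adjacent sites accepted:  {lenient:.0%}")
```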

The FCR’s classification of ‘non-operated on’ patients was not reliable, with a poor sensitivity of 54%, but patients who underwent surgery were mostly classified correctly (sensitivity 91%). One source of error was that over half of the patients who underwent surgery without primary tumour removal were counted as ‘operated on’ in the FCR. Furthermore, the radicality of surgical treatment had the lowest accuracy rate, and the proportion of unknown or missing values in the FCR was the highest among the studied variables.

For chemotherapy, ‘not administered’ was registered more correctly (83% sensitivity) than ‘chemotherapy administered’ (60% sensitivity). Notably, 36% of patients who had received chemotherapy were registered as ‘not administered’. One explanation for this is that missing chemotherapy information in the CRF is coded as ‘no therapy administered’ if surgery or any other treatment has been performed.

The study population consisted of patients who were involved in the randomized CRC screening program in Finland, and it was assumed that medical patient records were captured perfectly and without error. It is unlikely that more reporting to the FCR took place in the municipalities included in the CRC screening program compared to municipalities outside the program because the reporting processes were independent. Furthermore, identified missing CRFs were evenly distributed throughout Finland.

We chose to use Cohen’s κ over direct percentage agreement statistics. The key limitation of percentage agreement is that it does not account for chance agreement between raters and may overestimate true accuracy. However, κ also has limitations. It can only be calculated for categories that are common to both data sources. Concerning stage, the κ statistic did not integrate the categories of ‘advanced’ and ‘metastatic or locally advanced’, which did not have values according to the clinical records. It is worth noting that Cohen’s κ usually produces lower values than percentage agreement because it corrects for agreement by chance [Citation17].
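The gap between percentage agreement and κ is largest when one category dominates. The following sketch uses a hypothetical 2 × 2 table of 100 patients (rows: clinical records, columns: registry; categories: operated on, not operated on); the counts and the function name are invented for illustration and are not study data.

```python
def agreement_from_table(table):
    """Return (percentage agreement, Cohen's kappa) for a square table.

    table[i][j] is the number of cases with reference category i
    and registry category j.
    """
    k = len(table)
    n = sum(sum(row) for row in table)
    observed = sum(table[i][i] for i in range(k)) / n
    row_totals = [sum(table[i]) for i in range(k)]
    col_totals = [sum(table[i][j] for i in range(k)) for j in range(k)]
    # Agreement expected by chance from the marginal totals.
    expected = sum(row_totals[i] * col_totals[i] for i in range(k)) / (n * n)
    kappa = (observed - expected) / (1.0 - expected)
    return observed, kappa


# Hypothetical counts: 90 operated-on and 10 non-operated-on patients.
table = [
    [85, 5],  # operated on per clinical records: 85 coded operated, 5 not
    [5, 5],   # not operated on per clinical records: 5 coded operated, 5 not
]

observed, kappa = agreement_from_table(table)
print(f"percentage agreement = {observed:.0%}")
print(f"kappa = {kappa:.2f}")
```

Here the two sources agree for 90 of 100 patients, but because 82% agreement would be expected by chance from the marginal totals alone, κ is only moderate.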

Conclusions

The FCR data achieved commendable accuracy in detailed information; however, we emphasize that improvements must be made to increase the usefulness of the data, especially for clinical research purposes. As a population-based cancer registry, the FCR plays a role in cancer control: to assess the causes and impacts of cancer burden and the effectiveness of screening programs, and to provide grounds for cancer research. The FCR has the aim of providing key cancer-related variables with high coverage and accuracy; thus, complete and accurate data are required for the registry to fulfil its tasks. We showed that missing clinical information leads to an increase in erroneous data in some aspects, including primary treatment. Several key variables rely solely on clinical information, and clinical CRFs are essential for the FCR to obtain high-quality data.

Abbreviations

FCR = Finnish Cancer Registry
CRC = colorectal cancer
CRF = cancer report form
HILMO = Care Register for Health and Welfare
FOBT = faecal occult blood test
ICD-O-3 = International Classification of Diseases for Oncology, third edition
UICC = Union for International Cancer Control
κ = kappa coefficient
CI = confidence interval

Disclosure statement

The authors report no conflicts of interest that would have biased the work.

Additional information

Funding

This work was funded by Helsinki University Hospital’s Research grants and the Cancer Society of Finland. The funders had no involvement in design, data collection, findings or decision to publish.

References

  1. Ministry of Social Affairs and Health. Laki terveydenhuollon valtakunnallisista henkilörekistereistä 1989. 2019 [cited 2019 Dec 19]. Available from: http://www.finlex.fi/fi/laki/alkup/1989/19890556
  2. Jensen OM, Parkin DM, Maclennan R, et al. Cancer registration: principles and methods. Lyon (France): International Agency for Research on Cancer; 1991.
  3. Teppo L, Pukkala E, Lehtonen M. Data quality and quality control of a population-based cancer registry. Experience in Finland. Acta Oncol. 1994;33:365–369.
  4. Leinonen MK, Miettinen J, Heikkinen S, et al. Quality measures of the population-based Finnish Cancer Registry indicate sound data quality for solid malignant tumours. Eur J Cancer. 2017;77:31–39.
  5. Bray F, Parkin DM. Evaluation of data quality in the cancer registry: principles and methods. Part I: comparability, validity and timeliness. Eur J Cancer. 2009;45:747–755.
  6. Malila N, Anttila A, Hakama M. Colorectal cancer screening in Finland: details of the national screening programme implemented in Autumn 2004. J Med Screen. 2005;12:28–32.
  7. Malila N, Palva T, Malminiemi O, et al. Coverage and performance of colorectal cancer screening with the faecal occult blood test in Finland. J Med Screen. 2011;18:18–23.
  8. Malila N, Oivanen T, Malminiemi O, et al. Test, episode, and programme sensitivities of screening for colorectal cancer as a public health policy in Finland: experimental design. BMJ. 2008;337:a2261.
  9. Pitkäniemi J, Seppä K, Hakama M, et al. Effectiveness of screening for colorectal cancer with a faecal occult-blood test, in Finland. BMJ Open Gastroenterol. 2015;2:e000034.
  10. Koskenvuo L, Malila N, Pitkäniemi J, et al. Sex differences in faecal occult blood test screening for colorectal cancer. Br J Surg. 2019;106:436–447.
  11. Pukkala E, Engholm G, Højsgaard Schmidt LK, et al. Nordic Cancer Registries – an overview of their procedures and data comparability. Acta Oncol. 2018;57:440–455.
  12. Amin MB, Edge S, Greene F, et al., editors. AJCC Cancer Staging Manual. 8th ed. New York (NY): Springer International Publishing; 2017.
  13. Fritz AG, editor. International classification of diseases for oncology: ICD-O. 3rd ed. Geneva (Switzerland): World Health Organization; 2000.
  14. Landis JR, Koch GG. The measurement of observer agreement for categorical data. Biometrics. 1977;33:159–174.
  15. Tyczynski JE, Démaret E, Parkin DM, et al., European Network of Cancer Registries, World Health Organization, European Commission. Standards and guidelines for cancer registration in Europe: the ENCR recommendations. Vol. I. Lyon (France): International Agency for Research on Cancer; 2003.
  16. Baran B, Mert Ozupek N, Yerli Tetik N, et al. Difference between left-sided and right-sided colorectal cancer: a focused review of literature. Gastroenterology Res. 2018;11:264–273.
  17. McHugh ML. Interrater reliability: the kappa statistic. Biochem Med (Zagreb). 2012;22:276–282.