ORIGINAL ARTICLES: BREAST CANCER

Validity of Danish Breast Cancer Group (DBCG) registry data used in the predictors of breast cancer recurrence (ProBeCaRe) premenopausal breast cancer cohort study

Pages 1155-1160 | Received 20 Feb 2017, Accepted 29 Apr 2017, Published online: 06 Jun 2017

Abstract

Background: Validation studies of the Danish Breast Cancer Group (DBCG) registry show good agreement with medical records for adjuvant treatment data, but inconsistent recurrence information. No studies have validated changes in menopausal status or endocrine therapy during follow-up. In a longitudinal study, we validated DBCG data using medical records as the gold standard.

Material and methods: From a cohort of 5959 premenopausal women diagnosed during 2002–2010 with stage I–III breast cancer, we selected 151 patients – 77 estrogen-receptor-positive and 74 estrogen-receptor-negative – from three hospitals. We assessed the validity of DBCG registry data on patient, tumor, and treatment factors, and follow-up information on menopausal transition, changes in endocrine therapy, and recurrence. We computed positive predictive values (PPVs) with 95% confidence intervals (95%CI).

Results: Agreement was near perfect for tumor size, lymph node involvement, receptor status, surgery type, and receipt of radiotherapy, chemotherapy, or tamoxifen treatment. The PPV for a change in endocrine therapy in the DBCG was 96% (95%CI = 83, 100). The PPV for menopausal transition was 61% (95%CI = 42, 77). The PPV for DBCG-recorded recurrence was 100%. However, of 19 patients who had a recurrence documented in their medical record, 13 had the recurrence registered in DBCG.

Conclusions: DBCG data are valid for most epidemiological studies of breast cancer treatment. Data on menopausal transition may be less valid, though this interpretation depends on the suitability of medical records for making this assessment. Although recurrence is missing for some, this would not bias most ratio measures of association.

Introduction

Prospectively collected registry data are valuable for epidemiological research, as they reduce study costs. However, since the quality of such research depends on the robustness of the source registry data, it is essential to assess their validity [Citation1].

The Danish Breast Cancer Group (DBCG) registry was established in 1975 to standardize treatment, facilitate clinical trials, and monitor outcomes among Danish breast cancer patients [Citation2]. Ongoing patient registration began in 1977. Between 1977 and 2005, over 90% of Danish women diagnosed with breast cancer were reported to the DBCG registry [Citation2,Citation3]. Since 2006, all patients with a record of invasive breast tumors in the Danish National Pathology Registry have been registered in the DBCG [Citation4]. About one-third of registered patients are enrolled in clinical trials [Citation3]. Prespecified data on tumor, treatment, and patient characteristics are collected, registered, and tracked according to standardized protocols, whether or not a patient participates in a clinical trial [Citation3]. Thus, clinical and epidemiologic research using DBCG data benefits from being population-based and having the data quality of a clinical trial.

A study comparing DBCG registry data (1983–1989) with a complete population-based oncology department database showed high quality of clinical data and complete registration for patients aged 18–69 years, but incomplete registration of patients aged over 70 years with advanced cancer stage or lower life expectancy [Citation5]. The validity of registry data has also been assessed using medical records and the Danish National Patient Registry (DNPR) as gold standards [Citation2,Citation6]. Data were concordant on adjuvant treatment, but less accurate on first event at follow-up (local, regional, or distant recurrence; second primary cancer; contralateral cancer). A validation study comparing registry data with hospital discharge records for low-risk node-negative patients diagnosed 1989–2001 found underreporting of recurrence and second primary cancers in the registry, due largely to early discontinuation of follow-up [Citation6].

The ongoing Predictors of Breast Cancer Recurrence (ProBeCaRe) longitudinal cohort study, based on DBCG registry data, is investigating molecular and other predictors of recurrence in premenopausal women. Here, we validate key variables (menopausal transition, changes in endocrine therapy, and breast cancer recurrence) in ProBeCaRe, using medical records as the gold standard. No previous validation study has focused on premenopausal women or validated changes in menopausal status and changes in endocrine therapy during follow-up.

Methods

The ProBeCaRe study cohort includes all premenopausal Danish women diagnosed with incident non-metastatic breast cancer in Denmark during 2002–2010, whose diagnosis was reported to the DBCG registry. Eligible patients were divided into two groups: (1) those with estrogen-receptor-positive tumors who received tamoxifen therapy (ER+/tamoxifen), and (2) those with estrogen-receptor-negative tumors who did not receive endocrine therapy (ER–/no tamoxifen). All others were excluded. Current follow-up in the cohort is through 1 July 2014.

Patients in the DBCG registry receive follow-up examinations twice yearly for the first 5 years after diagnosis and annually for the subsequent 5 years. The examinations involve clinical assessment for recurrence, and imaging studies or other diagnostic work-up if recurrence is suspected. Patients diagnosed with recurrence between follow-up examinations also are reported to the registry. We retrieved information from the registry on patient age, menopausal status at diagnosis, histologic tumor type and grade, lymph node status, tumor ER status, progesterone receptor status, human epidermal growth factor receptor-2 status, type of primary surgery (mastectomy or breast-conserving surgery), receipt of chemotherapy, radiotherapy, or endocrine treatment, change in menopausal status and endocrine therapy during follow-up, date and site of any recurrence, and date of death.

The Danish civil personal registration number, a unique personal identifier assigned to each Danish citizen at birth or upon immigration, is used in all Danish registries [Citation7]. We used it to link data in the DBCG registry with data in other Danish registries, including the DNPR. The DNPR has recorded all non-psychiatric hospital admissions since 1977 [Citation8] and hospital outpatient and emergency department contacts since 1995. We retrieved information from the DNPR on comorbid disease at time of breast cancer diagnosis and summarized comorbidity using the Charlson Comorbidity Index scores, classified as 0 (no comorbidity), 1 (mild comorbidity), 2 (moderate comorbidity), and 3+ (severe comorbidity) [Citation9].

Study population

We restricted this validation study to patients included in the ProBeCaRe premenopausal study cohort who were diagnosed at Aarhus, Aalborg, and Odense University Hospitals. These hospitals have contributed 356, 313, and 340 patients, respectively, to the ProBeCaRe study population. For each hospital, we stratified patients by diagnostic period (2002–2006, 2007–2011), ER/tamoxifen status, and stage (I, II, III), yielding 12 strata in each hospital. We assigned random numbers to each patient within each of the 36 strata, and each patient was ranked within her stratum using this number.

Medical record review

We completed a medical record review for the first five patients ranked within each stratum. If the first five reviews could not be completed in a given stratum (e.g., because of the unavailability of medical records), we selected the next ranked patient until five reviews within the stratum had been completed or available patients within the stratum were depleted.
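The selection procedure described above (random ranking within each hospital × period × ER/tamoxifen × stage stratum, then reviewing ranked patients until five records are completed or the stratum is exhausted) can be sketched as follows. This is a minimal illustration, not the authors' implementation: the field names (`hospital`, `period`, `er_group`, `stage`, `id`), the seed, and the `record_available` callback are all hypothetical.

```python
import random
from collections import defaultdict


def rank_within_strata(patients, seed=0):
    """Group patients by stratum and order each group by a random number.

    `patients` is a list of dicts with hypothetical keys:
    id, hospital, period, er_group, stage.
    """
    rng = random.Random(seed)  # fixed seed so the ranking is reproducible
    strata = defaultdict(list)
    for p in patients:
        key = (p["hospital"], p["period"], p["er_group"], p["stage"])
        strata[key].append((rng.random(), p["id"]))
    # Sorting by the random number gives each patient a rank within her stratum.
    return {key: [pid for _, pid in sorted(members)]
            for key, members in strata.items()}


def select_for_review(ranked, record_available, per_stratum=5):
    """Take ranked patients in order until `per_stratum` reviews are completed
    or the stratum has no patients left."""
    selected = {}
    for key, ids in ranked.items():
        chosen = []
        for pid in ids:
            if len(chosen) == per_stratum:
                break
            if record_available(pid):  # hypothetical check for a retrievable record
                chosen.append(pid)
        selected[key] = chosen
    return selected
```

With 3 hospitals and 12 strata each, this yields at most 36 × 5 = 180 patients, which matches the size of the initial list reported in the Results.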

Medical records were reviewed by one of two project nurses who were blinded to patient selection criteria. They used a standardized medical abstract form and accompanying codebook to guide the review (see Supplementary Online Content, which also shows the validated variables). We adapted the abstract form and codebook from similar research tools used in earlier studies of breast cancer patients [Citation10–12].

In the medical record review, a new tumor in the contralateral breast was considered a second primary cancer. The DBCG registry’s definition of breast cancer recurrence includes any local, regional or distant recurrence, or contralateral breast cancer, and registry records allowed us to distinguish between contralateral breast cancer and other types of recurrence.

Statistical analyses

We calculated the frequency and proportion of patients in the ProBeCaRe study cohort and in the validation study subset, within categories of the analytic variables. We assumed that the medical record contained perfectly classified (i.e., gold standard) information for each analytic variable. For each variable, we prepared a contingency table to compare the registry data with the medical record data. We calculated the percent agreement as the percentage of observations in the registry that were concordant with the medical record, and calculated Cohen’s kappa statistic [Citation13]. We calculated the positive predictive value (PPV) as the number of patients with a particular characteristic confirmed by the medical record divided by the total number of patients with that characteristic in the DBCG registry. We calculated the completeness of DBCG registration of variables listed in Table 2 as the number of (for example) recurrences registered in the DBCG and listed in the medical records divided by the total number of recurrences documented in the medical records. We examined the clinical, demographic and treatment characteristics of patients with concordant and discordant recurrence data. We considered the last day of recorded follow-up in the DBCG registry as the date of recurrence, date of death, date when a patient went ‘off protocol’ (i.e., dropped out of treatment or follow-up program), or 10 years after her primary diagnosis.
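As a rough illustration of the measures defined above, the following sketch computes percent agreement, Cohen's kappa, PPV, and completeness from the four cells of a 2 × 2 registry-versus-record table. The function and argument names are hypothetical. The example call uses the menopausal-transition counts reported in the Results (17 concordant positives, 11 registry-only, 35 − 17 = 18 record-only); the number of concordant negatives is not reported, so the value of 105 is an assumption (it presumes all 151 patients had usable data), and the kappa it produces need not match the published value.

```python
def validation_metrics(both, registry_only, record_only, neither):
    """Agreement measures for a 2x2 table, with the medical record as gold standard.

    both          -- characteristic present in registry and medical record
    registry_only -- present in registry, absent from medical record
    record_only   -- present in medical record, absent from registry
    neither       -- absent from both
    """
    n = both + registry_only + record_only + neither
    registry_pos = both + registry_only
    record_pos = both + record_only

    ppv = both / registry_pos          # confirmed / all registry positives
    completeness = both / record_pos   # registered / all record positives
    agreement = (both + neither) / n   # observed proportion concordant

    # Agreement expected by chance, from the marginal totals (Cohen's kappa).
    expected = (registry_pos * record_pos
                + (n - registry_pos) * (n - record_pos)) / n ** 2
    kappa = (agreement - expected) / (1 - expected)

    return {"ppv": ppv, "completeness": completeness,
            "agreement": agreement, "kappa": kappa}


# Menopausal transition; `neither=105` is an assumed value, not from the paper.
metrics = validation_metrics(both=17, registry_only=11, record_only=18, neither=105)
print(round(metrics["ppv"], 2), round(metrics["completeness"], 2))
```

The sketch does not guard against empty margins (for example, no registry positives), which would need handling in real use.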

To validate variables that may have changed during follow-up (changes in menopausal status and endocrine therapy), we allowed a 4-month agreement window between dates in the DBCG registry and the medical record. In sensitivity analyses, we varied this window from 1 to 6 months. All analyses were conducted using SAS version 9.2, Research Triangle, NC, USA.
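The agreement window and its sensitivity analysis could be expressed along these lines. This is a sketch under stated assumptions: the source does not say how a "month" was measured, so an average month length of 30.4375 days is assumed here, and the function names and example dates are hypothetical.

```python
from datetime import date

DAYS_PER_MONTH = 30.4375  # assumption: average month length, not from the source


def within_window(registry_date, record_date, months):
    """True if the two dates differ by no more than `months` months."""
    return abs((registry_date - record_date).days) <= months * DAYS_PER_MONTH


def agreement_by_window(date_pairs, windows=range(1, 7)):
    """Proportion of (registry, record) date pairs agreeing, for each window width."""
    return {
        months: sum(within_window(a, b, months) for a, b in date_pairs) / len(date_pairs)
        for months in windows
    }


# Hypothetical date pairs, for illustration only.
pairs = [
    (date(2010, 1, 1), date(2010, 1, 15)),
    (date(2010, 1, 1), date(2010, 2, 20)),
    (date(2010, 1, 1), date(2010, 4, 20)),
    (date(2010, 1, 1), date(2011, 1, 1)),
]
print(agreement_by_window(pairs))
```

Because a wider window can only add agreeing pairs, the proportions are non-decreasing in window width, which is the pattern expected of any such sensitivity analysis.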

Results

This study was approved by the DBCG, the Danish Data Protection Agency (J.nr. 2012-41-1170) and the Ethical Committee of the Central Denmark Region (J.nr. 1-10-72-22-13). A total of 151 patients were included in this validation study. From the initial list of 180 eligible patients, 28 (16%) medical records were not available in the medical journal archives. Descriptive characteristics of the parent ProBeCaRe study cohort and the validation study subset are presented in Table 1. Overall, the characteristics of the validation study subset were similar to those of the total cohort. However, there were some differences: patients in the validation subcohort were younger (a higher proportion aged below 35 years, and a lower proportion aged 50+ years); a higher proportion of the validation subcohort had stage III disease at diagnosis; and a higher proportion of patients in the validation subcohort received mastectomy rather than breast conserving surgery compared with the entire ProBeCaRe study cohort. Patients in the validation subset were more likely to have some comorbid disease at diagnosis than patients in the entire ProBeCaRe study (18% vs. 8%). Due to the sampling criteria for the validation subset, ER status, diagnostic period, and stage characteristics were more uniformly distributed in the validation subset than in the cohort as a whole.

Table 1. Characteristics of the ProBeCaRe breast cancer cohort.

Table 2. Comparison of DBCG registry records with medical records as a gold standard.

We observed high agreement for baseline characteristics—menopausal status at diagnosis, tumor size, lymph node status, receptor status, surgery type (mastectomy versus breast conserving surgery), and the receipt of cancer-directed treatment, including chemotherapy, radiotherapy, and tamoxifen (Supplementary Online Content Table 1). Most patients (92%) eligible for endocrine therapy began tamoxifen within 12 months of primary breast cancer surgery.

Overall, the PPV for menopausal transition in the DBCG registry was 61% (95%CI = 42, 77), with kappa = 0.39 (95%CI = 0.20, 0.57). In total, 35 patients had menopausal transition recorded in the medical record, of whom 17 had this registered in the DBCG. Eleven patients had menopausal transition registered in the DBCG without evidence of this in the medical record. The completeness of registration of menopausal transition in the DBCG was 49% (95%CI = 33%, 65%).
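To show how an interval around a proportion such as 17/28 can be obtained, here is a small sketch using the Wilson score interval. The source does not state which interval method was used, so Wilson is an assumption made for illustration, and the bounds it gives may differ from the published ones by a percentage point or so. The function name is hypothetical.

```python
from math import sqrt


def wilson_interval(successes, n, z=1.959964):
    """Wilson score interval for a binomial proportion (z = 1.96 for 95%)."""
    p = successes / n
    denom = 1 + z ** 2 / n
    centre = (p + z ** 2 / (2 * n)) / denom
    half_width = z * sqrt(p * (1 - p) / n + z ** 2 / (4 * n ** 2)) / denom
    return centre - half_width, centre + half_width


# PPV for menopausal transition: 17 confirmed of 17 + 11 = 28 registry positives.
low, high = wilson_interval(17, 28)
print(f"{17 / 28:.0%} ({low:.0%}, {high:.0%})")
```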

Among patients with ER+ breast cancer (n = 77), 28 patients had a documented change from tamoxifen to an aromatase inhibitor in the medical record; in 25 of these patients, this information was recorded in the DBCG registry. The PPV for change in endocrine therapy in the registry was 96% (95%CI = 83, 100), with kappa = 0.89 (95%CI = 0.78, 0.99). The completeness of recording of change in endocrine therapy in DBCG was 89% (95%CI = 74%, 97%).

The date of recurrence in the DBCG registry compared with the medical record showed 92% agreement when a 4-month window was used. Sensitivity analyses showed 69% agreement for a 1-month window, 77% agreement for 2- and 3-month windows, and 92% agreement for 4-, 5-, and 6-month windows. The PPV for breast cancer recurrence in the DBCG registry was 100%, with kappa = 0.80 (95%CI = 0.65, 0.95). However, of 19 patients with a recurrence documented in their medical records, 13 had this recurrence documented in the DBCG registry. The completeness of recurrence registered in the DBCG was 70% (95%CI = 48%, 86%).

Discussion

Our study shows strong agreement between DBCG registry data and medical records for patient, tumor, and treatment variables recorded at diagnosis, findings that are consistent with previous validation studies [Citation3,Citation5,Citation14]. In addition, the longitudinal design of the parent ProBeCaRe study allowed evaluation of changes in menopausal status and endocrine therapy during follow-up [Citation15]. Our findings suggest that changes in endocrine therapy as documented in the DBCG registry are valid.

However, DBCG registry data on menopausal transition appear to be less valid, and medical records seem to be a poor gold standard for this variable. In the absence of a confirmatory blood test or oophorectomy, it may be difficult to distinguish menopausal transition from the transient side effects of chemotherapy or tamoxifen [Citation16]. Tamoxifen- or chemotherapy-induced transient amenorrhea can preclude the assessment of ovarian function [Citation17], leading to difficulty in determining the optimal time to switch from tamoxifen to aromatase inhibitors. Several medical records that we reviewed mentioned irregular menses during chemotherapy, with menstruation resuming after treatment, or amenorrhea during tamoxifen treatment, but a premenopausal status on blood tests. Given the potentially poor validity of menopausal transition in the DBCG registry, we recommend that future studies using DBCG registry data model menopausal status as a baseline covariate. Modeling the type of endocrine therapy recorded in the DBCG registry as a time-varying variable is more defensible.

Breast cancer recurrence is the outcome in our longitudinal ProBeCaRe study. Our findings suggest that a record of breast cancer recurrence may be missing in the DBCG registry for up to 30% of patients, despite these patients being within 10 years of their primary breast cancer diagnosis. We also note that these patients did not have any evidence of a withdrawal from follow-up. Previous studies validating registry data for patients diagnosed 1989–2001 with follow-up through March 2007 indicate some underreporting [Citation5,Citation6], due to withdrawal from follow-up of patients who developed a new primary malignancy [Citation5] and early termination of follow-up in patients perceived to have low recurrence risk, contrary to DBCG guidelines [Citation6]. Underreporting of recurrence has also been observed among patients aged over 70 years at diagnosis [Citation6], although such patients are not included in our premenopausal cohort. The missing registration of recurrence for 30% of patients will impact estimates of the absolute risk of recurrence, so the risk difference of recurrence in ProBeCaRe will be downwardly biased by 30% [Citation15,Citation18]. However, this 30% is expected to be non-differential, as we found no examples of false-positive recurrences recorded in the registry. Therefore, ratio measures of association using recurrence as the outcome are likely to be unbiased [Citation19]. This expectation of no bias arises because the non-differential sensitivity reduces the numerator and the denominator of ratio estimates of association in equal proportion, and therefore cancels, so long as there are also no false-positive recurrences. We have included the mathematical equations to illustrate this bias, and an example to illustrate the various bias scenarios, in the online Supplementary Appendix 1.
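The cancellation argument above can be checked numerically. The sketch below is not the derivation in the Supplementary Appendix; it is a minimal illustration in which the true risks (20% and 10%) are hypothetical, sensitivity is set to 0.70 to mirror the roughly 30% of missing recurrences, and specificity defaults to 1 (no false positives). Function names are also hypothetical.

```python
def observed_risk(true_risk, sensitivity, specificity=1.0):
    """Risk seen in the registry when the outcome is imperfectly recorded."""
    true_positives = sensitivity * true_risk
    false_positives = (1 - specificity) * (1 - true_risk)
    return true_positives + false_positives


def ratio_and_difference(risk_exposed, risk_unexposed, sensitivity, specificity=1.0):
    """Observed risk ratio and risk difference under non-differential misclassification."""
    obs_exposed = observed_risk(risk_exposed, sensitivity, specificity)
    obs_unexposed = observed_risk(risk_unexposed, sensitivity, specificity)
    return obs_exposed / obs_unexposed, obs_exposed - obs_unexposed


# Hypothetical true risks: 20% in the exposed, 10% in the unexposed
# (true risk ratio 2.0, true risk difference 0.10).
rr, rd = ratio_and_difference(0.20, 0.10, sensitivity=0.70)
print(f"risk ratio {rr:.2f}, risk difference {rd:.3f}")

# With imperfect specificity the ratio no longer cancels.
rr_fp, _ = ratio_and_difference(0.20, 0.10, sensitivity=0.70, specificity=0.95)
print(f"risk ratio with false positives {rr_fp:.2f}")
```

In the first call, both observed risks are scaled by 0.70, so the ratio is preserved while the difference shrinks by 30%. In the second call, false positives add to both risks unequally in relative terms and pull the ratio toward the null, which is why the absence of false-positive recurrences matters for the argument.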

Several factors should be considered when interpreting our study. We restricted this validation study to patients diagnosed at three major hospitals. The characteristics of patients diagnosed at these hospitals are representative of the entire ProBeCaRe study cohort, supporting the generalizability of our findings to the parent study population. Nonetheless, aside from the expected differences due to the sampling criteria for the validation subcohort, we did observe some differences in baseline characteristics between the entire ProBeCaRe study cohort and the validation subcohort, including the distribution of comorbid disease and age. We believe that these differences are secondary to the stratified sampling. Although our validation study included fewer patients than previous validation studies, we performed detailed medical record review to assess data quality for patient, tumor, treatment, and follow-up characteristics, some of which have not been validated previously. Many patients included in our study may still be undergoing active treatment and follow-up, so not all medical records were available from the hospital archives. Although misclassification of breast cancer recurrence affects the recurrence rate, it is unlikely to bias associations measured on the relative scale. The high PPVs for all variables except menopausal transition support the validity of the DBCG data.

Supplemental material

IONC_A_1327720_Supplementary_Information.zip


Acknowledgments

The authors thank the Danish Breast Cancer Group for preparation of the initial dataset, and Henriette Kristoffersen and Hanne M. Madsen for reviewing medical records.

Disclosure statement

No potential conflict of interest was reported by the authors.

Additional information

Funding

The study was supported by grants from the US National Institutes of Health, National Cancer Institute (R01CA166825) (T.L.L.), the Program for Clinical Research Infrastructure (PROCRIN), established by the Lundbeck Foundation and the Novo Nordisk Foundation (H.T.S.), and the Lundbeck Foundation (R167-2013-15861) (D.C.F.), and Susan G. Komen for the Cure (CCR13264024) (T.P.A.). The funding agencies had no role in the design of the study; the collection, analysis, and interpretation of the data; the writing of the article; or the decision to submit the article for publication.

References

  1. Ehrenstein V, Antonsen S, Pedersen L. Existing data sources for clinical epidemiology: Aarhus University prescription database. Clin Epidemiol. 2010;2:273–279.
  2. Blichert-Toft M, Christiansen P, Mouridsen HT. Danish Breast Cancer Cooperative Group – DBCG: history, organization, and status of scientific achievements at 30-year anniversary. Acta Oncol. 2008;47:497–505.
  3. Moller S, Jensen MB, Ejlertsen B, et al. The clinical database and the treatment guidelines of the Danish Breast Cancer Cooperative Group (DBCG); its 30-years experience and future promise. Acta Oncol. 2008;47:506–524.
  4. Danish Breast Cancer Cooperative Group. Kvalitetsindikatorrapport for brystkræft 2006 og 2007 [Quality indicator report for breast cancer 2006-2007]; 2008.
  5. Jensen AR, Storm HH, Moller S, et al. Validity and representativity in the Danish Breast Cancer Cooperative Group – a study on protocol allocation and data validity from one county to a multi-centre database. Acta Oncol. 2003;42:179–185.
  6. Christiansen P, Al-Suliman N, Bjerre K, et al. Recurrence pattern and prognosis in low-risk breast cancer patients-data from the DBCG 89-A programme. Acta Oncol. 2008;47:691–703.
  7. Schmidt M, Pedersen L, Sorensen HT. The Danish Civil Registration System as a tool in epidemiology. Eur J Epidemiol. 2014;29:541–549.
  8. Schmidt M, Schmidt SA, Sandegaard JL, et al. The Danish National Patient Registry: a review of content, data quality, and research potential. Clin Epidemiol. 2015;7:449–490.
  9. Charlson ME, Pompei P, Ales KL, et al. A new method of classifying prognostic comorbidity in longitudinal studies: development and validation. J Chronic Dis. 1987;40:373–383.
  10. Lash TL, Fox MP, Thwin SS, et al. Using probabilistic corrections to account for abstractor agreement in medical record reviews. Am J Epidemiol. 2007;165:1454–1461.
  11. Thwin SS, Clough-Gorr KM, McCarty MC, et al. Automated inter-rater reliability assessment and electronic data collection in a multi-center breast cancer study. BMC Med Res Methodol. 2007;7:23.
  12. Silliman RA, Guadagnoli E, Rakowski W, et al. Adjuvant tamoxifen prescription in women 65 years and older with primary breast cancer. J Clin Oncol. 2002;20:2680–2688.
  13. Roberts C. Modelling patterns of agreement for nominal scales. Stat Med. 2008;27:810–830.
  14. Rostgaard K, Holst H, Mouridsen HT, et al. Do clinical databases render population-based cancer registers obsolete? The example of breast cancer in Denmark. Cancer Causes Control. 2000;11:669–674.
  15. Rothman KJ, Greenland S, Lash TL. Modern epidemiology. 3rd ed. Philadelphia (PA): Lippincott Williams & Wilkins; 2008.
  16. Pan K, Chlebowski RT. Adjuvant endocrine therapy of perimenopausal and recently postmenopausal women with hormone receptor-positive breast cancer. Clin Breast Cancer. 2014;14:147–153.
  17. Torino F, Barnabei A, De Vecchis L, et al. Recognizing menopause in women with amenorrhea induced by cytotoxic chemotherapy for endocrine-responsive early breast cancer. Endocr Relat Cancer. 2012;19:R21–R33.
  18. Rodgers A, MacMahon S. Systematic underestimation of treatment effects as a result of diagnostic test inaccuracy: implications for the interpretation and design of thromboprophylaxis trials. Thromb Haemost. 1995;73:167–171.
  19. Rothman KJ, Greenland S, Lash TL. Validity in epidemiologic studies. In: Rothman KJ, Greenland S, Lash TL, editors. Modern epidemiology. Philadelphia (PA): Lippincott Williams & Wilkins; 2008. p. 128–147.
