ORIGINAL RESEARCH

Identifying Individuals with Physician Diagnosed COPD in Health Administrative Databases

, , MD, MSc, , MSc, , MD, , PhD, ACNP, CAE & , PhD
Pages 388-394 | Published online: 08 Oct 2009

Abstract

Chronic Obstructive Pulmonary Disease (COPD) is a common chronic respiratory disease responsible for significant morbidity and mortality. Population-based health administrative databases provide a powerful and unbiased way of studying COPD in the population; however, their ability to accurately identify patients with this disease must first be confirmed. The objective was to validate population-based health administrative definitions of COPD. Previously abstracted medical records of adults over the age of 35 randomly selected from primary care practices in Ontario, Canada were reviewed by an expert panel to establish if an individual did or did not have a diagnosis of COPD. These reference designations were then linked to each individual's respective health administrative database record and compared with predefined health administrative data definitions of COPD. Concepts of diagnostic test evaluation were used to calculate and compare their test characteristics. The most sensitive health administrative definition of COPD was 1 or more ambulatory claims and/or 1 or more hospitalizations for COPD, which yielded a sensitivity of 85.0% (95% confidence interval 77.0 to 91.0) and a specificity of 78.4% (95% confidence interval 73.6 to 82.7). As the number of ambulatory claims in the definition increased, sensitivity decreased and specificity increased. Individuals with COPD can be accurately identified in health administrative data, which may therefore be used to create an unbiased population cohort for surveillance and research. This offers a powerful means of generating evidence to inform strategies that optimize the prevention and management of COPD.

INTRODUCTION

Chronic obstructive pulmonary disease (COPD) is a common, chronic respiratory condition that imposes a heavy and expensive burden on individuals and health care systems. Previous population work evaluating the burden of COPD has been based on national population-based cross-sectional surveys and cohort studies in specific populations (Citation[1]). Although these estimates are useful, they are limited by their cross-sectional nature, their focus on specific populations, and/or their inability to compare estimates between smaller regional areas or patient subgroups. They have also generally excluded individuals living in institutions or on reserves, two groups that would be expected to have a higher prevalence of COPD. Ideally, a prospective clinical disease registry of all affected individuals would be used to understand the burden of COPD; however, the resources and time that would be needed to find and enter the hundreds of thousands of people who have COPD would be considerable. Health administrative (HA) databases provide an unbiased and more practical way of surveying and studying COPD in the population that is relatively inexpensive and quick. This is particularly true in a large, universal, single-payer health care system where all health care claims for the population can be accessed from a few linkable databases. Analysis of HA data can be useful to health policy makers and clinical researchers to define incidence and prevalence of COPD, determine burden of disease, ascertain resource utilization, follow cohorts of individuals with COPD longitudinally, and develop and evaluate disease management strategies.

In order to create an accurate and comprehensive COPD surveillance and research program, health administrative population-based data that captures all hospital and emergency department admissions, physician visits, laboratory and diagnostic tests would be ideal. Such a program based on HA databases cannot succeed, however, without first confirming the validity of a COPD diagnosis within these databases. It cannot be presumed that HA data is always valid (Citation[2]). In Ontario, physicians provide diagnostic data, usually recorded at the time of patient encounter, and there are no incentives tied to accuracy. In addition, there is relatively little verification of data once it has been entered. Therefore, in order to validate the diagnosis of COPD in Ontario population-based HA databases—specifically to assess HA algorithms for case capture and to determine their measures of performance (sensitivity, specificity, predictive values)—a validation study was conducted.

MATERIALS AND METHODS

Study design

A chart verification study comparing HA definitions of COPD to clinical diagnoses of individuals with and without COPD was conducted.

Subjects

Abstractions of patient charts from primary care physician offices were obtained for the purpose of validating respiratory disease diagnoses in HA databases. Four hundred forty-two charts for adults 35 years of age and older were used in the current study. These charts were from patients with COPD (113 charts), asthma, other respiratory conditions and controls without respiratory disease as described in detail below. Patients with asthma and respiratory-related conditions were included in order to ensure that a validated HA definition of COPD would be able to distinguish COPD from these other diseases that it clinically resembles. Patients with these conditions were also included because individuals with undiagnosed, clinically significant COPD were most likely to be found in these groups, thereby allowing such patients to be represented in the study cohort.

Chart abstractions

Study patients were randomly selected from primary care physician (PCP) practices which were, in turn, randomly selected from PCP practices across Ontario, a province in Canada with a population of about 12 million people. The sampling frame for the PCP practices included 5821 primary care practitioners identified in the Canadian Medical Directory Database who had treated 30 or more COPD and 30 or more asthma patients in 2003. This was done because recruiting PCPs and travelling to their offices was a very resource-intensive process and we wanted to ensure sufficient patients to study. This frame was then stratified into eastern, central and western Ontario and equal numbers of PCPs were selected from each region. A letter was mailed and a follow-up phone call made to 81 PCPs inviting them to participate in the study. Forty responded, and of these 22 chose not to participate. A further 5 were excluded because they did not have an office-based practice, did not have the time or space to accommodate a chart abstractor working in their office, or did not use an electronic claim submission and billing system. This left 13 PCPs who agreed and were eligible to participate.

In each PCP practice that participated, the electronic billing system was used to create a sampling frame that included all patients born between fiscal years 1925 and 1986 (ages 19 to 80 years old inclusive), who were currently residing in Ontario and who were seen by the PCP in the previous 5 years. The sampling frame was then stratified into four diagnostic categories according to the information from the PCP patient chart. The four diagnostic categories were COPD, asthma, respiratory-related, and non-respiratory conditions. The last category was limited to patients with hypertension or musculoskeletal problems (Table 1). Of note, it was possible for an individual to be eligible for inclusion in both a COPD and a non-COPD category. In order to reflect clinical reality, these patients were not excluded and were instead included in the most recent category for which they were eligible.

Table 1 Ontario Health Insurance Plan codes used to identify COPD, asthma, respiratory related, and other categories

In each PCP practice, ten charts were randomly selected and abstracted from each of the four diagnosis categories by a trained abstractor. The patient's entire chart was reviewed and a standardized abstraction form that included a full respiratory history, smoking history, medical history, family history, concurrent medical conditions, medications prescribed, physical examination, pulmonary function studies, and radiographic examinations was completed. Also abstracted was the unique health insurance number given to all individuals in Ontario insured under its universal health insurance system. Completed abstractions were then reviewed for completeness.

Sample size calculation

In order to obtain sensitivity and specificity estimates of approximately 85% with a precision that ensured that the 95% confidence intervals were within 10% of the estimate, it was determined that 110 individuals with COPD who made up 25% of the study population were needed (Citation[3], Citation[4]).

Determining Gold Standard diagnoses of COPD

An expert panel of two pulmonologists, who were blinded to the diagnostic category but not to the objectives of the study, reviewed the patient chart abstractions independently to identify the cases as COPD or non-COPD. Where consensus could not be reached, a third pulmonologist was consulted as the “tie-breaker”. The consensus reached by the expert panel was considered the reference standard for COPD diagnosis. Although, without pulmonary function tests being available for every patient, this reference standard might not be considered the most rigorous possible, it surpasses the standard by which the majority of patients with COPD are diagnosed (i.e., by a lone PCP without pulmonary function tests) and therefore was considered acceptable (Citation[5]). It was felt to be particularly suitable because it was being used to validate an HA definition of physician-diagnosed COPD for use in the general population, where most COPD is diagnosed by PCPs (Citation[6]). Of note, to facilitate analysis, reviewers were asked to ‘force’ patients into COPD or non-COPD groups even though, in some situations, it might have been appropriate for them to have been in both.

COPD HA definitions

Fields from two HA databases were used to create COPD HA definitions to be evaluated against the reference standard diagnosis. The outpatient database used was the Ontario Health Insurance Plan (OHIP) physician services claims database, which contains information on outpatient claims for all Ontario residents (including claims for physician visits, laboratory tests, and diagnostic imaging). Physicians are reimbursed through OHIP by submitting claims for medical services provided. One disease code is provided as part of a claim. The diagnostic codes used to identify COPD OHIP claims are described in Table 1. The second database was the Canadian Institute for Health Information (CIHI) discharge abstract database (DAD), which contains clinical and administrative data for every hospital visit in Canada including diagnostic codes from the International Classification of Diseases 10th Revision (ICD-10). CIHI collects 1 most responsible and 24 secondary diagnoses and, if any one was for COPD, the hospitalization was considered a COPD hospitalization. Submission of accurate information to the CIHI DAD is mandatory for all hospitals in Canada in order to receive funding. Under its provincial health insurance, the province of Ontario only provides medications for individuals aged 65 and older. Therefore, because information on medication use was not available for all participants, it was not included as a component of the HA definitions.
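The claim-count definitions described above can be illustrated with a minimal sketch. It assumes that a patient's records have already been filtered to COPD-coded events, so that only the dates of COPD ambulatory claims and COPD hospitalizations remain. The function name `meets_definition`, its parameters, and the choice to let any COPD hospitalization qualify regardless of the time window are assumptions made for illustration, not the authors' implementation.

```python
from datetime import date, timedelta


def meets_definition(claim_dates, hosp_dates, min_claims=1, window_years=None):
    """Return True if a patient satisfies 'min_claims or more ambulatory
    claims and/or 1 or more hospitalizations' for COPD.

    claim_dates  -- dates of COPD-coded ambulatory claims (hypothetical input)
    hosp_dates   -- dates of COPD-coded hospitalizations (hypothetical input)
    window_years -- if given, the claims must fall within this many years
    """
    # Assumption: a single COPD hospitalization qualifies on its own.
    if hosp_dates:
        return True
    claims = sorted(claim_dates)
    if window_years is None:
        return len(claims) >= min_claims
    window = timedelta(days=365 * window_years)
    # Look for min_claims consecutive (sorted) claims inside one window.
    for i in range(len(claims) - min_claims + 1):
        if claims[i + min_claims - 1] - claims[i] <= window:
            return True
    return False


# Hypothetical patient with two claims more than three years apart.
claims = [date(2003, 1, 10), date(2006, 5, 2)]
print(meets_definition(claims, [], min_claims=1))                  # broad definition
print(meets_definition(claims, [], min_claims=2, window_years=2))  # stricter definition
```

Raising `min_claims` or shortening the window makes the definition harder to satisfy, which is the mechanism by which stricter definitions trade sensitivity for specificity.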

Determining test characteristics of HA claim algorithms

Patient reference standard diagnoses were linked to their HA record via their unique health insurance number. Predefined COPD HA definitions were compared to the reference standard diagnoses and analyzed using the concepts of diagnostic test evaluation. Specifically, the sensitivity, specificity, positive predictive value (PPV), negative predictive value (NPV), positive likelihood ratio, and negative likelihood ratio of the COPD HA definitions were determined.
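As a hedged illustration of these measures, the sketch below computes them from a 2×2 table of HA definition result against reference standard. The counts are illustrative only: they were chosen so that, with 113 COPD and 329 non-COPD patients, they approximately reproduce the sensitivity and specificity quoted in the abstract for the broadest definition; they are not taken from the paper's tables. The function name `diagnostic_characteristics` is likewise an assumption.

```python
def diagnostic_characteristics(tp, fp, fn, tn):
    """Compute standard diagnostic test measures from a 2x2 table.

    tp -- reference COPD, flagged by the HA definition
    fn -- reference COPD, not flagged
    fp -- reference non-COPD, flagged
    tn -- reference non-COPD, not flagged
    Note: the likelihood ratios are undefined when specificity is 0 or 1.
    """
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "ppv": tp / (tp + fp),
        "npv": tn / (tn + fn),
        "lr_positive": sensitivity / (1 - specificity),
        "lr_negative": (1 - sensitivity) / specificity,
    }


# Illustrative counts (assumed, not from the paper): 113 COPD, 329 non-COPD.
stats = diagnostic_characteristics(tp=96, fp=71, fn=17, tn=258)
for name, value in stats.items():
    print(f"{name}: {value:.3f}")
```

Unlike sensitivity and specificity, the predictive values depend on the proportion of COPD patients in the sample, so values computed in an enriched study cohort would not carry over directly to the general population.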

RESULTS

Expert panel review

The 442 chart abstractions were given to the expert panel to determine the gold standard reference diagnosis. The overall inter-rater agreement for a COPD diagnosis between the two main reviewers ranged from a kappa of 0.36 to a kappa of 0.92 depending on site (Table 2). Of the 442 charts reviewed, there were 36 (7.0%) disagreements that were settled by a tie-breaker.

Table 2 Site-specific inter-rater agreement between panel reviewers for COPD diagnosis
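For readers unfamiliar with the kappa statistic, the following sketch shows how agreement between two reviewers can be computed. It assumes the statistic reported is Cohen's kappa for two raters, which the text does not state explicitly, and the reviewer labels are invented for illustration.

```python
def cohens_kappa(rater_a, rater_b):
    """Cohen's kappa for two raters labelling the same items.

    kappa = (observed agreement - chance agreement) / (1 - chance agreement)
    Undefined (division by zero) if chance agreement equals 1.
    """
    n = len(rater_a)
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    labels = set(rater_a) | set(rater_b)
    # Chance agreement: product of each rater's marginal label frequencies.
    expected = sum(
        (rater_a.count(label) / n) * (rater_b.count(label) / n)
        for label in labels
    )
    return (observed - expected) / (1 - expected)


# Invented labels for ten charts: 1 = COPD, 0 = non-COPD.
reviewer_1 = [1, 1, 1, 0, 0, 0, 0, 0, 0, 0]
reviewer_2 = [1, 1, 0, 0, 0, 0, 0, 0, 0, 1]
kappa = cohens_kappa(reviewer_1, reviewer_2)
print(f"kappa = {kappa:.2f}")
```

In this invented example the reviewers agree on 8 of 10 charts, but because much of that agreement would be expected by chance, kappa is considerably lower than the raw agreement of 0.80.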

Study cohort

There were 113 COPD and 329 non-COPD patients according to the gold standard diagnoses. Characteristics of the final study population are summarized in Table 3. Compared to non-COPD patients, COPD patients were older, more likely to be male, and more likely to be current or prior smokers.

Table 3 Demographic and COPD related characteristics for patients categorized as having COPD or non-COPD according to the expert panel reference standard diagnosis

Test characteristics of COPD HA algorithms

To determine if individuals with COPD could be accurately identified by HA data, different HA definitions were compared to the reference standard diagnoses. Table 4 shows the diagnostic test evaluation statistics, including the sensitivity, specificity, PPV, NPV, positive likelihood ratio, and negative likelihood ratio of the HA definitions.

Table 4 Test characteristics of health administrative definitions of COPD using the expert panel as the reference standard
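The confidence intervals reported alongside such estimates can be approximated with a short sketch. The paper does not state which interval method was used; the Wilson score interval below is one common choice and is an assumption here, as is the count of 96 of 113 COPD patients, which was picked only because it is consistent with the reported sensitivity of 85.0%.

```python
import math


def wilson_interval(successes, n, z=1.96):
    """Wilson score confidence interval for a binomial proportion.

    z = 1.96 corresponds to an approximate 95% interval.
    """
    p = successes / n
    denom = 1 + z**2 / n
    centre = (p + z**2 / (2 * n)) / denom
    half_width = z * math.sqrt(p * (1 - p) / n + z**2 / (4 * n**2)) / denom
    return centre - half_width, centre + half_width


# Assumed count: 96 of 113 reference-standard COPD patients flagged.
low, high = wilson_interval(96, 113)
print(f"sensitivity = {96 / 113:.3f}, 95% CI {low:.3f} to {high:.3f}")
```

The interval width shrinks roughly with the square root of the number of COPD patients, which is why the sample size calculation is driven by the number of reference-standard cases rather than the total number of charts.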

Secondary analysis

Accepting that the PCPs might have had relevant information about their patients' respiratory status that they did not record in their charts because of lack of time in their busy medical practices, the same analysis was repeated using the PCP diagnosis instead of the expert panel diagnosis as the reference standard. According to the PCPs, 117 of the 442 patients had a diagnosis of COPD. Table 5 shows the diagnostic test characteristics of the predefined HA definitions compared to this secondary reference standard.

Ethics committee approval

Ethics committee approval was obtained through the Hospital for Sick Children, Toronto, Ontario, Canada.

DISCUSSION

In this validation study of HA algorithms to identify COPD, one or more physician claims and/or one or more hospitalizations for COPD in an individual over the age of 35 years was the HA COPD definition found to have the best combination of sensitivity and specificity. This would be the best algorithm to identify individuals with COPD for surveillance, ensuring first that few people with COPD are missed (few false negatives) and second that few people with other diseases are incorrectly included. For research purposes, or if only patients with definite COPD are desired, the algorithms of 2 or more physician claims and/or 1 or more hospitalizations in 2 or 3 years offer superior specificity (few false positives) at the expense of sensitivity. Thus, the purpose of the algorithm will drive the definition to be used.

The results of this study can be used to construct a cohort of all individuals with COPD in Ontario. This inclusive cohort could be used for COPD surveillance, to determine natural history of COPD, and/or to study COPD comorbidity. It can also be used for research, for example, to determine post-marketing COPD medication safety. Whatever the purpose, one needs to consider the false positives and negatives of the HA algorithm used and how they affect the results.

Two previous studies have examined the validity of using one ambulatory care COPD claim to diagnose COPD in Canadian provincial HA databases. Both were conducted at specific medical clinics in Quebec. Similar to the current study, one concluded that COPD could be identified in HA databases reliably and comprehensively (Citation[3]). The other concluded that the HA diagnosis of COPD lacked validity, largely because of a high number of false positives (Citation[2]). Compared to this latter study, the current study used fewer types of OHIP diagnostic codes to identify COPD in HA data and, although a number of false positives were identified, resulted in more favourable test characteristics. There has been one previous study that validated COPD in Canadian hospital databases where, depending on the diagnostic code examined, an agreement of 68 to 83% was found between hospital discharge data and the claim records (Citation[7]). The current study extends the findings of these previous studies by validating the diagnosis of physician-diagnosed COPD using ambulatory care and hospitalization claims in the general population of Ontario. It is also the first to examine and compare the validity of different HA claim definitions to identify COPD.

COPD should be diagnosed by pulmonary function tests, and this study shows, as others have before it, that spirometry use in the community is low (Citation[6]). This is why expert opinion, and not PCP diagnosis, was chosen as the reference standard. Lack of spirometry for some of the patients was a consideration in the panel's decisions. Ideally all the patients included would have received spirometry but, in practice, if we had only studied patients who had received spirometry it would have introduced bias into our population-based sample. Clinical evaluation without spirometry has been found to be specific for COPD, but not highly sensitive, and is therefore likely to miss milder cases (Citation[8]). The HA algorithm validated may likewise miss milder COPD. Nonetheless, it will identify individuals with clinically significant COPD who are responsible for most of the burden of COPD and demand the most health care resources. Such a group is of interest to health care providers, researchers, and policy makers wishing to understand the real world burden of COPD and the effectiveness (as opposed to efficacy) of management strategies used to treat it.

The strengths of this study were its sampling frame consisting of patients from different parts of the province, its generalizability, its use of comparison groups of conditions that resemble COPD, and its ability to compare clinical diagnoses to HA algorithms to determine validity.

The main limitation of this study was that, because a strict population sample was not used, the estimates of test characteristics were only approximations of the true population test characteristics. There were a few practical aspects of study design that might have biased our results. For example, using a study sample heavily weighted towards conditions that resembled COPD may have led to an underestimate of the test characteristics. In addition, assigning patients who may have been eligible for both the COPD and non-COPD categories to only one of these categories may also have led to an underestimate of the true test characteristics. Conversely, including a larger proportion of individuals with COPD than seen in the general population may have led to an overestimate of the test characteristics. Finally, focusing on physicians who had seen at least 30 patients with respiratory disease could also have introduced bias, although it is uncertain in which direction. Such practical compromises are often made in studies evaluating diagnostic tests (10,11). Because they apply to all the HA algorithms considered, however, they would not influence how the algorithms perform relative to each other.

In summary, we performed a chart abstraction study to validate HA algorithms that accurately identify individuals with COPD in Ontario HA databases and found that, for individuals over the age of 35, one or more ambulatory care and/or hospitalization claims was both sensitive and specific in identifying individuals with COPD. This HA algorithm should be re-validated every few years or whenever there is a significant change in practice or health care system billing to ensure that it remains accurate. Thus, such information may be used to construct COPD cohorts for surveillance and research and offers a powerful way to produce ‘real world’ evidence to guide clinical care and improve outcomes for people living with COPD.

Declaration of interest

The authors report no conflicts of interest. The authors alone are responsible for the content and writing of the paper.

Funding for this project was made available through the Ontario Ministry of Health and Long-Term Care. Dr. Gershon is supported by a research fellowship from the Canadian Institutes of Health Research, Institute of Population and Public Health and The Public Health Agency of Canada. Dr. To is supported by The Dales Award in Medical Research from the University of Toronto, Toronto, Ontario, Canada. This study was supported by the Institute for Clinical Evaluative Sciences (ICES), which is funded by an annual grant from the Ontario Ministry of Health and Long-Term Care (MOHLTC). The opinions, results and conclusions reported in this paper are those of the authors and are independent from the funding sources. No endorsement by ICES or the Ontario MOHLTC is intended or should be inferred.

REFERENCES

  • Halbert R J, Natoli J L, Gano A, Badamgarav E, Buist A S, Mannino D M. Global burden of COPD: systematic review and meta-analysis. Eur Respir J 2006; 28(3)523–532
  • Lacasse Y, Montori V M, Lanthier C, Maltis F. The validity of diagnosing chronic obstructive pulmonary disease from a large administrative database. Can Respir J 2005; 12(5)251–256
  • McKnight J, Scott A, Menzies D, Bourbeau J, Blais L, Lemiere C. A cohort study showed that health insurance databases were accurate to distinguish chronic obstructive pulmonary disease from asthma and classify disease severity. J Clin Epidemiol 2005; 58(2)206–208
  • Carley S, Dosman S, Jones S R, Harrison M. Simple nomograms to calculate sample size in diagnostic studies. Emerg Med J 2005; 22(3)180–181
  • O'Donnell D E, Aaron S, Bourbeau J, Hernandez P, Marciniuk D D, Balter M, et al. Canadian Thoracic Society recommendations for management of chronic obstructive pulmonary disease—2007 update. Can Respir J 2007; 14(Suppl B)5B–32B
  • Jaakkimainen L, Klein-Geltink J E, Guttmann A, Barnsley J, Zagorski B M, Kopp A. Indicators of Primary Care Based on Health Administrative Data. Primary Care in Ontario: ICES Atlas, L Jaakkimainen, R E Upshur, J E Klein-Geltink, S Maaten, S E Schultz, A Leong, et al. Institute for Clinical Evaluative Sciences, Toronto 2006
  • Rawson N S, Malcolm E. Validity of the recording of ischaemic heart disease and chronic obstructive pulmonary disease in the Saskatchewan health care datafiles. Stat Med 1995; 14(24)2627–2643
  • Qaseem A, Snow V, Shekelle P, Sherif K, Wilt T J, Weinberger S, et al. Diagnosis and management of stable chronic obstructive pulmonary disease: a clinical practice guideline from the American College of Physicians. Ann Intern Med 2007; 147(9)633–638
