A new staging strategy for chronic obstructive pulmonary disease

Abstract

Background

The best method for expressing lung function impairment is undecided. We tested in a population of patients with chronic obstructive pulmonary disease (COPD) whether forced expiratory volume in 1 second (FEV₁) or FEV₁ divided by height squared (FEV₁/ht²) was better than FEV₁ percent predicted (FEV₁PP) for predicting survival.

Method

FEV₁, FEV₁PP, and FEV₁/ht² recorded post bronchodilator were compared as predictors of survival in 1095 COPD patients followed for 15 years. A staging system for severity of COPD was defined from FEV₁/ht² and compared with the Global Initiative for Chronic Obstructive Lung Disease (GOLD) staging system.

Result

FEV₁/ht² was a better univariate predictor of survival in COPD than FEV₁ and both were better than FEV₁PP. The best multivariate model for predicting survival included FEV₁/ht², age and sex. Comparing the GOLD stages with the FEV₁/ht² groups found that survival was more coherent within each FEV₁/ht² group than it was within each GOLD stage. The most severe FEV₁/ht² group contained 60% more people than the severest GOLD stage, and these extra subjects had equivalently poor survival; the least severe FEV₁/ht² group contained 155% more people than its GOLD counterpart, with equivalent survival. GOLD staging misclassified 51% of subjects with regard to survival.

Conclusion

We conclude that GOLD criteria using FEV₁PP do not optimally stage COPD with regard to survival. An alternative strategy using FEV₁/ht² improves the staging of this disease. Studies which stratify COPD patients to determine the effect of interventions such as drug trials, rehabilitation, or management guidelines should consider alternatives to the GOLD classification.

Introduction

Lung function measurements are made to help determine if a subject has an abnormality of lung function and also to monitor a disease or its response to treatment. The degree of any abnormality may be needed to determine choices about therapy or level of disability benefit but may also be needed to estimate prognosis. To determine if a lung function result is abnormal the American Thoracic Society (ATS) and European Respiratory Society (ERS) recommend (Quanjer et al 1993; ATS 1994) that the individual’s lung function data should be compared with a predicted value using the method of standardized residuals (SR). SR are calculated from: $\mathrm{SR} = (\text{Recorded FEV}_{1} - \text{Predicted FEV}_{1}) / \mathrm{RSD}$ where RSD is the residual standard deviation from the regression equation used for the prediction, so that an FEV₁SR of −1.645 is at the lower 90% confidence limit of normality.
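
As a minimal sketch of the SR calculation described above, the following Python function applies the formula and checks the result against the −1.645 lower limit. The function names and the example values (recorded 2.0 L, predicted 3.0 L, RSD 0.5 L) are hypothetical and are not taken from the study.

```python
# Illustrative sketch only: names and example values are assumptions,
# not part of the study's analysis.

LOWER_LIMIT_SR = -1.645  # lower 90% confidence limit quoted in the text


def standardized_residual(recorded_fev1, predicted_fev1, rsd):
    """Return (recorded - predicted) / RSD, all in the same units (litres)."""
    if rsd <= 0:
        raise ValueError("RSD must be positive")
    return (recorded_fev1 - predicted_fev1) / rsd


def below_lower_limit(sr):
    """True when the SR falls below the lower limit of normality."""
    return sr < LOWER_LIMIT_SR


# Hypothetical subject: recorded 2.0 L, predicted 3.0 L, RSD 0.5 L.
example_sr = standardized_residual(2.0, 3.0, 0.5)
print(example_sr, below_lower_limit(example_sr))
```

With these made-up inputs the subject lies 2 SR below predicted, which is below the −1.645 limit.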

When it comes to expressing the degree of any such abnormality, that is the severity of impairment, there is no agreed method that gives equivalent results for all ages and both sexes. The method of percent of predicted (PP) is widely used for expressing disease severity (ATS 1987) but its validity is not based on statistical evidence (Sobol and Weinheimer 1966; Miller and Pincock 1988). SR are potentially flawed for expressing severity in that a young subject’s predicted value is much higher than that for an older subject, so a young subject can have a forced expiratory volume in 1 second (FEV₁) 5 SR below predicted which for an older subject would never be found as the FEV₁ would have to be below zero.
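
The floor effect described above can be made concrete with a short Python sketch. The predicted values (4.5 L for a young subject, 2.0 L for an older subject) and the RSD of 0.5 L are hypothetical round numbers chosen for illustration, not values from any prediction equation.

```python
# Illustrative sketch only: all numeric inputs are hypothetical.

def fev1_at_sr(predicted_fev1, rsd, sr):
    """FEV1 (litres) that corresponds to a given standardized residual."""
    return predicted_fev1 + sr * rsd


def lowest_possible_sr(predicted_fev1, rsd):
    """SR reached when the recorded FEV1 is zero, the floor for any subject."""
    return (0.0 - predicted_fev1) / rsd


RSD = 0.5              # hypothetical residual standard deviation, litres
YOUNG_PREDICTED = 4.5  # hypothetical predicted FEV1, litres
OLD_PREDICTED = 2.0    # hypothetical predicted FEV1, litres

print(fev1_at_sr(YOUNG_PREDICTED, RSD, -5))    # a physically possible FEV1
print(fev1_at_sr(OLD_PREDICTED, RSD, -5))      # negative, so impossible
print(lowest_possible_sr(OLD_PREDICTED, RSD))  # floor for the older subject
```

Under these assumptions the young subject at 5 SR below predicted still has an FEV₁ of 2.0 L, whereas the older subject would need an FEV₁ of −0.5 L and cannot fall below −4 SR.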

The PP method has been used in many epidemiological studies (Lange et al 1990; Knuiman et al 1999; Schunemann et al 2000; Thomason and Strachan 2000) showing that FEV₁ expressed as percent predicted (FEV₁PP) is an important predictor of survival. However, in chronic obstructive pulmonary disease (COPD) it was raw FEV₁ that was originally shown to relate to survival (Burrows and Earle 1969). The Framingham study found that FEV₁ divided by height squared was an important predictor of survival (Ashley et al 1975) in the general population, a finding recently reinforced from the Reykjavik study (Chinn et al 2007). More recently the PP method has been used with arbitrary threshold values in guidelines with regard to specifying the severity of COPD (ATS 1995; Siafakas et al 1995) and asthma (British guideline 2005) and the Global Initiative for Chronic Obstructive Lung Disease (GOLD) has empirically defined 5 stages of severity (NIH et al 2005) based on FEV₁PP. Concern has been raised that this system is not evidence-based (Kerstjens 2004).

Prediction equations such as those recommended by the ERS (Quanjer et al 1993) are used to derive a predicted value for an individual which tries to take into account the effect of age, height, and sex on lung function, but this is imprecise with wide confidence intervals and the data from which the equations are derived may not perfectly match the patient under consideration. Therefore any method of relating lung function to a predicted value may introduce some age, height or sex related error. We have therefore tested whether lung function impairment expressed as absolute FEV₁ or standardized by height might be better than FEV₁PP and FEV₁SR in predicting survival in COPD.

Methods

We have reanalyzed the data from 1095 subjects with COPD who had been recruited from 1983 to 1988 in a respiratory clinic in Copenhagen into a longitudinal study to look at predictors of their all cause mortality (Hansen et al 1999, 2001) over a 15 year follow up. They were all assessed by a consultant respiratory physician and diagnosed as having COPD on the basis of symptoms of cough and sputum, lack of variability in airflow limitation, lack of allergy and atopy, and their smoking history; in addition, their FEV₁ as a percent of forced vital capacity (FVC) had to be less than 89% of their predicted value, in accordance with ERS recommendations at that time (Siafakas et al 1995). There was no within day variation in lung function and no significant acute or corticosteroid reversibility to their airflow obstruction. We have analyzed the spirometric data on entry into the study which were obtained after both an inhaled bronchodilator and a 2 week trial of oral corticosteroids, ie, maximally bronchodilated as per GOLD recommendation (NIH et al 2005).

Each subject’s FEV₁ was expressed as percent predicted (PP) using ERS prediction equations (Quanjer et al 1993), as FEV₁ divided by height squared (FEV₁/ht²) and as standardized residuals (SR) (Miller et al 1985; Quanjer et al 1993). Subjects were staged by GOLD criteria (NIH et al 2005) and also arbitrarily by FEV₁/ht² into 4 stages as defined in Table 1. Because GOLD stage 1 only identified 6 subjects in this cohort the staging by FEV₁/ht² was into 4 and not 5 arbitrary groups in order to allow fairer comparison between the two classification systems.
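
The two simpler indices named above can be sketched in a few lines of Python. The function names, the choice of metres for height, and the example subject (FEV₁ 1.5 L, height 1.70 m, predicted FEV₁ 3.0 L) are assumptions made for illustration; the stage cut points belong to Table 1 and are deliberately not reproduced here.

```python
# Illustrative sketch only: units and example values are assumptions.

def fev1_per_height_squared(fev1_litres, height_metres):
    """FEV1 divided by height squared (litres per square metre)."""
    if height_metres <= 0:
        raise ValueError("height must be positive")
    return fev1_litres / height_metres ** 2


def fev1_percent_predicted(fev1_litres, predicted_fev1_litres):
    """FEV1 as a percentage of the subject's predicted value."""
    if predicted_fev1_litres <= 0:
        raise ValueError("predicted FEV1 must be positive")
    return 100.0 * fev1_litres / predicted_fev1_litres


# Hypothetical subject: FEV1 1.5 L, height 1.70 m, predicted FEV1 3.0 L.
print(round(fev1_per_height_squared(1.5, 1.70), 3))
print(fev1_percent_predicted(1.5, 3.0))
```

Note that FEV₁/ht² needs only the recorded FEV₁ and height, whereas FEV₁PP depends on a predicted value supplied by a separate regression equation.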

Table 1 Classification criteria for GOLD and FEV₁/ht² staging from spirometry alone. FEV₁% is the FEV₁ expressed as a percentage of FVC. FEV₁PP is the FEV₁ expressed as a percentage of the subject’s predicted value

Statistical analysis was undertaken using SPSS for Windows version 11.5 (Microsoft, Redmond, WA). Cox proportional hazards regression analysis was used for predicting all cause mortality using quintiles of putative predictor variables. This method is a special form of multiple regression that allows for the inclusion of the censored data for those cases for which the end point of death has not yet occurred. The best prediction models were judged on the basis of highest global chi-square (χ²) value and greatest reduction in −2 times the log likelihood (−2LL) for different models (Cox 1972). The proportional hazards model assumes that the ratio of the estimated hazard functions for any two observations with different values for the independent (predictor) variables is constant over time. This assumption was checked for each model from plots of cumulative hazard against survival time stratified by the relevant FEV₁ index. For the multivariate Cox models, hazard ratios were calculated by contrasting risk against that for the first stage of each index. Subjects were categorized by their GOLD and FEV₁/ht² stages and Kaplan-Meier survival analysis was used to estimate mean survival for every subgroup of subjects defined by the 4 × 4 matrix of GOLD and FEV₁/ht² stages. Mean survival was used rather than median survival since some subgroups were too small for reliable estimates of the median.
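
To illustrate how a Kaplan-Meier mean survival can be obtained from censored data, the following pure-Python sketch computes the product-limit curve and the area under it. It is not the SPSS routine used in the study; the function names, the toy data, and the convention of restricting the mean to the longest observed time are assumptions made for illustration.

```python
# Illustrative sketch only: not the study's SPSS analysis.
# Each record is (time, died); died=False marks a censored observation.

def kaplan_meier(records):
    """Return [(time, survival probability)] at each distinct death time."""
    curve = []
    survival = 1.0
    death_times = sorted({t for t, died in records if died})
    for t in death_times:
        at_risk = sum(1 for u, _ in records if u >= t)
        deaths = sum(1 for u, died in records if u == t and died)
        survival *= 1.0 - deaths / at_risk
        curve.append((t, survival))
    return curve


def restricted_mean_survival(records):
    """Area under the Kaplan-Meier step curve up to the longest time."""
    horizon = max(t for t, _ in records)
    area, prev_time, prev_survival = 0.0, 0.0, 1.0
    for t, s in kaplan_meier(records):
        area += prev_survival * (t - prev_time)
        prev_time, prev_survival = t, s
    area += prev_survival * (horizon - prev_time)
    return area


# Toy data in years: deaths at 1, 3 and 4; one subject censored at 2.
toy = [(1, True), (2, False), (3, True), (4, True)]
print(kaplan_meier(toy))
print(restricted_mean_survival(toy))
```

The censored subject contributes to the number at risk until year 2 but never counts as a death, which is how censoring enters the estimate.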

Results

Age, height, and spirometric values are shown in Table 2 for the 1095 subjects of whom 644 (59%) were female. The median survival from entry into the longitudinal study was 3208 days, with 723 (66%) dying during the period of follow up (median survival 2231 days versus 4083 days for survivors), of whom 404 were female which was slightly lower than the number (425) expected by chance (χ² = 7.57, df = 1, p = 0.006). Univariate Cox proportional hazards models for survival were derived with the following variables as predictors: height, sex, body mass index (BMI), FEV₁, FEV₁/ht², FEV₁PP and FEV₁SR. Table 3 shows the strength of univariate association with survival for each index as a continuous variable with FEV₁/ht² being the best univariate predictor.
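
The sex comparison above can be checked with a short worked calculation. The counts (1095 subjects, 644 female, 723 deaths, 404 female deaths) come from the paragraph; the choice of a Pearson chi-square on the 2 × 2 table without continuity correction is an assumption, made because it reproduces the reported value. The function names are hypothetical.

```python
import math

# Illustrative sketch only: the test variant (uncorrected Pearson
# chi-square) is an assumption; the counts come from the text.

def pearson_chi_square_2x2(a, b, c, d):
    """Chi-square for the table [[a, b], [c, d]] without continuity correction."""
    n = a + b + c + d
    row1, row2 = a + b, c + d
    col1, col2 = a + c, b + d
    return n * (a * d - b * c) ** 2 / (row1 * row2 * col1 * col2)


def p_value_one_df(chi_square):
    """Upper-tail probability of a chi-square variable with 1 degree of freedom."""
    return math.erfc(math.sqrt(chi_square / 2.0))


TOTAL, FEMALE, DEATHS, FEMALE_DEATHS = 1095, 644, 723, 404
female_alive = FEMALE - FEMALE_DEATHS        # 240
male_deaths = DEATHS - FEMALE_DEATHS         # 319
male_alive = (TOTAL - FEMALE) - male_deaths  # 132

expected_female_deaths = DEATHS * FEMALE / TOTAL
chi2 = pearson_chi_square_2x2(FEMALE_DEATHS, female_alive, male_deaths, male_alive)
print(round(expected_female_deaths), round(chi2, 2), round(p_value_one_df(chi2), 3))
```

The expected number of female deaths is simply the total deaths multiplied by the female share of the cohort.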

Table 2 Descriptive data for age, height and spirometric values for the 1095 patients in the study (644 female) showing mean, standard deviation (SD) and median values

Table 3 Univariate analysis: Indices that were found by Cox regression to be univariate predictors of survival with χ² value for strength of prediction and its significance value

To allow hazard ratios for various predictors to be calculated Cox regression models were constructed with sex and quintiles of age and the various lung function indices. The best multivariate model was with FEV₁/ht², age and sex (χ² = 289, −2LL = 9025) and the hazard ratios for the predictors are shown in Table 4. The other FEV₁ indices gave less good models with the next best being with raw FEV₁, followed by FEV₁PP and then FEV₁SR. The model with FEV₁/ht² had a lower HR for male sex than that with raw FEV₁ suggesting the standardization by height had taken into account some sex difference. The models with PP and SR found the HR for male sex was no longer significant. BMI quintiles improved each of these models slightly but only the lightest BMI quintile had a significantly increased hazard ratio (HR = 1.4, 95% CL 1.1 to 1.8) compared with the other BMI quintiles.
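
For readers unfamiliar with how a hazard ratio and its confidence limits follow from a Cox model, the sketch below uses the standard relationship HR = exp(β) with limits exp(β ± 1.96 × SE). The coefficient and standard error are not reported in the text; the values used here are back-calculated from the quoted BMI result (HR = 1.4, 95% CL 1.1 to 1.8) purely for illustration and should be treated as hypothetical.

```python
import math

# Illustrative sketch only: beta and se below are back-calculated,
# hypothetical values, not figures reported by the study.

def hazard_ratio_with_limits(beta, se, z=1.96):
    """Return (HR, lower limit, upper limit) from a Cox coefficient and its SE."""
    return (
        math.exp(beta),
        math.exp(beta - z * se),
        math.exp(beta + z * se),
    )


beta = math.log(1.4)  # hypothetical coefficient giving HR = 1.4
se = 0.1256           # hypothetical standard error
hr, lower, upper = hazard_ratio_with_limits(beta, se)
print(round(hr, 1), round(lower, 1), round(upper, 1))
```

A hazard ratio is judged significant at the 5% level when its 95% limits exclude 1, as they do in this example.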

Table 4 Multivariate analysis: The table shows the covariate hazard ratios (HR) for death (with 95% confidence limits) for Cox proportional hazards models using sex and the quintiles of age and various methods for expressing FEV₁ impairment. The HR values are contrasts against the hazard estimated for the first stage of each group and for male contrasted with female. Chi-square values are shown for each model

Figure 1 shows mean survival (with 95% CL bars) for each FEV₁/ht² group compared with the GOLD stage. Figure 2 shows the numbers of subjects in each of the staging groups and the full data used in these Figures are presented in Table 5. The mean survival for the FEV₁/ht² groups decreases progressively and fairly evenly whereas the survival for GOLD stages 0 and 2 is very similar. There were 308 subjects in FEV₁/ht² stage 1 which was 2.5 times the number of subjects in the equivalent GOLD stage 0 (121). More than half of the subjects in each of GOLD stages 2 and 3 were in discordant FEV₁/ht² stages and the worst FEV₁/ht² stage included 1.6 times as many subjects as did GOLD stage 4 (199 versus 126). Overall 51% of subjects were in discordant staging groups. The poor staging by GOLD was confirmed from a Cox regression model using just the GOLD stage as predictor for survival, where the χ² value was only 81, which was significantly inferior to a model using just the FEV₁/ht² stage which yielded χ² value of 166.
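
The group-size comparisons above reduce to simple arithmetic, sketched here in Python. The counts (308, 121, 199, 126 and the cohort of 1095) are taken from the text; the function names are hypothetical.

```python
# Illustrative sketch only: counts are from the text, names are assumptions.

COHORT = 1095
FEV1HT2_LEAST_SEVERE = 308  # FEV1/ht2 stage 1
GOLD_LEAST_SEVERE = 121     # GOLD stage 0
FEV1HT2_MOST_SEVERE = 199   # worst FEV1/ht2 stage
GOLD_MOST_SEVERE = 126      # GOLD stage 4


def times_as_many(count_a, count_b):
    """How many times larger count_a is than count_b."""
    return count_a / count_b


def percent_more(count_a, count_b):
    """Percentage by which count_a exceeds count_b."""
    return 100.0 * (count_a - count_b) / count_b


def percent_of_cohort(count, total=COHORT):
    """Share of the whole cohort, as a percentage."""
    return 100.0 * count / total


print(round(times_as_many(FEV1HT2_LEAST_SEVERE, GOLD_LEAST_SEVERE), 1))
print(round(times_as_many(FEV1HT2_MOST_SEVERE, GOLD_MOST_SEVERE), 1))
print(round(percent_more(FEV1HT2_LEAST_SEVERE, GOLD_LEAST_SEVERE)))
print(round(percent_more(FEV1HT2_MOST_SEVERE, GOLD_MOST_SEVERE)))
print(round(percent_of_cohort(GOLD_MOST_SEVERE), 1))
print(round(percent_of_cohort(FEV1HT2_MOST_SEVERE), 1))
```

These figures correspond to the 155% and roughly 60% increases quoted in the abstract (the latter computes to about 58%) and to the 11.5% and 18.2% shares of the cohort in the most severe groups.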

Table 5 Cross tabulation of concordance in the 1095 subjects between GOLD classification (GOLD stages 0 to 4) and FEV₁/ht² classification for COPD severity. The table shows the number of subjects dying (nd) under follow up as a fraction of the total number N within that group (nd/N), the estimated Kaplan-Meier mean survival for that group as years in bold and the 95% confidence limits for the estimate in parentheses. Aggregate results for each column and row as a whole are also shown

Figure 1 The mean survival for each FEV₁/ht² group and each GOLD stage with 95% confidence limit bars and the number of subjects in each column.

Figure 2 Plot of the number of subjects in each of the groups for the FEV₁/ht² staging system on the left and the GOLD staging system on the right with the mean survival of each group stated inside the relevant column.

Discussion

We present the first evidence that FEV₁PP is not as good as either raw FEV₁ or FEV₁/ht² at predicting survival in a cohort of COPD patients. This finding confirms concerns previously expressed about the use of PP as a method for expressing impairment in FEV₁ (Sobol and Weinheimer 1966; Miller and Pincock 1988). The assumption that a fixed level of FEV₁PP means the same level of impairment for different subjects is clearly flawed since, for example, young adults with cystic fibrosis can survive with an FEV₁ just as low in absolute terms as old subjects, with their PP values going as low as 10% (Sood et al 2001) which is much lower than that seen in the elderly.

Several studies have shown that survival in COPD relates to FEV₁ (Burrows and Earle 1969; Gorecka et al 1997; Thomason and Strachan 2000) with values lower than 0.75 litre having a 3 year survival of only 50% (Burrows and Earle 1969). When using the GOLD criteria to stratify our subjects there were only 11.5% of subjects in stage 4 whereas the method based on FEV₁/ht² increased the proportion in the severest stage to 18.2% and yet these had a comparably poor survival. This suggests that estimating severity of COPD from FEV₁PP criteria does not identify all those subjects who are most severely affected. Concern has been expressed that the GOLD criteria, which were arbitrarily chosen to define severity of COPD, are not based on any evidence of their ability to predict survival or any other aspect of COPD management (Kerstjens 2004). Our data provide strong evidence that the GOLD criteria are not adequate for correctly placing individual patients into disease stage categories that relate to survival. We do not have data on individual symptoms to determine how the two classification systems of COPD relate to symptoms and performance but this could be verified in other data sets.

Furthermore we found that within the GOLD stages there were subgroups with differing survival that were better stratified by FEV₁/ht² staging. The GOLD definition of stage 4 severity (NIH et al 2005) can include subjects with less severe spirometry results but who are in type 2 respiratory failure. Therefore it is possible that some of our subjects allocated to GOLD 3 might in fact be in GOLD stage 4 if their arterial gas results were known. However, it seems unlikely that the 85 subjects in the worst FEV₁/ht² stage but who we classified as GOLD 3 had been in type 2 respiratory failure on entry into the study because their mean survival was 6.1 years. It seems clear that FEV₁/ht² can stratify subjects with COPD into survival groups better than GOLD criteria without recourse to other data such as arterial gases.

Recent work has emphasized that multivariate analysis including data on BMI, the degree of airflow obstruction as FEV₁PP, dyspnoea, and exercise capacity measured by the six-minute-walk test (BODE index) was better at predicting survival in COPD than univariate analysis (Celli et al 2004). However, our data suggest that the use of FEV₁/ht² instead of FEV₁PP as the index of airflow obstruction in this form of multivariate analysis would further improve the survival prediction.

Additional problems with the PP and SR methods for expressing degree of abnormality occur because they require an estimate of a subject’s predicted value. The regression equations recommended by the ERS for FEV₁ only account for about 58% of the variation of FEV₁ in normal subjects, thus the predicted value is not a precise estimate and will include errors related to sex, age, height, and technical issues (Quanjer et al 1993). Furthermore if the subject whose predicted value is being derived is from a population different from that used to derive the prediction equation or the subject’s age is outside the limits of the population used in the equation then further errors will be incurred.

Our data show FEV₁/ht² is better than both raw FEV₁ and FEV₁PP for expressing degree of lung function impairment. If this dataset is analyzed with the two sexes considered independently then FEV₁/ht² gives results indistinguishable from that for raw FEV₁ in terms of ability to predict all cause mortality. However, in clinical practice it is advantageous to have a method for assessing risk that is applicable for both sexes in an equivalent way. FEV₁/ht² is one way of taking some of the sex differences in lung function into account without introducing the potential errors inherent in trying to predict what a given subject’s FEV₁ should be (Quanjer et al 1993). A recent study of all cause mortality in a general population confirmed that FEV₁/ht² was better related to survival than FEV₁PP (Chinn et al 2007) and we now confirm this finding in a cohort of COPD patients.

We conclude that FEV₁/ht² is better related to survival in COPD than FEV₁PP and so may be the best method for expressing degree of FEV₁ impairment. Our findings indicate that applying the GOLD criteria to COPD management does not optimally classify those most severely affected. So studies which need to stratify COPD patients in order to determine the potential benefits of interventions such as drug trials, rehabilitation programmes or even discharge policies should not just consider the GOLD classification but also consider using FEV₁/ht² or other strategies to assess disease severity correctly. Our evidence suggests using GOLD criteria alone will have suboptimal power to show any benefits from such interventions. We believe future scientific endeavor in COPD must not be limited by the arbitrary GOLD staging which here failed to identify over one third of the worst prognosis group. Alternative classifications should now be prospectively compared so that the best results for managing and researching into COPD can be achieved.

References

[ATS] American Thoracic Society. 1987. Standardization of spirometry: 1987 update. Am Rev Respir Dis, 136:1286–96.
[ATS] American Thoracic Society. 1995. Standardization of spirometry. 1994 update. Am J Respir Crit Care Med, 152:1107–36. PMID: 7663792.
[ATS] American Thoracic Society. 1995. COPD Guidelines. Am J Respir Crit Care Med, 152:S77–S120. PMID: 7582322.
Ashley F, Kannel WB, Sorlie PD. 1975. Pulmonary function: relation to aging, cigarette habit, and mortality. The Framingham Study. Ann Intern Med, 82:739–45. PMID: 1094879.
British Guideline. 2003. British guideline on the management of asthma. Thorax, 58(Suppl 1).
Burrows B, Earle RH. 1969. Course and prognosis of chronic obstructive lung disease. A prospective study of 200 patients. N Engl J Med, 280:397–404. PMID: 5763088.
Burrows B, Earle RH. 1969. Prediction of survival in patients with chronic airway obstruction. Am Rev Respir Dis, 99:865–71. PMID: 5787601.
Celli BR, Cote CG, Marin JM. 2004. The body-mass index, airflow obstruction, dyspnea, and exercise capacity index in chronic obstructive pulmonary disease. N Engl J Med, 350:1005–12. PMID: 14999112.
Chinn S, Gislason C, Aspelund T. 2007. Optimum expression of adult lung function based on all-cause mortality: results from the Reykjavik study. Respir Med, 101:601–9. PMID: 16889951.
Cox DR. 1972. Regression models and life tables. J Royal Stat Soc B, 34:187–220.
Gorecka D, Gorzelak K, Sliwinski P. 1997. Effect of long term oxygen therapy on survival in patients with chronic obstructive pulmonary disease with moderate hypoxaemia. Thorax, 52:667–8. PMID: 9337821.
Hansen EF, Phanareth K, Laursen LC. 1999. Reversible and irreversible airflow obstruction as predictor of overall mortality in asthma and chronic obstructive pulmonary disease. Am J Respir Crit Care Med, 159:1267–71. PMID: 10194175.
Hansen EF, Vestbo J, Phanareth K. 2001. Peak flow as predictor of overall mortality in asthma and chronic obstructive pulmonary disease. Am J Respir Crit Care Med, 163:690–3. PMID: 11254525.
Kerstjens HAM. 2004. The GOLD classification has not advanced understanding of COPD. Am J Respir Crit Care Med, 170:212–13. PMID: 15280172.
Knuiman MW, James AL, Divitini ML. 1999. Lung function, respiratory symptoms, and mortality: results from the Busselton Health Study. Ann Epidemiol, 9:297–306. PMID: 10976856.
Lange P, Nyboe J, Appleyard M. 1990. Spirometric findings and mortality in never-smokers. J Clin Epidemiol, 43:867–73. PMID: 2213076.
Miller MR, Pincock AC, Grove DM. 1985. Patterns of spirogram abnormality in individual smokers. Am Rev Respir Dis, 132:1034–40. PMID: 4062035.
Miller MR, Pincock AC. 1988. Predicted values: how should we use them? Thorax, 43:265–7. PMID: 3406912.
[NIH] National Institutes of Health, National Heart, Lung and Blood Institute. 2005. Global strategy for the diagnosis, management, and prevention of chronic obstructive pulmonary disease. NHLBI/WHO Workshop Report. Update 2005 [online]. Accessed on May 22, 2007. URL: http://www.goldcopd.com
Quanjer PhH, Tammeling GJ, Cotes JE. 1993. Standardized lung function testing. Lung volumes and forced ventilatory flows. Eur Respir J, 6(Suppl 16):5–40. PMID: 8381090.
Schunemann HJ, Dorn J, Grant BJ. 2000. Pulmonary function is a long-term predictor of mortality in the general population: 29-year follow-up of the Buffalo Health Study. Chest, 118:656–64. PMID: 10988186.
Siafakas NM, Vermeire P, Pride NB. 1995. ERS – Consensus statement. Optimal assessment and management of chronic obstructive pulmonary disease (COPD). Eur Respir J, 8:1398–420. PMID: 7489808.
Sobol BJ, Weinheimer A. 1966. Assessment of ventilatory abnormality in the asymptomatic subject: an exercise in futility. Thorax, 21:445–9. PMID: 5969245.
Sood N, Paradowski LJ, Yankaskas JR. 2001. Outcomes of intensive care unit care in adults with cystic fibrosis. Am J Respir Crit Care Med, 163:335–8. PMID: 11179102.
Thomason MJ, Strachan DP. 2000. Which spirometric indices best predict subsequent death from chronic obstructive pulmonary disease? Thorax, 55:785–8. PMID: 10950899.
