Health State Utility Value in Chronic Obstructive Pulmonary Disease (COPD); The Challenge of Heterogeneity: A Systematic Review and Meta-Analysis

Chronic obstructive pulmonary disease (COPD) has a considerable impact on the quality of life and well-being of patients. Health state utility value (HSUV) is a recognized measure for health economic appraisals and is extensively used as an indicator in decision-making studies. This study is a systematic review of the literature that aimed to estimate the mean utility value in COPD using meta-analysis and to explore the degree of heterogeneity in utility values across a variety of clinical and study characteristics. The literature review covers studies that used the EQ-5D to estimate utility values for patient-level research in COPD. Studies that reported utility values elicited by EQ-5D in COPD patients were selected for random-effects meta-analysis addressing inter-study heterogeneity and for subgroup analyses. Thirty-two studies were included in the general utility meta-analysis. The estimated general utility value was 0.673 (95% CI 0.653 to 0.693). Meta-analyses of utility values by COPD stage showed the influence of airway obstruction on utility value. The utility values ranged from 0.820 (95% CI 0.767 to 0.872) for stage I to 0.624 (95% CI 0.571 to 0.677) for stage IV. There was substantial heterogeneity in utility values: I² = 97.7%. More accurate measurement of utility values in COPD is needed to refine valid and generalizable HSUV scores. Given the limited success of the factors studied in reducing heterogeneity, an approach needs to be developed for how best to use mean utility values for COPD in health economic evaluation.


Quality of life can be defined as an individual's perception of their position in life or life satisfaction. It is a complex entity incorporating physical health, psychological condition, independent living, social relationships and personal judgement (Citation1). Health status, functional status, well-being, quality of life (QoL), health related quality of life (HR-QoL) and health state utility value (HSUV) are often used interchangeably; despite some differences in meaning, all these concepts are classified as patient-reported outcomes (PROs) (Citation2). In clinical practice, HSUV instruments are used to design clinical management guidelines, prioritize patient complaints, screen for possible problems and make decisions about treatment modalities.

Presently, Quality Adjusted Life Years (QALYs) are commonly applied as a measure of health in economic appraisals and are extensively used as outcomes for resource allocation decisions. Cost-effectiveness analyses of medical interventions in Chronic Obstructive Pulmonary Disease (COPD) utilize generic (such as EQ-5D, SF-36) (Citation3, 4) or disease-specific measures of QoL [such as St. George Respiratory Questionnaire (SGRQ) and Clinical COPD Questionnaire (CCQ)] (Citation5, 6).

Generic instruments such as the EQ-5D have the advantage of having value sets that facilitate the conversion of patient-rated health status into measures of utility. Health-state utility reflects not only the presence, frequency or intensity of symptoms, abilities, or feelings as measured by psychometric instruments (Citation7) but also a societal or individual preference value or judgment for specific health states relative to full health (Citation8, 9). The EQ-5D is the most widely used generic measure across all diseases. Its descriptive system comprises five dimensions (mobility, self-care, usual activities, pain/discomfort and anxiety/depression). In EQ-5D-5L (version 2005), each dimension has five levels: no problems, slight problems, moderate problems, severe problems, and extreme problems. To convert patient responses on these descriptors to a single HSUV index, a preference-based set of weights is applied. In addition to the descriptive system, the EQ-5D contains a 25-cm vertical visual analogue scale (EQ VAS) that records the respondent's self-rated health and can be used as a quantitative measure of health outcome. Based on societal preferences for health states, country-specific algorithms or tariffs have been generated (Citation10, 11). The minimally important clinical difference for the EQ-5D Index has been estimated to be ±0.074 (Citation12).

Overviews and meta-analyses of utility-based quality of life have been undertaken in a variety of diseases including diabetes (Citation13), various types of cancer (Citation7, 14, 15), HIV/AIDS (Citation16), chronic kidney disease (Citation17), neuropathic pain (Citation18) and orthopaedic diseases (Citation19). The main purposes of these reviews were to examine the applicability of these utility measures in patients with the diseases and to attempt to summarize mean utility scores according to the disease states.

Utility-based health-related quality of life in patients with COPD (necessarily together with their common co-morbidities) has been measured using surveys of COPD patients, but values differ considerably across studies. For instance, the reported average utility values for stage II COPD range from 0.579 (Citation20) to 0.929 (Citation21). Differences in utility elicitation methods explain part of this variability. A recent study (Citation9) examining the role of meta-analysis for utility values noted that combining reported utilities can be problematic, for example because of differing valuation methods, and recommended only combining studies that report utility values derived in a similar fashion (e.g., using the same generic quality of life instrument). For this reason we confine our review to studies that employ the EQ-5D to measure utility values for COPD patients. While this may reduce some variation, the diversity in COPD patient population characteristics may still affect the utility values measured in different studies.

The first aim of this study is to conduct a meta-analysis using the EQ-5D, the most widely used instrument, to determine mean utility scores for COPD. The second aim is to explore the degree of heterogeneity in the mean utility values across a variety of clinical and study characteristics.


Study selection

The literature review of HSUV studies in COPD comprises studies that use EQ-5D to estimate utility values for patient level research in COPD; simulation-based studies were not included.

Studies with the following criteria were included:

  • studies on health utility that were published prior to July 2015;

  • studies in which the sample population was specifically categorized as COPD, as defined by standard criteria for COPD diagnosis and spirometric confirmation (this should be clearly addressed in the methodology of included studies);

  • English language studies and non-English language studies with English abstracts;

  • abstracts (e.g., seminar abstracts) and reports if adequate data for analysis were provided;

  • studies with more than 10 participants.

Exclusion was applied for the following criteria:

  • editorials/opinion pieces, letters, systematic reviews and meta-analyses;

  • studies that reported utilities from proxies, not individual participant data (e.g., reported by family member or a health professional);

  • studies that obtained utility estimates from the literature, if there was not enough information on the derivation of utility;

  • studies that did not distinguish COPD from other types of obstructive pulmonary disease such as asthma or cystic fibrosis;

  • papers using utility values mapped from other reported Quality of Life studies;

  • studies that reported utility values from non-stable and exacerbation state COPD patients.

Studies with different epidemiological designs (i.e., case control, randomized control trial (RCT), cohort, etc.) were included. It is not always feasible to collect utility data within a clinical trial, so utility data from non-clinical trial studies were also included. To eliminate the additive effect of studies using the same data source, a special effort was made to include only the study with the largest sample size.

This systematic review followed the MOOSE guideline for observational studies (Citation22). A search strategy was developed for the MEDLINE database (Appendix 1) and adapted for the other databases. A hand search and citation tracking were also conducted.

To ensure consistency in the literature review of utility elicitation methodology, the general recommendations of Peasgood and Brazier (Citation9) were followed. EndNote X7.3.1 was used to download citations and to identify and remove duplicate studies.

Search methods

The systematic review of the literature on utility values for COPD was part of a wider systematic review of economic evidence on COPD, related pharmacological and psychological interventions and progression modelling for patients with COPD. The following electronic databases were searched for relevant articles: MEDLINE, EMBASE (for the period of 1898–2015), Web of Science, CINAHL, ProQuest (which includes PsycINFO and 61 other databases), the Cochrane Library Database (which includes the NHS Economic Evaluation Database, the Health Technology Assessment Database, the Cochrane Database of Systematic Reviews and three other databases), International Society for Pharmacoeconomics and Outcomes Research (ISPOR) and Google Scholar. An attempt was made to decrease the likelihood of publication bias (Citation23) by searching dissertations and the websites of key academic institutions such as NICE (National Institute of Clinical Excellence), CCOHTA (Canadian Cooperating Office for Health Technology Assessment), SBU (The Swedish Council on Technology Assessment in Health Care), Health Economic Evaluations Database (HEED, ceased publishing in 2014) and the Cost Effectiveness Analysis Registry at Tufts-New England Medical Centre.

Data extraction and management

Data from included articles were extracted into Excel and Stata spreadsheets. The following variables were obtained from each citation: principal author, year of publication, clinical characteristics and demographics of patients, number of patients, country of origin, study design, data collection method, health state utility value measure and utility estimate (mean and standard deviation). In intervention studies, such as randomized control trials, baseline characteristics were used to avoid the potential effect of the intervention on the quality of life estimates. When intervention groups were split by a demographic or clinical factor, the value for the whole group was used where possible. Assessment of study eligibility and extraction of information from each study were carried out by two independent reviewers.

Data analysis

In order to estimate a single mean utility score value for COPD, a meta-analysis was conducted. This was done for COPD as a general condition and for the stages of the disease separately. Point estimates and 95% Confidence Intervals (CI) for utility scores were calculated and displayed in forest plots. Possible publication biases were investigated using funnel plots. Meta-analysis was restricted to EQ-5D Index-elicited utility values, as this was the only utility measure that existed in sufficient numbers for it to be feasible to undertake a meta-analysis. This restriction avoided heterogeneity imposed by elicitation methodology diversity (Citation9).

Meta-analysis was conducted with the metan command (Citation24) in Stata version 13.1. A random-effects model was used, which allows for between-study variability in addition to within-study sampling error and estimates the mean of a distribution of true effects. Heterogeneity among the studies was measured using the I² statistic, I² = 100% × (Q − df)/Q, with its 95% CI; I² indicates the proportion of observed variance due to real differences in utility scores rather than sampling error. Values of 30%–60%, 50%–90% and 75%–100% were considered to represent moderate, substantial and considerable heterogeneity, respectively. If standard errors of utility values were not reported, they were calculated from 95% confidence intervals or standard deviations. Any study that did not present enough data to calculate a standard error was excluded. The metabias and metafunnel commands were used to perform the Egger regression asymmetry test for publication bias and to draw the funnel plot (Citation25, 26). The metaninf command was used to examine the influence of outlier studies on the overall meta-analysis.
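The pooling and heterogeneity calculations described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the authors' Stata analysis: it assumes the DerSimonian-Laird estimator of between-study variance (the text does not name the estimator used), the function names are hypothetical, and the three study means and standard errors at the bottom are made-up values chosen to keep the arithmetic easy to follow.

```python
import math

def se_from_ci(lower, upper, z=1.96):
    """Standard error recovered from the limits of a 95% confidence interval."""
    return (upper - lower) / (2 * z)

def se_from_sd(sd, n):
    """Standard error of a mean from a standard deviation and sample size."""
    return sd / math.sqrt(n)

def random_effects_pool(means, ses):
    """Pool study means with a DerSimonian-Laird random-effects model (assumed estimator)."""
    w = [1 / se ** 2 for se in ses]                      # inverse-variance weights
    fixed = sum(wi * m for wi, m in zip(w, means)) / sum(w)
    q = sum(wi * (m - fixed) ** 2 for wi, m in zip(w, means))  # Cochran's Q
    df = len(means) - 1
    c = sum(w) - sum(wi ** 2 for wi in w) / sum(w)
    tau2 = max(0.0, (q - df) / c)                        # between-study variance
    w_re = [1 / (se ** 2 + tau2) for se in ses]          # random-effects weights
    pooled = sum(wi * m for wi, m in zip(w_re, means)) / sum(w_re)
    se_pooled = math.sqrt(1 / sum(w_re))
    i2 = max(0.0, 100 * (q - df) / q) if q > 0 else 0.0  # I-squared, in percent
    return {"pooled": pooled, "se": se_pooled, "q": q, "df": df,
            "tau2": tau2, "i2": i2}

# Hypothetical utility means and standard errors for three studies.
example = random_effects_pool([0.60, 0.70, 0.80], [0.02, 0.02, 0.02])
ci = (example["pooled"] - 1.96 * example["se"],
      example["pooled"] + 1.96 * example["se"])
print(example, ci)
```

With widely spread means and small standard errors, as in this made-up input, most of the observed variance is attributed to between-study differences, which is the situation the I² statistic is designed to flag.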

To conduct pre-specified subgroup analyses, study variables including clinical/participant and conduct-of-study factors were selected to define subgroups as follows: age, gender, FEV1% predicted, cigarette smoking in pack-years, number of patients per study, Hospital Anxiety and Depression Scale (HADS) depression index, Borg dyspnoea index, Charlson co-morbidity index, level of literacy, duration of COPD, and Body-mass index, airflow Obstruction, Dyspnoea, and Exercise capacity (BODE) index scores. Interaction tests were conducted only if there were at least two studies in each of the subgroups. Meta-regression was abandoned because of the insufficient number of studies in some subgroups. Interaction models were applied to some subgroups of interest, and changes in the magnitude or direction of the utility values and in heterogeneity were reported. The t-test and analysis of variance (ANOVA) were applied to compare estimated utility means between subgroups.


Study characteristics

The flow diagram (Figure 1) summarises the selection process of articles to be included. The initial pool of studies comprised 17,565 entries, including three citations captured through hand search (Citation27–29). Of these, 17,570 were excluded after scanning of abstracts. Full text examination of 404 studies was conducted and, after applying the inclusion and exclusion criteria, 78 studies were selected for review. Thirty-two studies with 49 observations gave estimates of general utility values for the COPD population as a whole. Articles included in the meta-analysis are tabulated in Tables 1 and 2. To adhere to the Cochrane handbook recommendation on including studies with multiple intervention groups (multiple observations) in a particular meta-analysis, observations of a single study were combined to create a single value.
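Combining several observations from one study into a single value can be sketched as follows. The sketch assumes the standard formula for merging two groups' sample sizes, means and standard deviations; the text cites the Cochrane handbook but does not spell out the formula used, so this is an assumption. The function name and the numbers in the example are hypothetical.

```python
import math

def combine_groups(n1, m1, sd1, n2, m2, sd2):
    """Merge two groups' (n, mean, SD) into one combined (n, mean, SD)."""
    n = n1 + n2
    mean = (n1 * m1 + n2 * m2) / n
    # Pooled variance plus a term for the distance between the group means.
    var = ((n1 - 1) * sd1 ** 2 + (n2 - 1) * sd2 ** 2
           + (n1 * n2 / n) * (m1 - m2) ** 2) / (n - 1)
    return n, mean, math.sqrt(var)

# Hypothetical baseline utilities for two arms of one trial.
n, mean, sd = combine_groups(60, 0.68, 0.21, 55, 0.71, 0.19)
print(n, round(mean, 3), round(sd, 3))
```

For a study with more than two observations, the same function can be applied repeatedly, merging one additional group at a time.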

Table 1. Characteristics of studies included in meta-analysis.

Table 2. Utility values estimated in included studies.

Figure 1. Flow diagram for papers included in meta-analysis.


Seventeen studies reported utility values for some COPD stages (including 10 studies that only reported utility values for stages of COPD) (Table 3). One study (Citation20) used the British Thoracic Society (BTS) staging system based on the Medical Research Council (MRC) dyspnoea scale. Because the definitions of stages I, II and III in this system are similar to those of stages II, III and IV of the GOLD staging system, respectively, the equivalent utility values were incorporated in the meta-analysis. One study (Citation68) used the 1987 American Thoracic Society (ATS) staging system. Because the definitions of stages II (moderate) and III (severe) in this system are similar to those of stages III and IV of the GOLD staging system, respectively, the equivalent utility values were incorporated in the meta-analysis. One study (Citation46) followed the GOLD staging definition but merged stage I and II COPD patients into a single moderate (II) stage and attributed a single utility value to this group. The stage II utility value of this study was omitted from the meta-analysis. In one study (Citation55) the ‘severe’ (GOLD stage III) and ‘very severe’ (GOLD stage IV) subsets were merged into a single ‘severe’ (stage III) subset. The stage III utility value of this study was omitted from the meta-analysis.

Table 3. Values of utility according to the Spirometry staging and COPD severity staging system in included studies.

Approaches and measures in COPD

Three studies (four observations) were omitted (Citation28, 70, 71) from the final analysis because they reported very extreme EQ-5D-elicited utility values (<0.008 and >0.96). Attempts were made to contact these authors, but the explanations provided did not fully clarify the reasons for the extreme values. The number of participants for general utility scores ranged from 41 to 4803, with an average of 779. Of these, 63.62% were male and the weighted average age was 66.0 years. The weighted average FEV1% predicted was 45.61 (95% CI 49.518 to 50.103), which indicated severe airflow obstruction according to GOLD guidelines (2011) (Citation72). Mean cigarette smoking history was 44.90 pack-years. Identifying specific COPD co-morbidities was not possible. Five studies reported the Charlson co-morbidity index.


Forest plot

Figure 2 presents 32 utility values ordered by date of publication. The mean utility value estimated from the random-effects meta-analysis was 0.673 (95% CI 0.653 to 0.693). There was substantial heterogeneity in the utility values: I² (variation in effect size attributable to heterogeneity) = 97.7%, heterogeneity chi-squared = 1348.12, degrees of freedom = 31, p < 0.001, and estimate of between-study variance tau-squared = 0.0029.
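The reported I² can be checked directly from the heterogeneity chi-squared (Q) and its degrees of freedom, using the definition given in the Data analysis section. The short calculation below uses only the figures reported above; the variable names are illustrative.

```python
# Reported heterogeneity statistics for the general utility meta-analysis.
q = 1348.12   # heterogeneity chi-squared (Cochran's Q)
df = 31       # degrees of freedom (32 studies minus 1)

i2 = 100 * (q - df) / q   # I-squared as a percentage
print(round(i2, 1))       # prints 97.7
```

The result agrees with the reported value, so the three heterogeneity figures are internally consistent.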

Figure 2. Forest plot (random effect) of utility values for COPD patients, general utility values, effect size.


Funnel plot

There was evidence of potential publication bias in this meta-analysis based on Begg's funnel plot (Appendix Figure A1) and on Egger's test (p value < 0.001), but it should be noted that when between-study heterogeneity is large, none of the bias detection tests work well (Citation73). The test of the influence of individual studies on the overall meta-analysis estimate (metaninf) did not show significant outliers.
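The idea behind Egger's regression asymmetry test can be sketched as follows: each study's standardized effect (estimate divided by its standard error) is regressed on its precision (one divided by the standard error), and an intercept far from zero suggests funnel plot asymmetry. The code below is a minimal ordinary least squares illustration, not a reproduction of the Stata metabias command. The function name and the four data points are hypothetical, and a formal test would compare the intercept divided by its standard error against a t distribution with n − 2 degrees of freedom.

```python
import math

def egger_intercept(effects, ses):
    """Return the Egger regression intercept and its standard error."""
    x = [1 / s for s in ses]                    # precision
    y = [e / s for e, s in zip(effects, ses)]   # standardized effect
    n = len(x)
    xbar, ybar = sum(x) / n, sum(y) / n
    sxx = sum((xi - xbar) ** 2 for xi in x)
    slope = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y)) / sxx
    intercept = ybar - slope * xbar
    resid = [yi - (intercept + slope * xi) for xi, yi in zip(x, y)]
    s2 = sum(r ** 2 for r in resid) / (n - 2)   # residual variance
    se_intercept = math.sqrt(s2 * (1 / n + xbar ** 2 / sxx))
    return intercept, se_intercept

# Hypothetical data in which less precise studies report higher utilities.
intercept, se_intercept = egger_intercept([0.60, 0.65, 0.72, 0.80],
                                          [0.01, 0.02, 0.04, 0.08])
print(intercept, se_intercept)
```

When every study reports the same effect regardless of its precision, the intercept is zero, which corresponds to a symmetric funnel plot.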

Subgroup analyses: interaction tests

The mean utility values for each stage of COPD estimated from the random-effects meta-analysis are presented in Table 4 and Figure 3. The estimated utility value for stage I was 0.820 (95% CI 0.767 to 0.872), and the value declined steadily with increasing disease severity: 0.782, 0.721 and 0.624 for stages II, III, and IV, respectively. Tests of difference between estimated utility means (Table 5) rejected the hypothesis of equality of means between stages of COPD, particularly between stages II and III and between stages III and IV.
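As a rough illustration of comparing two pooled estimates, the sketch below contrasts the stage I and stage IV values using only the means and 95% confidence intervals reported above. It uses a normal-approximation z-test on independent estimates, which is an assumption made for this example and is not the t-test or ANOVA applied in the study; the function names are hypothetical.

```python
import math

def se_from_ci(lower, upper, z=1.96):
    """Standard error recovered from the limits of a 95% confidence interval."""
    return (upper - lower) / (2 * z)

def z_test_difference(m1, se1, m2, se2):
    """Difference between two independent estimates with a two-sided p-value."""
    diff = m1 - m2
    se_diff = math.sqrt(se1 ** 2 + se2 ** 2)
    z = diff / se_diff
    p = math.erfc(abs(z) / math.sqrt(2))   # two-sided normal p-value
    return diff, z, p

se_stage1 = se_from_ci(0.767, 0.872)   # stage I: 0.820 (95% CI 0.767 to 0.872)
se_stage4 = se_from_ci(0.571, 0.677)   # stage IV: 0.624 (95% CI 0.571 to 0.677)
diff, z, p = z_test_difference(0.820, se_stage1, 0.624, se_stage4)
print(round(diff, 3), z, p)
```

The difference of 0.196 between stages I and IV is also well above the ±0.074 minimally important difference quoted for the EQ-5D Index in the introduction.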

Table 4. Estimated mean utility values in general and four stages of COPD (95% confidence interval).

Table 5. Difference between estimated utility value means in subgroups.

Figure 3. Forest plot (random effect) of utility values for COPD, stages utility, effect size.


Characteristics of study populations

Subgroup analysis showed no evidence of a difference in the heterogeneity of estimated utility values between patient age groups; age was available for all the included studies (Table 6). Some evidence of an effect of study type and cigarette pack-years on the estimated utility mean was found (one-tailed t-test, Table 5).

Table 6. Results of interaction tests for subgroup analyses.

Other study characteristics

The interaction tests did not suggest any evidence of a difference in utility value or heterogeneity index between the subgroups for country of origin. Interestingly, the general utility value showed a quadratic pattern across year of publication. Interaction tests revealed a significant change in utility value among year-of-publication groups, but the heterogeneity remained constant. Utility value was high in studies before 2008, followed by a decline in 2009 to 2011 and a rise in 2012 to 2015. The t-test and ANOVA confirmed this trend and the differences (Table 5).


This study aimed to summarize utility measures used in COPD and to estimate the mean utility value for these patients, taking the sources of heterogeneity of included studies into account. Thirty-two studies that reported utility values for COPD based on patient-level data were captured. Cross-sectional studies were the dominant type of published study (19 studies); in addition, there were 13 randomized control trials. A random-effects meta-analysis, which allows for between-study variation, yielded a mean utility value of 0.673 (95% CI 0.653–0.693) for COPD patients.

This systematic review revealed substantial diversity in the instruments used to measure HSUV and a wide range of utility values in COPD. The utility values ranged from 0.820 (95% CI 0.767–0.872) for stage I to 0.624 (95% CI 0.571–0.677) for stage IV. The meta-analysis indicated a high degree of heterogeneity in utility that was not explained by other factors. The utility score observed in this study is considerably lower than the utility score in general population-based samples, which suggests a major impact of COPD on HSUV. For example, a U.S. population-based survey reported a mean utility value of 0.87 (Citation74) on the EQ-5D scale. Another representative study from Alberta, Canada, reported a mean utility of 0.91 for individuals with no medical problems in a general population survey (Citation75). Similarly, a study presenting general population norms for EQ-5D-3L utility values in Queensland, Australia, reported a value of 0.87 (0.86–0.87) (Citation76).

It is well known that there is inter-instrument variation in the estimation of health utility (Citation77). For this study, in order to reduce diversity and estimate the utility score precisely, the meta-analysis was confined to the EQ-5D Index measure. Nevertheless, there was significant diversity in utility values between studies that used the EQ-5D measure (I² = 97.7%).

Clinical and study methodological diversity can both produce heterogeneity, though disaggregating the effects of the two is sometimes very difficult. Patients may be more willing to express the severity of impairment in self-administered than in interviewer-administered questionnaires (Citation78), but the current study did not find evidence against the null hypothesis of similarity between these two study subgroups.

Although some included studies did not report spirometry results (40.6%), almost all of them clearly mentioned that COPD diagnostic guidelines were considered and that spirometry tests were performed, not only through the registration process (when COPD patient samples were recruited from registry databases) but also by investigators as part of the inclusion criteria. For two studies (Citation20, 51) diagnosis was based on General Practitioner diagnosis. An interaction test was performed with a subgroup analysis of studies that did and did not report FEV1% pred values (Table 6). The test could not reject the null hypothesis of similarity between the two groups. In both groups heterogeneity was highly significant and the estimated mean utility values were similar.

This study did not show any association between degree of airflow obstruction (FEV1% pred) and general utility score. This may be explained by the chronic nature of COPD, which leads many patients to adjust their lifestyle in accordance with their daily living ability and so minimize their sense of functional impairment (Citation79). Another possible reason relates to the limitations of preference-based measures in measuring HSUV in COPD. It has been shown that these measures have some limitations in tracing the impact of a disease over time, due to floor effects with the SF-6D and ceiling effects with the EQ-5D (Citation80). Guyatt et al. (Citation81) pointed out that the responsiveness of generic measures to treatment effects in randomized trials in chronic respiratory disease is likely to be limited and that such measures may not be valid for measuring longitudinal differences over time. Hesselink et al. (Citation82) reported that changes in FEV1% pred were weakly correlated with HSUV changes during a 2-year follow-up of COPD patients. These findings were consistent with the results of previous studies (Citation83–85), which implied that clinical measures such as FEV1% pred provided limited information about health condition and were not well correlated with the health status of COPD patients. Consistent with this evidence, the new approach of the updated 2014 GOLD report suggests that the progression and severity of COPD cannot be captured in a single snapshot using only one diagnostic criterion and that a combined COPD assessment is needed for prognosis of the disease (Citation72). The combined assessment approach takes three elements into consideration: spirometric test, risk of exacerbations and one of the following disease-specific HR-QoL measures: COPD Assessment Test (CAT) or Clinical COPD Questionnaire (CCQ). This method, in conjunction with an assessment of potential co-morbidities, provides a better approach for COPD staging and individualization of disease management.

To the best of current knowledge, three systematic literature reviews of utility values for COPD have been published (Citation79, 86, 87). The aim of these studies was to summarize utility/disutility values in COPD by severity of the disease. Their estimates differed from those of the current study because of the following methodological variations: 1) In two of these studies, estimated mean utility values for stages of disease were derived from a simple mean calculation without incorporating the variances around utility values in each included study; in other words, meta-analysis was not the statistical approach used. 2) The current study performed a more comprehensive and up-to-date systematic literature review and captured more valuable studies for the general and stage-specific utility values. 3) In the current study appropriate statistical tests were used to demonstrate sources of heterogeneity and differences in estimated utility values by subgroup analyses. 4) The current study tried to adhere to the general recommendations of Peasgood and Brazier (Citation9) in the selection of included studies and the conduct of the meta-analysis.

Another five literature reviews were captured that focused mainly on QoL and outcomes across a variety of interventions in COPD (Citation83, 88–91). The most recent literature review (Citation91) was a qualitative study covering the humanistic and economic burden of COPD. In the humanistic section, the study focused on 32 non-RCT studies, of which almost 30% were conference abstracts. Different types of HR-QoL measures were included. No quantitative analyses were carried out in that study. Some associations between study characteristics and patient conditions such as demographics, disease symptoms, co-morbidities, resource use and cost were proposed. That study recommended that a comprehensive quantitative study is needed for a reliable conclusion.

In comparison with previous findings, the current systematic literature review has significant clinical and research implications. With reference to Peasgood and Brazier's critical paper (Citation9), this study tried to overcome major concerns related to meta-analysis of utility estimates in chronic diseases. Very restrictive inclusion and exclusion criteria (such as excluding values that were not appropriate utilities) were applied to capture an unbiased study population. A special attempt was made to generate a pool of utility values elicited from COPD patient populations in similar health states. Adopting EQ-5D as the only elicitation method ensured consistency in the methodological estimation of utility. All available study characteristics were reported transparently, and the justification for choosing data from studies was clearly explained, so that modellers can choose the most appropriate estimated value.

This research has a few limitations. First, the aggregated form of the data assembled in this study (study level, not individual information) meant that it was not possible to do a more comprehensive meta-regression analysis to investigate the correlation of study characteristics (Citation48), demographic diversity (Citation44, 51, 92), clinical staging (Citation25, 53, 93) or health condition differences such as co-morbidities with heterogeneity. Second, COPD patients have a higher prevalence of osteoporosis, anxiety/panic attacks, heart trouble, heart attack, and heart failure than smokers or non-smokers in the general population (Citation94, 95). Co-morbidity measured by the Charlson Index was considered by only five of the included studies (Citation41, 43, 44, 47, 52). Third, the review did not include non-English language publications unless English versions of their abstracts were available.

For future research, consideration of the specific limitations of some HSUV instruments (e.g., ceiling effect and limited sensitivity in EQ-5D) is essential; using EQ-5D-5L instead of EQ-5D-3L may overcome these limitations.

In conclusion, this study shows considerable inconsistency in utility measures across the published COPD literature. It confirms that the utility value in COPD is considerably lower than that of the general population. However, the effects of contributing factors such as spirometry assessment and co-morbidities on utility value remain largely unclear. This paper suggests that care should be taken when using a systematic method (meta-analysis) to calculate input parameters for health economic analysis. Where heterogeneity is high, appropriate sensitivity analyses are recommended for more accurate health economic appraisals.


The authors thank Ms. Rachel Sore (Statistical Consulting Centre, University of Melbourne) for her contribution to the statistical analysis.

Declaration of interest statement

The authors report no conflicts of interest. All the authors interpreted data, read and approved the final manuscript. The authors alone are responsible for the content and writing of the paper.


The first author, Foruhar Moayeri, received PhD scholarship funding from the University of Melbourne Faculty of Medicine, Dentistry and Health Sciences.


Appendix 1

Table A1. Summary of Medline search strategy.

Appendix Figure A1. Funnel plot of general utility values, included studies of COPD.


Appendix 2: Excluded citations

