Invited Symposium

Measurement Properties and Interpretability of the Chronic Respiratory Disease Questionnaire (CRQ)

, Ph.D. , M.D., , M.D., , M.D., , M.Sc. , M.D. & , M.Sc. , M.D. , F.R.C.P.C.
Pages 81-89 | Published online: 24 Aug 2009

Abstract

The Chronic Respiratory Questionnaire, available as an interviewer-administered and a self-administered instrument, includes 20 items across four domains: dyspnea (5 items), fatigue (4 items), emotional function (7 items), and mastery (4 items). When completing this instrument, patients rate their experience on a 7-point scale ranging from 1 (maximum impairment) to 7 (no impairment). The Chronic Respiratory Questionnaire has demonstrated excellent measurement properties for both discriminative and evaluative purposes and served as a model in numerous methodological studies in patients with chronic airflow limitation and chronic obstructive pulmonary disease. We performed a systematic review of the literature to summarize the key qualities of the Chronic Respiratory Questionnaire and to appraise the work regarding its minimal important difference. This paper includes a revision of our initial definition of the minimal important difference and a methodological framework for using the anchor-based approaches to establishing the minimal important difference that Jaeschke and colleagues pioneered. Other approaches to evaluate the minimal important difference include distribution-based methods and panel-based methods. Investigators have used all of these approaches to establish the minimal important difference for the Chronic Respiratory Questionnaire, and the results are in general agreement with a minimal important difference of 0.5 for the mean domain scores. As a result of this literature review and discussion at the workshop, we established several research objectives. These objectives include the exploration of how quality of life information is presented and of prospective anchor-based approaches.

Introduction

For nearly 20 years, the Chronic Respiratory Questionnaire (CRQ) has remained one of the most widely used health-related quality of life (HRQL) instruments in patients with chronic obstructive pulmonary disease (COPD) Citation[1-8]. The original CRQ, an interviewer-administered instrument, includes 20 items across four domains: dyspnea (5 items), fatigue (4 items), emotional function (7 items), and mastery (4 items). When completing this instrument, patients rate their experience on a 7-point scale ranging from 1 (maximum impairment) to 7 (no impairment). Validated versions exist in many languages Citation[[9]].

The dyspnea domain of the original CRQ uses an individualized approach from which patients choose 5 important daily activities (individualized dyspnea items) and report their degree of dyspnea during those activities. The individualized dyspnea domain, while theoretically appealing, potentially increases the complexity and burden on patients and interviewers, thus compromising feasibility in large clinical trials Citation[10-14].

While both interviewer administration and individualization compromise efficiency, they may prevent missing responses and errors in completing the CRQ and potentially enhance validity and responsiveness (the ability of the CRQ to detect important changes, even if those changes are small) Citation[[15]]. Recently, two randomized trials and three observational studies evaluated whether self-administration and use of standardized rather than individualized dyspnea items maintained measurement properties of the CRQ Citation[15-20]. These studies indicate that self-administration did not substantially increase missing items and did not impair the measurement properties (validity and responsiveness) of the CRQ Citation[16-20]. Standardization of the dyspnea domain reduced responsiveness only slightly while increasing construct validity Citation[15&16]Citation[[20]]. Thus, there is now a validated self-administered and standardized version of the CRQ (CRQ self-administered standardized or CRQ-SAS) available for use in clinical studies Citation[[16]]Citation[[20]]. However, if responsiveness is the primary objective of a study and sample size limited, use of the individualized dyspnea domain remains an attractive choice Citation[[16]]Citation[[20]].

In addition to its use as an HRQL outcome measure, the CRQ served as a model in numerous methodological studies in chronic airflow limitation and chronic obstructive pulmonary disease (COPD) Citation[21-24]. Using the CRQ, Jaeschke et al. defined and conducted the initial evaluation of the minimal important difference (MID) using an anchor-based approach with global ratings of change Citation[25&26] that later studies replicated Citation[[27]]. We revised our initial definition of the minimal clinically important difference (MCID), removed the focus on “clinical” interpretations, and now define the MID as: The smallest difference in score in the outcome of interest that informed patients or informed proxies perceive as important, either beneficial or harmful, and that would lead the patient or clinician to consider a change in the management. We place a greater weight on the preferences of (informed) patients than on those of clinicians in studying the MID. To further qualify this definition, one would consider the MID estimates of informed proxies only if the MID for informed patients is unknown, if informed patients cannot make decisions about the management of their disease, or if patients prefer informed proxies to make these decisions. In addition, any change in management will depend on the downsides, including cost, associated with that change and on the values and preferences patients place on these outcomes.

Anchor-based methods rely on examining the associations between scores on the instrument that is under investigation and an anchor, an independent measure of HRQL that clinicians can easily interpret Citation[[28]]. Because the estimates of the MID differed only slightly from 0.5 on the 7-point scale in several studies, the studies concluded that the MID for the CRQ and other instruments using 7-point scales is 0.5. The rounding of the MID to 0.5 should facilitate its use by those interpreting HRQL changes.

Other approaches to evaluate the MID include distribution-based methods and reliance on experts (panel-based methods) Citation[[29]]. Investigators have used both of these approaches to establish the MID for the CRQ and the results are in general agreement with the MID of 0.5 for the mean domain score Citation[21&22].

This paper has several aims. In line with the goals of the MID workshop, we first summarized the available evidence about the key qualities of the CRQ after conducting a systematic review of the literature. Second, we used the systematic review to summarize other available evidence about the CRQ and in particular the MID of the CRQ. Third, we provide an evaluation of global ratings that investigators use to establish the MID. Fourth, we provide initial information on the MID of the recently developed CRQ-SAS. Fifth, we provide a summary of the workshop's outcomes on defining future goals for the investigation of the CRQ's MID.

Methods

Systematic Review of the Literature

We conducted a systematic review of the literature on the CRQ. We searched electronic databases (MEDLINE, EMBASE, and the Cochrane Database of Systematic Reviews), our own files and the reference lists of articles identified during the search from 1966 until December 31, 2003. We used the following keywords: chronic respiratory (disease) questionnaire, CRQ and CRDQ. In addition, we used the related articles feature in PUBMED for selected publications to search for additional papers and a forward citation search of the original manuscript describing the development of the CRQ Citation[[5]].

Based on the results of the literature search, we also identified studies that compared the responsiveness of the CRQ with other HRQL instruments in COPD to describe the key qualities of the CRQ. We only included studies that used appropriate statistical comparisons of the instruments, such as the standardized response mean (SRM) as measure of responsiveness and also reported data on the St. George's Respiratory Questionnaire (SGRQ) because the latter is another widely used and accepted measure of HRQL in COPD. The standardized response mean adjusts measured HRQL changes for the variability of the measurement by dividing the change score by the standard deviation of the change score.
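The standardized response mean defined above can be illustrated with a minimal sketch. The function name and the patient scores are hypothetical and are not data from any of the reviewed studies; the calculation simply divides the mean change score by the standard deviation of the change scores.

```python
from statistics import mean, stdev

def standardized_response_mean(baseline, follow_up):
    """Mean change score divided by the SD of the change scores."""
    changes = [after - before for before, after in zip(baseline, follow_up)]
    return mean(changes) / stdev(changes)

# Hypothetical mean domain scores (7-point scale) for five patients.
baseline = [3.0, 4.0, 3.5, 5.0, 4.5]
follow_up = [4.0, 4.5, 4.5, 5.5, 5.5]

srm = standardized_response_mean(baseline, follow_up)
print(f"SRM = {srm:.2f}")
```

Because the denominator is the variability of the change itself, two instruments with the same mean change can have quite different SRMs.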

MID of the CRQ-SAS

For the evaluation of the MID of the CRQ-SAS we used our standard methodology based on global ratings of change (transition ratings) Citation[26&27]Citation[[30]]. We were interested in the change scores between the baseline and follow-up administration of the CRQ. We considered each domain separately and calculated the differences (delta, Δ) between mean scores from the baseline and follow-up visit for the individual domains from two studies that allowed us to explore the MID (Schünemann et al. and Puhan et al., manuscripts in preparation). In brief, the first was a study (study 1) in 177 patients who underwent respiratory rehabilitation and were randomized to complete the original CRQ (n = 86) or the CRQ-SAS (n = 91) before and 12 weeks after starting a standard 8-week respiratory rehabilitation program Citation[[16]]. In that study, patients also completed the global ratings of change. The second study (study 2) enrolled German-speaking patients with COPD who underwent an intense pulmonary rehabilitation program of 2 to 3 weeks and completed the original CRQ (n = 38) or the CRQ-SAS (n = 37) at baseline and follow-up (at the end of the intense rehabilitation program) Citation[[17]]. In this study, 18 patients randomized to the original CRQ and 17 randomized to the CRQ-SAS also completed global ratings of change.

We began by examining the correlations between change scores of the CRQ domains (target instrument) and the respective transition ratings (global ratings of change or GRΔ). We based our decision to use transition ratings as anchors to determine the MID on a priori criteria for their validity and excluded those transition ratings from the analysis that did not meet all criteria Citation[[28]]Citation[[31]]. The a priori criteria we suggest for use with transition ratings were the following:

  1. Transition ratings should correlate > 0.5 with change scores of the instrument for which one intends to establish the MID, because an external anchor provides a valid MID estimate only if the correlation between the target instrument and the external anchor is sufficiently high.

  2. The correlation of transition ratings with the target instrument's baseline scores should be ≤ 0.

  3. Transition ratings should correlate positively with follow-up scores.

  4. Correlations with follow-up scores should be at least 0.2 smaller than the correlation between transition ratings and change scores of the target instrument.
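The four criteria above can be expressed as a simple check. This is a minimal sketch: the function name, argument names, and example correlations are hypothetical, and the correlations themselves would have to be computed from study data beforehand.

```python
def anchor_is_valid(r_change, r_baseline, r_follow_up):
    """Return True if a transition rating meets all four a priori criteria.

    r_change:    correlation of the rating with the change score
    r_baseline:  correlation of the rating with the baseline score
    r_follow_up: correlation of the rating with the follow-up score
    """
    return (
        r_change > 0.5                        # criterion 1
        and r_baseline <= 0                   # criterion 2
        and r_follow_up > 0                   # criterion 3
        and r_change - r_follow_up >= 0.2     # criterion 4
    )

# Hypothetical correlations for two transition ratings.
meets_all = anchor_is_valid(r_change=0.62, r_baseline=-0.10, r_follow_up=0.30)
fails_baseline = anchor_is_valid(r_change=0.62, r_baseline=0.15, r_follow_up=0.30)
print(meets_all, fails_baseline)
```

A rating that fails any single criterion is excluded, which mirrors the all-or-nothing rule stated before the list.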

If these criteria were fulfilled, we would use linear regression models with the transition score as the independent variable to calculate the respective CRQ domain change scores that represent small, moderate, and large clinically important change. Small changes correspond to a GRΔ of 2 or 3, moderate changes correspond to a GRΔ of 4 or 5, and large changes correspond to a GRΔ of 6 or 7. We calculated the standard error of measurement of the CRQ domains by using the formula “standard deviation at baseline × (square root [1 − reliability coefficient])”, using as the reliability coefficient either estimates of Cronbach's α for each domain of the baseline CRQ scores or conservative estimates from the literature review Citation[[22]].
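As an illustration of the standard error of measurement formula, the following minimal sketch applies it to one domain. The baseline standard deviation (1.2) and reliability coefficient (0.85) are assumed values chosen for the example, not estimates from the studies.

```python
import math

def standard_error_of_measurement(sd_baseline, reliability):
    """SEM = baseline SD x sqrt(1 - reliability coefficient)."""
    if not 0.0 <= reliability <= 1.0:
        raise ValueError("reliability must lie between 0 and 1")
    return sd_baseline * math.sqrt(1.0 - reliability)

# Assumed inputs for illustration only.
sem = standard_error_of_measurement(sd_baseline=1.2, reliability=0.85)
print(f"SEM = {sem:.2f}")
```

Higher reliability shrinks the SEM, so a conservative (lower) reliability estimate yields a larger SEM.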

Results

Literature Search

The search strategy for studies exploring the CRQ yielded the following results: we obtained 162 citations from the MEDLINE search that listed the keywords CRQ, CRDQ, chronic respiratory disease or chronic respiratory disease questionnaire in the abstract or title, and an additional 445 references from the search of the Web of Science cited references. After electronically excluding duplicates from these two searches, 563 citations remained. Another 35 citations were duplicates not removed electronically or had obviously irrelevant titles. We screened the remaining 528 references for data relevant to the described aims. Of these articles, five investigated the MID of the CRQ Citation[21&22]Citation[[26]]Citation[[30]]Citation[[32]].

Key Qualities of the CRQ

Reliability

Table 1 shows the results of several studies investigating the reliability of the CRQ. The intra-class correlation coefficient (ICC) for test–retest reliability consistently ranged from 0.73 to 0.95 Citation[10&11]Citation[[20]]Citation[33&34]. Internal consistency reliability (Cronbach's α) ranged from 0.53 to 0.84 for the individualized dyspnea domain and from 0.81 to 0.90 for the other domains Citation[[1]]Citation[[11]]Citation[[20]]Citation[34&35]. The table also shows the mean change in CRQ scores by domain from a systematic review of randomized trials in respiratory rehabilitation and from a study of fluticasone propionate in Dutch patients with mild COPD Citation[[1]]Citation[[35]]. These data indicate that the CRQ measures changes in response to interventions that improve HRQL in patients with COPD.

Table 1.  Reliability and Responsiveness of the CRQ.

Responsiveness

Table 2 shows the relative responsiveness, expressed as the standardized response mean, of the CRQ and CRQ-SAS in comparison to other HRQL measures in COPD. The study by Harper et al. investigated the relative responsiveness in 156 patients with COPD Citation[[34]]. The table also shows data from two studies we conducted in patients undergoing respiratory rehabilitation (n = 84 for the CRQ, n = 91 for the CRQ-SAS, Schünemann et al., manuscript in preparation). Overall, these SRMs indicate that the CRQ is comparable, if not superior, to other instruments in detecting change in HRQL.

Table 2.  Relative Responsiveness (Standardized Response Mean) of the CRQ and Other Instruments.

Validity

Table 3 shows results of studies investigating the validity of the CRQ. The data indicate moderate cross-sectional and change score correlations with physical measurements and high cross-sectional and longitudinal construct validity compared to other COPD-specific and generic HRQL instruments.

Table 3.  Cross-Sectional and Longitudinal Construct Validity of the CRQ (Selected Measures).

Predictive Properties of the CRQ

Two small studies investigated whether scores on the CRQ predict mortality in patients with COPD Citation[36&37]. In the study by Gerardi and colleagues Citation[[37]], scores on the CRQ dyspnea domain predicted mortality even after adjustment for other covariates in 158 COPD patients followed for an average of 40 months. In a study by Oga et al. Citation[[36]] that suffered from a relatively high loss to follow-up (9%) and low event rates, scores on the dyspnea and emotional function domains and the CRQ total score were statistically significant predictors of all-cause mortality even in patients with mild HRQL impairment. However, this association did not achieve statistical significance after adjustment for age, BMI, FEV1, and other variables. The authors did not describe the variable selection procedure in their regression model, and they unconventionally included other variables even if they were not statistically significant in the multivariable model. Thus, the model selection may have led to unstable estimates of the predictors. Other important limitations of that study are the lack of information on changes in therapeutic management and the lack of adjustment for a cumulative measure of tobacco exposure.

Methodology and Outcome of the MID for the CRQ

Anchor-Based Approaches—Patient Perspectives

Within Patient Global Ratings

Jaeschke and colleagues Citation[[26]] conducted one of the first studies tackling the problem of the MID. Data from 3 clinical trials contributed to the exploration of the interpretability of the CRQ and the Chronic Heart failure Questionnaire (CHQ), an instrument that is almost identical to the CRQ but targets patients with chronic heart failure. In each of these studies, patients completed the CRQ or CHQ at each clinic visit. On all visits but the first, they also completed global ratings of change in their shortness of breath on day-to-day activities, their fatigue level, and how they were feeling emotionally. Patients identified whether they felt better, about the same, or worse. If they felt worse or better, they quantified the change using the following global rating scale: 1 indicates almost the same, hardly any worse/better at all; 2, a little worse/better; 3, somewhat worse/better; 4, moderately worse/better; 5, a good deal worse/better; 6, a great deal worse/better; and 7, a very great deal worse/better.

Using the original definition of the MID, Jaeschke et al. Citation[[26]] classified ratings of 1 to 3 as small changes in function, corresponding to the MID. Ratings of 4 or 5 represented moderate changes, and ratings of 6 or 7 large changes. They noted the corresponding change in the appropriate CRQ or CHQ domain (dyspnea with dyspnea, fatigue with fatigue, emotional function with emotional function) from the previous visit. The mean change in score per question corresponding to a small difference (MID) was consistently around 0.5 (Table 4).
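The grouping described in this paragraph can be sketched as follows. The rating bands (1 to 3 small, 4 or 5 moderate, 6 or 7 large) come from the text; the function names and the paired ratings and change scores are hypothetical and were constructed for illustration, so the resulting means are not study results.

```python
from collections import defaultdict
from statistics import mean

def classify_rating(rating):
    """Map a global rating of change (1-7) to a change category."""
    if 1 <= rating <= 3:
        return "small"
    if 4 <= rating <= 5:
        return "moderate"
    if 6 <= rating <= 7:
        return "large"
    raise ValueError("rating must be between 1 and 7")

def mean_change_by_category(pairs):
    """Average the per-question change score within each rating category."""
    grouped = defaultdict(list)
    for rating, change in pairs:
        grouped[classify_rating(rating)].append(change)
    return {category: mean(values) for category, values in grouped.items()}

# Hypothetical (global rating, change in mean domain score) pairs.
pairs = [(2, 0.4), (3, 0.6), (1, 0.5), (4, 1.0), (5, 0.8), (7, 1.6)]
result = mean_change_by_category(pairs)
print(result)
```

The mean for the "small" category plays the role of the MID estimate in this approach.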

Table 4.  Minimal Important Difference (MID) of the CRQ.

Between Patient Global Ratings

Redelmeier et al. Citation[[38]] used an approach similar to the within-patient global ratings. This approach relied on between-patient ratings in 112 patients who completed the CRQ, were assigned to groups of 5 to 13 patients, and discussed their problems in pairs. After extensive discussions, patients rated their problems on the four CRQ domains as the same, worse (to varying degrees), or less severe (to varying degrees) than those of the individual with whom they had spoken. Patients performed the latter rating privately and were not informed about the rating of the corresponding patient. The approach assumed that the difference in score between patients who rate themselves “a little better” or “a little worse” constitutes the MID. The MID for the individual domains varied considerably because of the small sample size, but the pooled MID across domains was 0.42 (95% confidence interval 0.32 to 0.53) (Table 4). This approach, while practical and patient focused, is limited in that patients may have difficulties describing health status to one another, particularly in areas related to emotional function. In addition, when treating patients, clinicians are interested in the within-patient change over time, and within-patient and between-patient MIDs may differ.

Anchor Based Approaches—Clinical or Life Events as Anchors

Van den Boom et al. Citation[[39]] used the CRQ to evaluate 384 previously undiagnosed patients who were diagnosed with mild COPD after screening as part of the Dutch Monitoring of COPD and Asthma (DIMCA) program. The authors found and confirmed that 99 of these patients had consulted a general practitioner for respiratory problems in the past but remained undiagnosed. The mean score of the CRQ domains was lower in these 99 patients than in the remaining undiagnosed patients on all of the domains (mean difference for dyspnea = 0.65, p < 0.001; fatigue = 0.56, p < 0.002; emotional function = 0.31, p = 0.011 and mastery = 0.25, p = 0.002).

Distribution Based Methods

Standard Error of the Measurement

Wyrwich and colleagues followed 471 outpatients with COPD and compared the standard error of the measurement (SEM), a distribution-based method, with the MID Citation[[22]]. The authors found that the SEM method consistently suggested an MID of the CRQ of approximately 0.5 (Table 4). In addition, the research showed that this methodology yields consistent estimates for the MID across a wide range of HRQL scores on the CRQ and that there was no evidence of important ceiling and floor effects.

Cohen's Effect Size

We previously described the MID for the CRQ based on Cohen's effect size in a sample of 51 patients completing a respiratory rehabilitation program Citation[[30]]. According to Cohen, 0.2 SD units represents a small effect, 0.5 SD units a moderate effect, and 0.8 SD units a large effect Citation[[40]]. The CRQ scores that corresponded to 0.2, 0.5, and 0.8 standard deviation units were as follows: CRQ dyspnea 0.24, 0.61, and 0.98; CRQ fatigue 0.27, 0.67, and 1.08; CRQ emotional function 0.24, 0.60, and 0.96; CRQ mastery 0.24, 0.60, and 0.96.
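The conversion from Cohen's effect size thresholds to score units is a multiplication by the standard deviation, as in this minimal sketch. The function name is hypothetical, and the standard deviation of 1.2 is an assumed value, not one reported here; it was chosen because 0.2, 0.5, and 0.8 times 1.2 give 0.24, 0.60, and 0.96, the values listed above for emotional function and mastery.

```python
def cohen_thresholds(sd):
    """Score changes equal to 0.2, 0.5, and 0.8 SD units, rounded to 2 decimals."""
    multipliers = (("small", 0.2), ("moderate", 0.5), ("large", 0.8))
    return {label: round(k * sd, 2) for label, k in multipliers}

# Assumed standard deviation for illustration only.
thresholds = cohen_thresholds(1.2)
print(thresholds)
```

Unlike anchor-based estimates, these thresholds depend entirely on the spread of scores in the sample at hand.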

Panel Based Methods

Wyrwich and colleagues assembled an expert panel of HRQL investigators, including specialists and generalists, to establish a consensus-based MID for the CRQ Citation[[21]]. All panel members (n = 9) were familiar with the CRQ and received information about the CRQ and material about the previously established MID for the instrument. The authors used a 2-round Delphi process, an in-person meeting, and subsequently an iterative modification of a consensus document that reflected the panel's opinion about the MID of the CRQ. The strategies panel members suggested for determining the MID could be assigned to one of the following four categories: 1) considering change scores of their own patients; 2) applying proportions of patients to scores on the CRQ; 3) triangulating results from earlier studies describing the MID Citation[[22]]Citation[[26]]Citation[[32]]; or 4) using Cohen's effect size.

MID of the CRQ-SAS

Global Ratings

Although most correlations in studies 1 and 2 between the GRΔ and the corresponding CRQ domains were greater than 0.5 (fulfilling criterion 1), preliminary analysis indicated that none of the correlations between the GRΔ and the CRQ fulfilled the other a priori criteria in study 1. We did not observe negative correlations between the pre-treatment scores and the GRΔ in study 1 in which the time period between baseline and follow-up evaluation was approximately 12 weeks. In study 2, the correlations were in the appropriate direction but in general did not fulfill the criteria in regard to magnitude of the correlation except for the emotional function domain. However, the sample size for the analysis of the MID for the emotional function domain was small (n = 3) and yielded confidence intervals that did not allow precise estimation of the MID.

Standard Error of the Measurement

Table 5 shows the MID for the CRQ-SAS based on the SEM methodology, for which sample size considerations are of less concern than for the anchor-based approach. The MIDs are consistent with the estimates for the original CRQ of approximately 0.5 per domain.

Table 5.  Minimal Important Difference (MID) of the CRQ-SAS Based on the Standard Error of Measurement (SEM) Approach.

Discussion

This review describes the key measurement properties of the CRQ, the methodology and outcomes of studies that determined the MID for the CRQ and provides preliminary data for the MID of the CRQ-SAS based on the SEM.

The existing studies that evaluate the MID for the CRQ consistently show estimates around 0.5 for the MID on all domains of the CRQ. Because the estimates derive from different methods, including anchor-based, distribution-based, and panel-based methods, we have confidence in these results. While the true value of the MID will vary slightly with the methods used and may not be exactly 0.50, this number will aid interpretation and communication of results. The strength of our evaluation lies in the systematic review of the available studies regarding the CRQ. This manuscript also describes the key qualities of the original CRQ and the CRQ in self-administered standardized format (CRQ-SAS). Data from two randomized studies comparing the original CRQ with the CRQ-SAS show that the CRQ-SAS maintains the excellent measurement properties of the original instrument Citation[[17]]Citation[[41]].

As expected, the estimates for the MID of the CRQ-SAS are similar to those of the original instrument. There is little reason to believe that the MID should differ by administration mode (self- versus interviewer-administered). In fact, we derived MID estimates for the CRQ-SAS from the SEM described by Wyrwich et al. Citation[[22]] that were similar to the MID of the interviewer-administered version. Since in our two studies the correlations between the GRΔ and the CRQ-SAS domains did not fulfill our a priori criteria for transition ratings or were based on small samples, additional work is required to determine the MID for the CRQ-SAS based on these transition ratings.

Future Directions

In general, our view of the MID is one that is strictly related to patients' experiences and establishing MIDs for physiologic outcomes is, therefore, rarely of importance unless these outcomes are closely correlated with patient-important outcomes Citation[[42]]. Thus, MID research should focus on patient-important outcomes. As a result of our current review of the literature and the workshop discussions, we established the following research objectives.

CRQ

Because the total score of the CRQ is frequently used but its properties remain unexplored, we consider the exploration of total score properties an important objective in the work with the CRQ. Because large sample sizes will help to explore this aim, and because the workshop reached consensus that exploration of large data sets is an important objective, we will request data from large studies. Candidate data sets are those described in the studies by Mahler et al. Citation[[7]] and Wyrwich and colleagues Citation[[22]].

The second important aim is the establishment of the MID for the CRQ-SAS. Although the preliminary data based on the SEM approach indicate that the MID of the CRQ-SAS is similar to that of the original CRQ, we require further studies. Anchor-based methods using scores from other established HRQL instruments such as the SF-36 and SGRQ will allow us to conduct this comparison, as will new original research. We described a priori criteria for the use of transition ratings in this manuscript.

Presentation and Interpretation

Little is known about how the presentation of statistical HRQL information affects patients' and clinicians' decision making. One possible way of presenting results of HRQL outcomes is the proportion of patients achieving varying degrees of treatment benefit relative to patients in the control group, or the corresponding number needed to treat (NNT). At present, however, there is no information about recall, interpretation, communication, and framing of HRQL information. Therefore, we require studies that systematically evaluate these factors. Health decision aids are an important tool for studying these issues. In addition to the NNT, absolute risk reduction (ARR), and natural frequencies (e.g., 3 per 1000), the mean time to achieve an MID may be an appropriate way of expressing HRQL outcomes. Furthermore, we need to explore the properties and improve the performance of transition ratings.
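One way such a presentation could be computed is sketched below: the proportion of patients whose change score reaches the MID is compared between groups, and the NNT is the reciprocal of that difference. This is an illustrative assumption about how the calculation might be set up, not a method taken from the reviewed studies; the function names and change scores are hypothetical, and only the threshold of 0.5 comes from the text.

```python
def proportion_reaching_mid(changes, mid=0.5):
    """Share of patients whose change score is at least the MID."""
    return sum(change >= mid for change in changes) / len(changes)

def nnt_for_mid(treated_changes, control_changes, mid=0.5):
    """NNT = 1 / (difference in proportions reaching the MID)."""
    difference = (proportion_reaching_mid(treated_changes, mid)
                  - proportion_reaching_mid(control_changes, mid))
    if difference <= 0:
        raise ValueError("no benefit in the treated group; NNT is undefined")
    return 1.0 / difference

# Hypothetical change scores in mean domain score for ten patients per group.
treated = [0.6, 0.7, 0.2, 0.9, 0.5, 0.1, 0.8, 0.3, 0.55, 0.0]
control = [0.1, 0.5, 0.0, 0.6, -0.2, 0.3, 0.2, 0.1, 0.0, 0.4]

nnt = nnt_for_mid(treated, control)
print(f"NNT = {nnt:.1f}")
```

In practice the NNT is usually rounded up to a whole number of patients when reported.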

Psychometric Properties of Global Ratings

The use of global ratings requires additional qualitative research to investigate the cognitive process that individuals use to retrospectively assess changes in their health over time. Short-term, high-efficacy studies would be ideal to study these outcomes. Even recovery from exacerbations occurs slowly and may not be an ideal testing ground. Thus, studies about short-term recall may have to await the development of efficacious agents in COPD and are currently restricted to respiratory rehabilitation. Until then, however, one approach may be the use of reminders about past health states to ground the patients' ratings in their prior state. Approaches to establish these descriptions include the use of video techniques or patients' narratives of their own health collected at baseline. This technique may be defined as “prospective global ratings.”

Validation of MID

Finally, one needs to establish the MID across interventions. It is not clear which intervention-specific experiences patients include in their global ratings, even if performed for a specific domain, and it is possible that the MID varies not only according to the basic methodology used but also across interventions. Although the MID should be robust against such influences, this question deserves further consideration. Evaluating with global ratings whether small deteriorations are similar to small improvements requires large sample sizes, and these issues also become important when the MID is evaluated for different degrees of severity.

Summary

In summary, the MID for the CRQ is around 0.5 for the mean domain scores. Future studies should focus on the properties of the CRQ total score and the MID of the CRQ-SAS. Additional work is required on the presentation of HRQL information and the use of transition ratings (Table 6).

Table 6.  Summary Table.

REFERENCES
