Reply to the concerns raised by McKenna and Heaney about COSMIN

Pages 857-859 | Received 18 May 2021, Accepted 23 Jun 2021, Published online: 12 Jul 2021

Dear Editor,

We would like to respond to the paper by McKenna and Heaney [1], in which they express their concerns regarding the COSMIN methodology [2]. We appreciate the attention to our work and, more generally, to the quality of outcome measurement instruments. We very much agree on many of the issues raised. Assessing the quality of measurement instruments is a challenging task that requires knowledge of the construct to be measured, the patient population that is being measured, and the methods of quality assessment of measurement instruments. Therefore, the COSMIN guidelines strongly recommend that systematic reviews on outcome measurement instruments be conducted by a multidisciplinary team consisting of experts on all these aspects [3]. High-quality systematic reviews can provide a comprehensive overview of the measurement properties of existing PROMs and support evidence-based recommendations on the selection of the most suitable PROM available [2]. As these reviews are challenging, complex, and time-consuming to conduct, we developed a methodology that allows systematic and transparent processing of all evidence. In addition to the COSMIN methodology for conducting systematic reviews on PROMs [2], including the COSMIN Risk of Bias checklist for PROMs [4,5], we developed the COSMIN Risk of Bias tool for assessing reliability and measurement error for any type of instrument [6], the COSMIN Study Design checklist [7], and the COSMIN reporting guideline for studies on measurement properties of PROMs [8]. Note that each checklist (often accompanied by a peer-reviewed publication and a user manual) has its own specific purpose.

McKenna and Heaney point to the importance of using conceptual models when developing measurement instruments and evaluating their measurement properties, and they suggest that the COSMIN methodology does not pay enough attention to defining the construct by grounding it in a theory [1]. However, in the COSMIN Risk of Bias checklist, standards 1 and 2 of Box 1 (PROM development) refer to the clarity of the construct and the use of a conceptual model, respectively [5]. Moreover, in the COSMIN methodology, defining the construct is the starting point of a systematic review. Authors of reviews should clearly define the scope of the review in terms of the construct, the target population, and the context of use [9]. This scope is the reference point for assessing content validity [9] and for formulating hypotheses for assessing construct validity or responsiveness [3]. It is up to the reviewers to decide whether they consider a construct to be clearly described and whether the origin of the construct is clear [9]. However, this is difficult to determine. We agree with McKenna and Heaney that reviewers should generally be stricter in this judgement.

A second point raised by these authors concerns the use of unidimensional scales, suggesting that the COSMIN methodology ignores its importance [1]. However, COSMIN also requires evidence on the unidimensionality of (sub)scales based on a reflective model. In the COSMIN methodology [3,9], each subscale of a multi-dimensional PROM should be considered a separate measurement instrument. Moreover, internal consistency should be assessed only for unidimensional (sub)scales (standard 1 of Box 4, Internal Consistency), and the assessment of internal consistency (by means of Cronbach's alpha) can only be rated sufficient when there is proof that the scale is unidimensional (criteria for sufficient internal consistency; see Table 1 of Prinsen et al. [2]).
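
For readers who wish to see this criterion in practice, a minimal sketch of Cronbach's alpha for a single unidimensional (sub)scale is given below. The code and the item scores are hypothetical and purely illustrative; they are not part of the COSMIN materials.

```python
import numpy as np

def cronbach_alpha(items: np.ndarray) -> float:
    """Cronbach's alpha for an (n_respondents x k_items) matrix of item scores."""
    k = items.shape[1]
    item_variances = items.var(axis=0, ddof=1)      # variance of each item
    total_variance = items.sum(axis=1).var(ddof=1)  # variance of the sum score
    return (k / (k - 1)) * (1 - item_variances.sum() / total_variance)

# Hypothetical scores of 5 respondents on a 4-item (sub)scale
scores = np.array([
    [3, 4, 3, 4],
    [2, 2, 3, 2],
    [5, 4, 5, 5],
    [1, 2, 1, 2],
    [4, 4, 4, 3],
])
print(round(cronbach_alpha(scores), 2))
```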

McKenna and Heaney clearly explain the advantages of developing PROMs using Rasch or IRT analyses [1]. Indeed, the CLIQ PROM [10] developed by McKenna and colleagues was based on qualitative studies in combination with Rasch analyses. We agree that the combination of these two approaches is the best way to develop instruments. In the COSMIN methodology, we therefore consider content validity first and explicitly label it as the most important measurement property [2]. Furthermore, when assessing content validity, the development of the PROM is explicitly taken into account [5]. The standards for assessing the quality of the development of a PROM, or of a study on content validity, refer to qualitative methods. The next step is to assess structural validity [2]. For this purpose, we developed standards and criteria for studies using factor analysis methods and for studies using IRT or Rasch methods. Nevertheless, when evaluating an instrument, we think IRT or Rasch methods and CTT methods complement each other, as each provides a different kind of information on the quality of an instrument. Moreover, the goal of a systematic review of measurement instruments is often to select the most suitable instrument available for use. A systematic review will obviously be restricted to existing instruments and studies, which are still often based on CTT. However, in recent years there has been an increase in studies using IRT and Rasch methods. We acknowledge that the COSMIN standards and criteria for these methods should be developed further. We also agree with McKenna and Heaney's call to develop instruments based on IRT or Rasch to achieve scales with interval- or ratio-level data.
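
As a brief illustration of why Rasch-based scales yield measurements on a common logit (interval-level) metric, the sketch below shows the dichotomous Rasch model, in which the probability of a positive item response depends only on the difference between person ability and item difficulty. The parameter values are hypothetical and purely illustrative.

```python
import math

def rasch_probability(theta: float, b_i: float) -> float:
    """Probability of a positive response under the dichotomous Rasch model.

    theta: person ability (logits); b_i: item difficulty (logits).
    """
    return 1.0 / (1.0 + math.exp(-(theta - b_i)))

# Hypothetical values: ability 0.5 logits, item difficulty -1.0 logits
print(round(rasch_probability(0.5, -1.0), 2))  # ≈ 0.82
```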

Next, McKenna and Heaney suggest that the COSMIN checklist is based on opinion and not on evidence. The COSMIN checklist was developed using Delphi studies [5,6,11]. The proposals presented to the experts were based on existing methodological literature. We invited measurement experts to contribute to the development and asked them, after each proposal, to give the arguments and/or literature supporting their opinion. These arguments and methodological literature were carefully considered when developing the standards and criteria. So, while we agree that opinions were indeed the foundation of the COSMIN methodology, we would like to emphasize that these were the opinions of experts who weighed and considered the arguments given for these opinions until consensus was reached.

A legitimate concern McKenna and Heaney raise is the difference in conclusions drawn by different authors of systematic reviews using the COSMIN methodology [1]. The question is whether this is due to the COSMIN methodology or to the expertise and use of COSMIN by these authors. The COSMIN Risk of Bias checklist is not a cookbook: expertise is required and subjective judgement is sometimes needed. Again, we point to the requirement of a review team with broad expertise, not only on the construct and patient population but also on the methodology of the types of studies included, covering both CTT and IRT or Rasch based studies. Furthermore, we support users as much as we can by teaching courses on clinimetrics [12,13], and we are currently developing an online course on conducting systematic reviews using the COSMIN methodology. To further increase the quality of systematic reviews of measurement instruments, we recommend that users of the COSMIN methodology be transparent about the methods used, including publishing the search strategy used to find studies. We further recommend publishing all results found in the included studies (per measurement property, per measurement instrument) in extensive tables, for which we have provided examples (see Appendix 7 of the user manual [3]). This will contribute to the transparency of the conclusions drawn and allow end users of the review to formulate their own conclusions.

In addition to these important points raised by McKenna and Heaney, we would like to clarify a misunderstanding they seem to have about the COSMIN methodology. The final rating on the COSMIN Risk of Bias checklist is a worst-score-counts rating per study per measurement property [14]. We did not state that a PROM should have passed 75% of the checklist, as McKenna and Heaney wrongly inferred [1]. What we recommend, after making this overview of all available evidence, is to categorize the instruments into (A) PROMs that have the potential to be recommended as the most suitable PROM for the construct and population of interest; (B) PROMs that may have the potential to be recommended but require further validation studies; and (C) PROMs that should not be recommended (because of high-quality evidence for insufficient measurement properties) [2].
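
As a minimal sketch of the worst-score-counts principle (with hypothetical ratings; the labels follow the COSMIN four-point rating scale of very good, adequate, doubtful, and inadequate), the overall methodological quality of a study on a measurement property equals the lowest rating given to any of the standards in the corresponding box:

```python
# Rating scale ordered from worst to best
RATING_ORDER = ["inadequate", "doubtful", "adequate", "very good"]

def overall_study_rating(standard_ratings: list[str]) -> str:
    """Worst score counts: the overall rating equals the lowest standard rating."""
    return min(standard_ratings, key=RATING_ORDER.index)

# Hypothetical ratings of the standards for one measurement property in one study
print(overall_study_rating(["very good", "adequate", "doubtful"]))  # -> "doubtful"
```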

Assessing the quality of PROMs and other measurement instruments is complex. COSMIN aims to develop a methodology and practical tools to improve the quality of outcome measurement instruments used in research and clinical practice. We welcome suggestions to further improve the COSMIN materials to contribute to that goal and we invite others to join us in the effort.

Transparency

Declaration of funding

No funding was received to produce this article.

Declaration of financial/other relationships

LBM, CBT, and HCWV receive royalties for the book Measurement in Medicine to which we refer in the letter. The royalties are transferred to our university to enable us to do more research.

Acknowledgements

None.

References