Review Article

The cross-cultural equivalence of participation instruments: a systematic review

Pages 1256-1268 | Received 09 Dec 2011, Accepted 13 Sep 2012, Published online: 21 Jun 2013
 

Abstract

Purpose: Concepts such as health-related quality of life, disability and participation may differ across cultures. Consequently, when assessing such a concept using a measure developed elsewhere, it is important to test its cultural equivalence. Previous research suggested a lack of cultural equivalence testing in several areas of measurement. This paper reviews the process of cross-cultural equivalence testing of instruments to measure participation in society. Methods: An existing cultural equivalence framework was adapted and used to assess participation instruments on five categories of equivalence: conceptual, item, semantic, measurement and operational equivalence. For each category, several aspects were rated, resulting in an overall category rating of ‘minimal/none’, ‘partial’ or ‘extensive’. The best possible overall study rating was five ‘extensive’ ratings. Articles were included if the instruments focussed explicitly on measuring ‘participation’ and were theoretically grounded in the ICIDH(-2) or ICF. Cross-validation articles were only included if it concerned an adaptation of an instrument developed in a high or middle-income country to a low-income country or vice versa. Results: Eight cross-cultural validation studies were included in which five participation instruments were tested (Impact on Participation and Autonomy, London Handicap Scale, Perceived Impact and Problem Profile, Craig Handicap Assessment Reporting Technique, Participation Scale). Of these eight studies, only three received at least two ‘extensive’ ratings for the different categories of equivalence. The majority of the cultural equivalence ratings given were ‘partial’ and ‘minimal/none’. The majority of the ‘none/minimal’ ratings were given for item and measurement equivalence. Conclusion: The cross-cultural equivalence testing of the participation instruments included leaves much to be desired. A detailed checklist is proposed for designing a cross-validation study. 
Once a study has been conducted, the checklist can be used to ensure comprehensive reporting of the validation (equivalence) testing process and its results.

Implications for Rehabilitation

  • Participation instruments are often used in a different cultural setting than the one they were initially developed for.

  • The conceptualization of participation may vary across cultures. Therefore, cultural equivalence – the extent to which an instrument is equally suitable for use in two or more cultures – is an important concept to address.

  • This review showed that the process of cultural equivalence testing of the included participation instruments was often addressed insufficiently.

  • Clinicians should be aware that applying a participation instrument in a culture other than the one it was developed for requires prior testing of cultural validity in the new context.

Declaration of Interest: The authors report no conflicts of interest.

Appendix 1

Assessment reporting on cultural equivalence (adapted from Bowden & Fox-Rushby, 2003; Herdman et al., 1998)

Appendix 2

Checklist for assessment of reporting on cultural equivalence (based on Bowden & Fox-Rushby, 2003; Herdman et al., 1998; Terwee et al., 2007; and Mokkink et al., 2010)

General information

  • Name of instrument

  • Initial study (language)

  • ° Authors

  • ° Journal

  • ° Article title

  • ° Location

  • ° Disease/condition (and intervention) studied

  • Cross-cultural validation study (language)

  • ° First author

  • ° Journal

  • ° Article title

  • ° Location

  • ° Disease/condition (and intervention) studied

Methodological details

  • Sample characteristics

  • Sample size

  • Sampling frame

  • Method of selection

  • Aim of study

  • Other measures used during the study

Conceptual equivalence

  • In what ways were the local populations’ conceptualizations of participation assessed?

  • ° Local literature

  • ° Local questionnaires/instruments

  • ° Discussion amongst researchers

  • ° Involvement of anthropologists, sociologists, etc.

  • ° Discussion with local people

  • ° Other

‘Local population’s’ conceptualization was rated positive if 50% of the subcategories received a positive rating (three out of six).

  • Were people from the target population asked to judge the appropriateness of the instrument? Was a detailed discussion of the instrument’s appropriateness provided in the article, or were the domains identified as important by local people covered in the instrument?

  • Were any theoretical arguments presented questioning or accepting conceptual equivalence?

  • ° Conceptual framework described in relation to the local concept under investigation

  • ° Definition of the main construct

  • ° Discussion of possible between-group differences related to construct

  • ° Discussion of possible cultural differences related to the construct

‘Theoretical arguments’ was rated positive if two out of four subcategories received a positive rating.

Conceptual equivalence was rated ‘extensive’ if two out of three categories were rated positively. The rating ‘partial’ was assigned if one out of the three categories received a positive rating. If no or minimal information was provided concerning conceptual equivalence a ‘none/minimal’ rating was given.

Item equivalence

  • Does the report mention how the authors assessed the relevance and acceptability of the individual items for the target population?

  • Are the relevancy and acceptability of items discussed in the light of any quantitative or qualitative analyses?

  • Were any adaptations necessary and was this discussed properly regarding individual items?

Item equivalence was rated ‘extensive’ if two out of three categories were rated positively. The rating ‘partial’ was assigned if one out of the three categories received a positive rating. If no or minimal information was provided concerning item equivalence, a ‘none/minimal’ rating was provided.

Semantic equivalence

  • Were the initial developers of the scale contacted and what was the nature of the contact?

  • Was a translation protocol followed or a user manual including translation instructions?

  • Were any details about the translation procedure provided?

  • ° Description of the translators

  • ° Was the translation procedure adequate? (translation and back translation, native speakers, with and without knowledge of the particular topic)

  • ° Was the translation checked with the target population?

  • ° Was the translation quality judged by experts or researchers?

A positive rating was provided for ‘translation procedure’ if at least two out of the four subcategories were rated as positive.

  • Was the initial meaning of key words and phrases investigated and if yes, how was this done?

  • Were there any problems or difficulties reported with the translation?

Semantic equivalence was rated ‘extensive’ if at least three out of five categories were rated positively. The rating ‘partial’ was assigned if two out of five categories received a positive rating. If no or minimal information was provided concerning semantic equivalence a ‘none/minimal’ rating was provided.

Operational equivalence

  • What was the percentage missing data and what action was taken if the percentage was too high (>25% per item)?

  • Was the same administration format used?

  • ° Was a description provided about the literacy rates or educational level of the target population?

  • ° Was the suitability of the questionnaire format discussed?

  • ° Was the appropriateness of the item format evaluated and discussed?

  • ° Was the appropriateness of the response options evaluated and discussed?

  • ° Were instructions for interviewers available?

A positive rating was provided for ‘administration format’ if at least two out of the five subcategories were rated as positive.

  • Was the instrument pre-tested before use?

Operational equivalence was rated ‘extensive’ if all three categories were rated positively. The rating ‘partial’ was assigned if one or two out of three categories received a positive rating. If no or minimal information was provided concerning operational equivalence a ‘none/minimal’ rating was provided.
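The missing-data criterion above (>25% per item) is straightforward to operationalize. As an illustration only (the review prescribes no software, and the data below are invented), a minimal Python sketch that flags items whose missing-response rate exceeds the threshold, assuming missing answers are coded as `None`:

```python
def missing_rate(responses):
    """Per-item percentage of missing answers (None) in a
    respondents-by-items response matrix."""
    n = len(responses)          # number of respondents
    k = len(responses[0])       # number of items
    return [100 * sum(row[i] is None for row in responses) / n
            for i in range(k)]

# Hypothetical data: 4 respondents, 3 items
answers = [[4, None, 3], [None, None, 5], [2, None, 4], [5, 1, 3]]
rates = missing_rate(answers)
flagged = [i for i, r in enumerate(rates) if r > 25]
print(rates)    # per-item missing percentages
print(flagged)  # items exceeding the 25% threshold
```

Note that an item with exactly 25% missing data is not flagged, matching the strict ">25%" wording of the checklist.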

Measurement equivalence

How was content validity addressed?

  • ° Is the measurement aim of the instrument described?

  • ° Is the target population described?

  • ° Are the concepts that the instrument intends to measure described?

  • ° Were the target population and researchers or experts involved during item selection and reduction? (Often not applicable during cross-cultural validation)

A positive rating was provided for ‘content validity’ if at least two out of the four subcategories were rated as positive.

How was construct validity of the instrument assessed?

  • ° Were hypotheses formulated a priori and was the expected magnitude range and direction of the expected association stated?

  • ° Was factor analysis applied to an adequate sample size (at least seven times the number of items)?

Construct validity was rated as positive if one out of the two subcategories was addressed.

  • Were test-retest reliability and agreement assessed?

  • ° How was intra- or inter-interviewer reliability assessed, and were the results found adequate (intraclass correlation coefficient (ICC) ≥0.70 or weighted kappa ≥0.70)?

  • ° Did the scale show adequate internal consistency (Cronbach’s alpha of at least 0.70)?

  • ° Were adequate agreement measures provided (e.g. Smallest Detectable Change, Minimal Important Change)?

Test-retest reliability was rated positive if one out of the two subcategories was addressed.
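The internal-consistency criterion above (Cronbach’s alpha of at least 0.70) can be made concrete with a small computation. A minimal sketch, purely illustrative and not part of the review’s methods, using the standard formula alpha = k/(k−1) · (1 − Σ item variances / total-score variance) on an invented respondents-by-items matrix:

```python
from statistics import pvariance

def cronbach_alpha(scores):
    """Cronbach's alpha for a respondents-by-items score matrix,
    using population variances throughout."""
    k = len(scores[0])  # number of items
    item_vars = [pvariance([row[i] for row in scores]) for i in range(k)]
    total_var = pvariance([sum(row) for row in scores])
    return (k / (k - 1)) * (1 - sum(item_vars) / total_var)

# Hypothetical data: 4 respondents answering 3 items on a 5-point scale
data = [[4, 5, 4], [3, 4, 3], [5, 5, 4], [2, 3, 2]]
alpha = cronbach_alpha(data)
print(f"alpha = {alpha:.2f}, adequate: {alpha >= 0.70}")  # adequate: True
```

In a real validation study, alpha would of course be computed on the full translated instrument, typically per subscale.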

  • Were floor and ceiling effects assessed (no more than 15% of respondents scoring the lowest or highest possible score)?

  • How was interpretability assessed and were the results found adequate (at least means and standard deviations of four subgroups provided and/or a Minimally Important Change defined)?

  • How was responsiveness assessed, and were both the methods applied and the results found adequate?

  • Were any Item Response Theory (IRT) methods applied (e.g. Rasch analysis)?

Measurement equivalence was rated ‘extensive’ if at least four out of seven categories were rated positively. The rating ‘partial’ was assigned if two of the seven categories received a positive rating. If no or minimal information was provided concerning measurement equivalence a ‘none/minimal’ rating was provided.

The subcategories were rated positive (+), negative (-), no information available (0) or indeterminate (+/-) (inadequate design or methods used).
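The rating rules throughout this checklist share one shape: count the positively rated (sub)categories and compare the count against two cut-offs. A minimal sketch, purely illustrative; the cut-off pairs below are transcribed from the appendix, and treating each ‘partial’ count as a minimum (rather than an exact count) is our reading where the text is ambiguous:

```python
def category_rating(positives, cutoffs):
    """Map the number of positively rated (sub)categories to an overall
    rating, given (extensive_min, partial_min) cut-offs."""
    extensive_min, partial_min = cutoffs
    if positives >= extensive_min:
        return "extensive"
    if positives >= partial_min:
        return "partial"
    return "none/minimal"

# Cut-offs (extensive_min, partial_min) as stated in the appendix:
CUTOFFS = {
    "conceptual":  (2, 1),  # out of 3 categories
    "item":        (2, 1),  # out of 3
    "semantic":    (3, 2),  # out of 5
    "operational": (3, 1),  # out of 3 (all three for 'extensive')
    "measurement": (4, 2),  # out of 7
}

print(category_rating(2, CUTOFFS["item"]))         # 'extensive'
print(category_rating(3, CUTOFFS["measurement"]))  # 'partial'
```

The best possible overall study rating, five ‘extensive’ ratings, then corresponds to meeting the first cut-off in every row of the table.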
