Assessment Procedures

Structural validity and construct validity of the Dutch-Flemish PROMIS® physical function-upper extremity version 2.0 item bank in Dutch patients with upper extremity injuries

Pages 1176-1184 | Received 09 Apr 2019, Accepted 31 Jul 2019, Published online: 14 Aug 2019

Abstract

Introduction

The aim of this study was to validate the Dutch-Flemish Patient-Reported Outcomes Measurement Information System Physical Function – Upper Extremity version 2.0 item bank in patients with upper extremity injuries.

Materials and methods

Cross-sectional study. Structural validity was assessed using Confirmatory Factor Analysis examining unidimensionality. In addition, a bi-factor model was fitted. Internal consistency was assessed by Cronbach’s alpha. Construct validity was examined by assessing correlations with legacy instruments Disability of Arm Shoulder and Hand, Patient Reported Wrist Evaluation and Michigan Hand Questionnaire subscale Activities in Daily Life.

Results

A total of 303 patients (144 female) with a mean age of 50 years (standard deviation 18) were included. Confirmatory Factor Analysis showed a Comparative Fit Index of 0.94, a Tucker Lewis Index of 0.93, a Root Mean Square Error of Approximation of 0.10 and a Standardized Root Mean Residual of 0.09. Factor loadings were all above 0.70. Bifactor analysis showed an omega-H of 0.79 and an Explained Common Variance of 0.67. The correlations with the legacy instruments were as expected or higher than expected.

Conclusion

The Dutch-Flemish Patient-Reported Outcomes Measurement Information System Physical Function – Upper Extremity version 2.0 item bank measures a unidimensional trait and sufficient construct validity was found.

    IMPLICATIONS FOR REHABILITATION

  • Completing Patient Reported Outcomes is time-consuming for patients and interpretability of outcomes is sometimes unclear due to some variation in psychometric properties.

  • Computerized Adaptive Testing reduces the burden for patients by using an algorithm that decreases the number of questions that need to be answered to between 4 and 7 items.

  • The Dutch-Flemish Patient-Reported Outcomes Measurement Information System Physical Function – Upper Extremity version 2.0 item bank measures a unidimensional trait and has sufficient structural validity, internal consistency and construct validity.

  • After calibration of the Patient-Reported Outcomes Measurement Information System Physical Function – Upper Extremity version 2.0, the item bank is operable to use with Computerized Adaptive Testing.

Introduction

Every year many people suffer from musculoskeletal injuries. A recent study performed in the Netherlands showed a prevalence varying from 20 to 56% [Citation1]. Upper extremity injuries in particular form a major problem for society. They have a considerable impact on physical health, but also on work, daily activities, participation, and health care costs. Huisstede et al. showed that the prevalence of upper extremity injuries varies between 1.6 and 53% [Citation2]. Exact prevalence numbers are lacking, however, probably due to different ways of defining injuries of the upper extremity and differences in rehabilitation strategies [Citation2].

In daily clinical practice, patients’ rehabilitation outcome after upper extremity injury is objectified using clinical measurements, e.g., grip strength, range of motion, and radiological parameters. Other aspects such as pain, activity limitations, and participation restrictions are not taken into account by these traditional methods [Citation3]. Nowadays, Patient-Reported Outcomes (PROs) are used more frequently, to consider both patient- and expert opinion-based outcomes. Using PROs improves communication between patient and expert, which in turn improves treatment and rehabilitation strategies [Citation4,Citation5].

Most frequently used PROs for upper extremity injuries include the Disability of the Arm, Shoulder and Hand (DASH) [Citation6] questionnaire, the Patient-Rated Wrist Evaluation (PRWE) questionnaire [Citation7], and the Michigan Hand Outcomes Questionnaire (MHQ) [Citation8]. In general, there is some variation in the psychometric properties, and the concepts measured are not always well defined [Citation9–12]. In addition, completing PROs is time-consuming for patients, and interpretability of the outcomes is sometimes unclear [Citation9–12].

Because of these challenges, the Patient-Reported Outcomes Measurement Information System (PROMIS) was initiated by six US research centers and the National Institutes of Health. PROMIS contains a series of item banks with items focused on function or pain, to measure outcomes from the patients’ perspective across medical conditions [Citation13]. The goal was to improve measurement quality and comparability of PROs and reduce patients’ burden. Item banks were developed and validated for measuring specific symptoms and health status domains [Citation13,Citation14].

The PROMIS Physical Function item bank has been developed, validated and calibrated in the English language [Citation15,Citation16]. To use the PROMIS item bank in the Netherlands and Belgium, the item bank was translated into the Dutch-Flemish language by the Dutch-Flemish PROMIS Group [Citation17]. The first version of the Dutch-Flemish PROMIS Physical Function item bank (v1.2) contained 121 items [Citation15–17]. This item bank has been validated in Dutch patients with chronic pain, patients undergoing physiotherapy, and patients with rheumatoid arthritis [Citation18–20]. The PROMIS Group continued to improve the item bank, and a separate item bank for upper extremity injuries was developed, containing 46 items: the PROMIS® Physical Function – Upper Extremity (UE) v2.0 item bank [Citation21,Citation22]. Eventually, the item bank will be used as a Computerized Adaptive Test (CAT). The CAT system uses an algorithm that selects questions from the item bank based on the patient’s responses to the previous questions. When a predefined precision is reached, the system automatically stops asking questions. The benefit of a CAT is that the number of questions that need to be asked can be reduced to between 4 and 7 items [Citation23].

The aim of this study was to validate the Dutch-Flemish PROMIS UE v2.0 item bank. Structural validity and internal consistency must be sufficient before the item bank can be used as a CAT. This helps to incorporate the patients’ perspective on outcome following upper extremity injury and rehabilitation.

Materials and methods

Design

For this cross-sectional study, patients with upper extremity injuries were recruited at the outpatient clinic of a level 1 trauma center in the Netherlands from May until July 2018. All patients were treated, either conservatively or surgically. Online written informed consent was obtained.

Inclusion criteria were: an injury of the upper extremity, age of 18 years or older at the moment of completing the questionnaire, and sufficient understanding of the Dutch language in reading and writing. The exclusion criterion was insufficient knowledge of the Dutch language. Incomplete questionnaires were not included in the analysis.

Dispensation for medical ethical approval was granted by the local Medical Ethics Committee [2018.259]. In addition, the study was performed in compliance with the principles outlined in the Declaration of Helsinki on ethical principles for medical research involving human subjects [Citation24].

Methods of measurement

The international PROMIS guidelines for instrument development and validation were used, which serve as the scientific foundation for questionnaire development and validation [Citation25]. The guidelines prescribe the following steps. First, translation has to take place. Second, cognitive debriefing needs to be performed to ensure understanding and readability of the translated questions. Third, validation should be performed [Citation25].

Procedure

Patients were requested to complete an online questionnaire, containing 4 questionnaires (108 items in total): the Dutch-Flemish PROMIS UE v2.0 item bank (containing 46 items), the DASH (containing 30 items), the PRWE (containing 15 items) and the MHQ subscale Activities in Daily Life (MHQ-ADL, containing 17 items). The complete questionnaire was built in “Survalyzer®”, which is an online survey software program.

Following informed consent, the included patients completed the questionnaires on an electronic device (e.g., iPad, smartphone, computer), during their visit to the outpatient clinic or at home through an email with the link to the web-based questionnaire. Patients who were unable to use an electronic device could complete a paper version of the questionnaire. If applicable, a reminder by email was sent following the initial invitation. The estimated time to complete the online questionnaire was calculated to be about 13 min, based on an average expected response time per item of 7 s (108 × 7 s) [Citation26].
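The item count and the completion-time estimate above can be reproduced with a few lines of arithmetic. This is only an illustrative sketch; the variable names are ours, and the 7 s per item is the expected response time quoted in the text.

```python
# Items per instrument, as listed in the Procedure section.
ITEMS = {"PROMIS UE v2.0": 46, "DASH": 30, "PRWE": 15, "MHQ-ADL": 17}
SECONDS_PER_ITEM = 7  # expected average response time per item (from the text)

total_items = sum(ITEMS.values())
total_minutes = total_items * SECONDS_PER_ITEM / 60

print(total_items, round(total_minutes, 1))  # 108 12.6
```

The result of 12.6 min matches the "about 13 min" reported above.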

Measurements

The online questionnaire also contained questions addressing demographic and clinical characteristics. Demographic characteristics asked were age, gender, country of birth, expected duration of rehabilitation, and educational level. In addition, if present a pending compensation claim was recorded. Clinical characteristics that were asked were dominance, trauma intensity (mono-trauma or poly-trauma), disease duration at the moment of inclusion and the presence of other pain complaints.

The questionnaire of specific interest was the Dutch-Flemish PROMIS UE v2.0 item bank. The first version (v1.2) of the Physical Function item bank measures general physical function, of both upper and lower extremity and health condition together [Citation16]. A specific item bank focusing on upper extremity limitations was developed. Forty-two relevant items of the v1.2 item bank were reused, and 4 new upper extremity functioning questions were developed and translated into Dutch-Flemish. Following translation, cognitive debriefing was performed to evaluate the comprehensibility and relevance of the items. Cognitive interviews were conducted for all 46 items with at least 5 native Dutch and at least 5 native Flemish patients and people from the general population (submitted for publication).

The current version of the PROMIS UE v2.0 item bank contains 46 items focusing on limitations in activities that require the upper extremity, with two different 5-point Likert response scales (Table 1). Scores of all PROMIS measures are expressed as T-scores, where a score of 50 represents the average of the (US) general population, with a standard deviation (SD) of 10. Higher scores indicate better function.
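The T-score metric described above is a linear rescaling of the latent trait estimate (theta), which is expressed in SD units of the reference population. The sketch below illustrates that conversion; the function names are hypothetical and not part of any PROMIS software.

```python
def theta_to_t_score(theta: float) -> float:
    """Rescale theta (mean 0, SD 1 in the reference population) to a T-score."""
    return 50.0 + 10.0 * theta


def t_score_to_theta(t_score: float) -> float:
    """Inverse conversion: T-score back to SD units."""
    return (t_score - 50.0) / 10.0


# A T-score of 40 lies one reference-population SD below the average.
print(t_score_to_theta(40.0))  # -1.0
```

On this metric, the mean T-score of 33.4 reported in the Results lies about 1.7 SD below the US general population average.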

Table 1. PROMIS physical function - upper extremity v2.0 item bank.

As mentioned earlier, 3 disease-specific legacy instruments were used in this study: DASH, PRWE and MHQ-ADL.

The DASH questionnaire contains 30 items, specifically addressing disabilities and symptoms in injuries of the upper extremity (Table 2) [Citation6]. The questions use 5-point Likert response scales, ranging from no problems with functioning at all (1 point) to completely unable to function (5 points). The total score ranges from 0 (no disability) to 100 (most severe disability). The timeframe for the items is “during the past week.” Both the primary English DASH questionnaire and the official Dutch translation have acceptable psychometric properties [Citation11,Citation12,Citation27–31].

Table 2. Legacy instruments.

The Patient Rated Wrist Evaluation (PRWE) questionnaire contains 15 items, focused on wrist injury (Table 2) [Citation7]. Five items specifically address pain and 10 items address function, divided into specific activities and usual activities. The timeframe for the items is “during the past week.” The items assessing pain are rated from no pain (0) to unbearable pain (10), and the function scale is rated from no disability (0) to most disability (10). Higher scores imply worse outcome. Both the primary English PRWE questionnaire and the Dutch translation have acceptable psychometric properties [Citation3,Citation7,Citation32].

The third legacy instrument was the Michigan Hand Outcomes Questionnaire subscale Activities in Daily Life (MHQ-ADL). The complete MHQ contains 57 items [Citation8], divided into 6 subscales: activities of daily living (ADL), overall hand function, pain, work performance, esthetics and patient satisfaction with their hand function. For this study, the subscale MHQ-ADL was chosen, because the ADL subscale is expected to measure the same construct as the Dutch-Flemish PROMIS UE v2.0 item bank (Table 2). The MHQ-ADL contains 17 items, focused on hand or wrist injury [Citation8]: 5 items for the right hand, 5 items for the left hand and 7 items for both hands, all addressing activities of daily living. Scores of the injured side were used for calculations. A 5-point Likert response scale is used: Not difficult at all/A little difficult/Somewhat difficult/Moderately difficult/Very difficult. The timeframe for the items is “during the past week.” The total score per scale is converted to a score ranging from 0 to 100. Higher scores indicate less disability. Both the primary English MHQ and the official Dutch translation have good psychometric properties [Citation8,Citation33–38].

Analysis

Sample size

There are varying views and guidelines to determine the sample size for validating questionnaires [Citation39]. The required sample size for Confirmatory Factor Analysis (CFA) was estimated as at least 300 participants, based on recommendations by Comrey and Lee [Citation40].

Structural validity

Structural validity measures the degree to which the scores of a health-related PRO instrument are an adequate reflection of the dimensionality of the construct to be measured [Citation41]. In this study, the structural validity of the Dutch-Flemish PROMIS UE v2.0 item bank was assessed by confirmatory factor analyses (CFA). A single factor model of the Dutch-Flemish PROMIS UE v2.0 item bank was tested.

Because the item bank is supposed to measure one construct (upper extremity physical function), we expected all items to load on a single factor. Unidimensionality was examined by CFA on the polychoric correlation matrix with Weighted Least Squares with Mean and Variance adjustment (WLSMV) estimation. Model fit was evaluated with the Comparative Fit Index (CFI), Tucker Lewis Index (TLI), Root Mean Square Error of Approximation (RMSEA), and Standardized Root Mean Residual (SRMR). We report scaled fit indices, which are considered more exact than unscaled indices [Citation42]. Following the PROMIS analysis plan [Citation23] and recommendations by Hu and Bentler [Citation43], we considered model fit adequate, and thus the evidence for unidimensionality sufficient, if the CFI was close to 0.95 or higher, the TLI close to 0.95 or higher, the RMSEA close to 0.06 or lower and the SRMR close to 0.08 or lower.
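As an illustration of these criteria, the sketch below compares a set of fit indices with the four cutoffs. It is a simplification: the text allows values "close to" the cutoff, whereas the hypothetical helper `check_fit` applies strict comparisons, so it flags near misses as not meeting the criterion. The example values are the fit indices reported in the Results.

```python
# Cutoffs from the text; "close to" is simplified to a strict comparison (assumption).
CUTOFFS = {
    "CFI": (0.95, "min"),    # higher is better
    "TLI": (0.95, "min"),    # higher is better
    "RMSEA": (0.06, "max"),  # lower is better
    "SRMR": (0.08, "max"),   # lower is better
}


def check_fit(indices):
    """Return, for each fit index, whether the strict cutoff is met."""
    result = {}
    for name, value in indices.items():
        cutoff, kind = CUTOFFS[name]
        result[name] = value >= cutoff if kind == "min" else value <= cutoff
    return result


reported = {"CFI": 0.94, "TLI": 0.93, "RMSEA": 0.10, "SRMR": 0.09}
fit = check_fit(reported)
for name, met in fit.items():
    print(name, reported[name], "meets strict cutoff" if met else "misses strict cutoff")
```

Under strict comparison none of the four reported indices meets its cutoff, which is why the "close to" wording and the additional bi-factor analysis matter for the interpretation.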

If the model did not fit well, a bi-factor model was used to examine whether the scale is unidimensional enough for future Item Response Theory (IRT) analyses. To evaluate the influence of multidimensionality, omega-H and the Explained Common Variance (ECV) were calculated from the bi-factor model. A high omega-H value indicates that a composite score is reflected by a single common source, i.e., one common factor underlies the item responses [Citation44,Citation45]. The ECV is the variance explained by the general factor divided by the variance explained by the general factor and the group factors together. A high omega-H (>0.80) [Citation44] and a high ECV (>0.60) [Citation45] indicate that the risk of biased parameters when fitting multidimensional data into a unidimensional model is low.
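To make the two bi-factor indices concrete, the sketch below computes ECV and omega-H from standardized loadings. It assumes an orthogonal bi-factor model in which every item loads on the general factor and on exactly one group factor. The loadings are made-up illustration values, not the loadings of this study, and the function names are ours.

```python
from collections import defaultdict


def explained_common_variance(general, specific):
    """ECV: general-factor variance over all common (general + group) variance."""
    g = sum(l * l for l in general)
    s = sum(l * l for l in specific)
    return g / (g + s)


def omega_hierarchical(general, specific, group_of_item):
    """Omega-H: share of total-score variance attributable to the general factor."""
    general_part = sum(general) ** 2
    group_sums = defaultdict(float)
    for loading, group in zip(specific, group_of_item):
        group_sums[group] += loading
    group_part = sum(total ** 2 for total in group_sums.values())
    # Unique variance of a standardized item: 1 minus its communality.
    unique_part = sum(1.0 - g * g - s * s for g, s in zip(general, specific))
    return general_part / (general_part + group_part + unique_part)


# Hypothetical loadings for six items spread over two group factors.
general = [0.80, 0.70, 0.75, 0.80, 0.70, 0.75]
specific = [0.30, 0.40, 0.35, 0.30, 0.40, 0.35]
groups = [0, 0, 0, 1, 1, 1]

ecv = explained_common_variance(general, specific)
omega_h = omega_hierarchical(general, specific, groups)
print(round(ecv, 2), round(omega_h, 2))
```

When all group-factor loadings are zero, ECV equals 1, which corresponds to a strictly unidimensional scale.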

In addition, an Exploratory Factor Analysis (EFA) with WLSMV estimation procedures was performed using the R package psych (version 1.7.5) [Citation46]. The first factor in EFA should account for at least 20% of the variability, and the ratio of the variance explained by the first to the second factor needs to be greater than four [Citation23,Citation47].
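The two EFA criteria can be checked directly from the eigenvalues. The sketch below uses the three eigenvalues reported in the Results (29.6, 2.7 and 2.3) and assumes, as a simplification of ours, that the total variance of a 46-item correlation matrix equals the number of items. The helper name is hypothetical.

```python
def first_factor_checks(eigenvalues, n_items):
    """Return (share of variance of the first factor, ratio of first to second eigenvalue)."""
    ordered = sorted(eigenvalues, reverse=True)
    proportion = ordered[0] / n_items  # assumes total variance == number of items
    ratio = ordered[0] / ordered[1]
    return proportion, ratio


proportion, ratio = first_factor_checks([29.6, 2.7, 2.3], n_items=46)
print(proportion > 0.20, ratio > 4, round(ratio, 1))  # True True 11.0
```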

Factor loadings were calculated to represent the relationship of each item to the underlying factor. The factor loading is the correlation between the observed score and the latent score. Factor loadings had to exceed the criterion of 0.50 [Citation40,Citation48,Citation49].

Internal consistency

Internal consistency measures the degree of the interrelatedness among the items [Citation41,Citation50]. The internal consistency of the full item bank as well as the standard 7-item Short Form was determined after conducting a factor analysis. Internal consistency was assessed by calculating Cronbach’s alpha. A Cronbach’s alpha of >0.70 was considered sufficient evidence for internal consistency [Citation51,Citation52].
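For readers unfamiliar with the statistic, the sketch below computes Cronbach’s alpha from a respondents-by-items score matrix using the standard formula: alpha = k/(k − 1) × (1 − sum of item variances / variance of the total score). The data are made up for illustration and are not study data.

```python
from statistics import variance


def cronbach_alpha(scores):
    """scores: list of respondents, each a list with one score per item."""
    k = len(scores[0])
    item_variances = [variance([row[i] for row in scores]) for i in range(k)]
    total_variance = variance([sum(row) for row in scores])
    return k / (k - 1) * (1 - sum(item_variances) / total_variance)


# Hypothetical responses of five patients to three 5-point items.
example = [
    [1, 2, 1],
    [2, 2, 3],
    [3, 4, 3],
    [4, 4, 5],
    [5, 5, 4],
]
alpha = cronbach_alpha(example)
print(alpha > 0.70)  # True
```

Because alpha increases with the number of items, a 46-item bank will tend to show a higher value than a 7-item short form measuring the same trait.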

Construct validity

T-scores were calculated for the Dutch-Flemish PROMIS UE v2.0 item bank, based on response pattern scoring using the US item parameters. The T-scores were correlated (Pearson correlations) to the scores of the legacy instruments. For assessing convergent validity, we hypothesized that the Dutch-Flemish PROMIS UE v2.0 item bank score has a:

  • Hypothesis 1: Strong negative correlation (r ≤ −0.50) with the DASH, given the fact that both instruments are supposed to measure related constructs (UE related physical function and UE physical function and symptoms).

  • Hypothesis 2: Moderately strong negative correlation (−0.50 ≤ r ≤ −0.30) with the PRWE function, because both instruments are supposed to measure hand and wrist related activities and function. Because of this expected correlation, we hypothesized that the PRWE function is more strongly correlated to the Dutch-Flemish PROMIS UE v2.0 item bank than PRWE pain.

  • Hypothesis 3: Moderately strong positive correlation (0.30 ≤ r ≤ 0.50) with the MHQ-ADL, because both instruments are supposed to measure related constructs of upper extremity related physical function and hand related daily activities.

Construct validity was considered sufficient if at least 75% of the correlations were as expected [Citation52].
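The three hypotheses can be written as intervals for the expected correlation and compared with the observed values. The sketch below is our own illustration: the helper `classify` and its labels are hypothetical, and the observed correlations are those reported in the Results.

```python
def classify(r, low, high):
    """Compare an observed correlation r with the hypothesized interval [low, high]."""
    if low <= r <= high:
        return "as expected"
    expected_negative = high <= 0
    if (r < 0) != expected_negative:
        return "opposite direction"
    if abs(r) > max(abs(low), abs(high)):
        return "stronger than expected"
    return "weaker than expected"


# (hypothesized interval, observed correlation from the Results)
hypotheses = {
    "DASH": ((-1.00, -0.50), -0.84),
    "PRWE function": ((-0.50, -0.30), -0.75),
    "MHQ-ADL": ((0.30, 0.50), 0.73),
}

outcome = {name: classify(r, low, high) for name, ((low, high), r) in hypotheses.items()}
for name, label in outcome.items():
    print(name, label)
```

This prints "as expected" for the DASH and "stronger than expected" for the PRWE function and the MHQ-ADL, in line with the description in the Results.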

Results

Study participants

A total of 524 patients with upper extremity injuries were approached (Figure 1). A total of 405 patients were eligible to participate in this study and gave informed consent, of whom 303 (74.8%) completed the online questionnaires. There were no missing values in the completed questionnaires.

Figure 1. Sample size, inclusions and exclusions.


Demographic and clinical characteristics are presented in Table 3. Of the 303 included patients, 159 were male (52%) and 144 were female (48%). The mean age was 50.1 years (SD = 17.5), ranging from 18 to 93 years. A total of 276 patients (91%) were born in the Netherlands.

Table 3. Demographic and clinical characteristics (n = 303).

Regarding educational levels, 68% had at least a high school degree. The largest group (122 patients, 40%) had achieved a college degree and 7 patients (2%) had only a primary school degree. After injury, 174 patients (57%) were able to remain in their job without major adjustments. About 10% declared to be unemployed, and 11% were reported ill due to their injury. Of the total group, 50 patients (17%) had a pending (injury) claim (Table 3). Most included patients had injuries with an acute or semi-acute onset, with a follow-up duration of 1 to 4 months in 30%.

Most reported physical injuries were fractures, tendon injuries and muscle injuries.

Of all injuries, 72% were fractures, mainly distal radius fractures (16%), clavicle fractures (15%) and proximal humeral fractures (12%) (Figure 2). Conservative treatment was maintained in 217 injured patients (72%). The other 86 patients (28%) needed surgical treatment.

Figure 2. Types of injury, pie-chart ratio.


Statistical analysis

Structural validity

With CFA we found a CFI of 0.94, a TLI of 0.93, a RMSEA of 0.10 and a SRMR of 0.09, which were near the reference criteria [Citation23,Citation43–47]. The factor loadings for this model were 0.73 or higher (mean 0.83, range 0.73 to 0.94), which were all above the criterion of >0.50 [Citation40,Citation48]. A bi-factor model was investigated to examine whether the scale is unidimensional enough for future IRT analyses. Omega-H and ECV were 0.79 and 0.67, respectively, which indicated sufficient unidimensionality. In addition, in EFA three factors with eigenvalues greater than 1 were identified. The eigenvalue of the first factor was 29.6, and the eigenvalues of the second and third factors were 2.7 and 2.3, respectively. The ratio of the first to the second factor was 11.0, which is larger than the criterion of 4.

Internal consistency

Cronbach’s alpha for the full Dutch-Flemish PROMIS UE v2.0 item bank was 0.98, which is above the criterion of >0.70 [Citation51]. The Cronbach’s alpha of the PROMIS UE Short Form 7a was 0.90, which is also above the criterion of >0.70.

Construct validity

Mean (SD) scores were 33.4 (9.1) for the PROMIS UE v2.0 (T-score), 35.5 (22.1) for the DASH, 26.0 (15.8) for the PRWE function, 20.9 (14.0) for the PRWE pain, 46.9 (27.1) for the PRWE total and 57.4 (34.8) for the MHQ-ADL total (per injured side) (Table 4). The correlations of the T-scores of the Dutch-Flemish PROMIS UE v2.0 item bank with the scores of the DASH, PRWE function, PRWE pain, PRWE total and MHQ-ADL were –0.84, –0.75, –0.59, –0.74, and 0.73, respectively, with all p values <.001. Only the correlation of the Dutch-Flemish PROMIS UE v2.0 item bank with the DASH met our hypothesis of a strong negative correlation. The correlations of the Dutch-Flemish PROMIS UE v2.0 item bank with PRWE function, PRWE pain, PRWE total and MHQ-ADL were higher than expected.

Table 4. T-scores (n = 303).

Discussion

The aim of this study was to examine the structural validity, internal consistency, and construct validity of the Dutch-Flemish PROMIS UE v2.0 item bank, in order to make it applicable in the outpatient clinic setting. The results show that the Dutch-Flemish PROMIS UE v2.0 item bank measures a unidimensional trait and has sufficient structural validity, internal consistency, and construct validity.

The CFA showed a CFI of 0.94 and a TLI of 0.93, which are lower than, but near, the criterion of >0.95. The SRMR of 0.09 was also near its criterion of <0.08, whereas the RMSEA of 0.10 was higher than the maximum criterion of <0.06 [Citation43]. Inconsistent results have been found for other versions of the PROMIS Physical Function item bank. Rose et al. reported a RMSEA of about 0.08 for subsets of items from the original English PF v1.2 item bank [Citation16]. The Dutch-Flemish PROMIS Physical Function v1.2 item bank showed a RMSEA of 0.122 [Citation19] and 0.045 [Citation20] in previous studies. For the Spanish population in the US, a RMSEA of 0.052 was found for the PROMIS Physical Function v1.2 item bank [Citation53]. The RMSEA was not reported for the US PROMIS UE population [Citation54,Citation55]. It has been suggested that traditional cutoffs and standards for CFA fit statistics are not suitable to establish unidimensionality of item banks measuring health concepts [Citation56] and that the RMSEA is sensitive to model complexity (number of estimated parameters) and skewed data distributions [Citation56], the latter being common in health concepts. Reise et al. have stated that the RMSEA statistic is problematic for assessing unidimensionality of health concepts, and considered the SRMR more promising to determine whether a scale is “unidimensional enough” [Citation45]. The SRMR was 0.09 in our study, slightly higher than the criterion of 0.08. None of the studies on the US PROMIS UE [Citation54,Citation55] or Spanish PROMIS UE item bank [Citation53] reported the SRMR.

All factor loadings for this model were above the criterion of 0.50, with a smallest factor loading of 0.73 [Citation40]. Paz et al. found factor loadings all above 0.70 for the Spanish PROMIS Physical Function v1.2 item bank, except for 2 items (PFC7 and PFA19), but these items are not included in the UE item bank [Citation53]. Hays et al. found PROMIS UE factor loadings all above 0.70 as well in a US population, with a smallest factor loading of 0.85 [Citation57]. These results support the hypothesis of unidimensionality of the PROMIS UE v2.0 item bank. Other authors have recommended fitting a bi-factor model and considering the omega-H and ECV when the RMSEA does not meet the criterion of <0.06 [Citation45,Citation53–55]. The omega-H in our study was 0.79, which was just beneath, but very close to, the criterion of >0.80, which suggests that a composite score is reflected by a single common source. The ECV was 0.67, which is higher than the criterion of >0.60. Together this suggests that the risk of biased parameters when fitting multidimensional data into a unidimensional model was low.

Finally, in the EFA the first factor accounted for more than 20% of the variability and the ratio of the variance explained by the first to the second factor was 11.0. All these results together suggest enough evidence for unidimensionality of the PROMIS UE v2.0 item bank.

Internal consistency

Evidence for sufficient internal consistency was indicated by a Cronbach’s alpha for the entire item bank of 0.98, which is higher than the criterion of >0.70 [Citation41,Citation50,Citation51]. Kaat et al. found an average marginal reliability of 0.90 for the PROMIS UE item bank [Citation54]. Beckmann et al. and Paz et al. both found a Cronbach’s alpha of 0.99 for the PROMIS UE item bank [Citation21,Citation53]. This suggests that the internal consistency found in this study was comparable to the internal consistency found in previous studies. The Cronbach’s alpha of the entire item bank might be this high because the scale includes 46 items. A very high Cronbach’s alpha might suggest redundancy of items, but since the entire item bank will not be used in clinical practice, this does not seem to be a problem. The Cronbach’s alpha of the PROMIS UE Short Form 7a is possibly more relevant, because this short form will be used in clinical practice instead of the entire item bank. The Cronbach’s alpha of the PROMIS UE Short Form 7a was 0.90. Chung et al. found a Cronbach’s alpha of 0.97 for the MHQ-ADL [Citation8]. For the DASH a Cronbach’s alpha of 0.95 was found by van Eck et al. [Citation31]. Cronbach’s alpha was also calculated by El Moumni et al. for the PRWE total, PRWE pain and PRWE function, which was 0.97, 0.94 and 0.96, respectively [Citation3]. The Cronbach’s alphas found for the DASH, PRWE and MHQ-ADL were all relatively high, but comparable, in terms of meeting the criterion of >0.70, to the results found for the Dutch-Flemish PROMIS UE v2.0 Short Form 7a in this study. This suggests that evidence for the internal consistency was sufficient, and comparable to the legacy instruments.

Construct validity

Our first hypothesis regarding the correlation between the Dutch-Flemish PROMIS UE v2.0 item bank and the DASH questionnaire was met; there was a strong negative correlation (–0.84). This result was comparable to results from Beckmann et al. [Citation21]. They found a correlation of –0.80 between the US PROMIS UE v2.0 item bank and the DASH [Citation21]. Kaat et al. found a correlation of −0.82, and Doring et al. found a correlation of –0.81 between the PROMIS UE v2.0 item bank and the Quick DASH [Citation54,Citation58]. Overall, this suggests that the PROMIS UE v2.0 item bank and DASH measure similar constructs.

Our second hypothesis regarding the correlation between the Dutch-Flemish PROMIS UE v2.0 item bank and the PRWE total questionnaire was not met. We found a strong negative correlation (–0.74), instead of the expected moderately strong negative correlation (–0.50 ≤ r ≤ −0.30). For the PRWE pain and PRWE function subscales, correlations of –0.59 and –0.75 were found, supporting the hypothesis that the Dutch-Flemish PROMIS UE v2.0 item bank has a stronger correlation with the PRWE function subscale than with the PRWE pain subscale.

Our third hypothesis regarding the correlation between the Dutch-Flemish PROMIS UE v2.0 item bank and the MHQ-ADL was not met either. A moderately strong positive correlation was expected (0.30 ≤ r ≤ 0.50), but we found a strong positive correlation (0.73). Apparently, the Dutch-Flemish PROMIS UE v2.0 item bank has more content overlap with the PRWE and MHQ-ADL than we expected. This could actually be considered a positive finding because it shows that the Dutch-Flemish PROMIS UE v2.0 item bank is capable of measuring upper extremity related physical function, as well as hand- and wrist-related function, comparable to the DASH, PRWE and MHQ-ADL.

Strengths

A sample size of at least 300 participants was achieved, meeting the recommendations by Comrey and Lee [Citation40]. Patients of all ages and with all kinds of upper extremity injuries were included, which supports the representativeness of the study population. The main upper extremity injuries were distal radius fractures, clavicle fractures and proximal humeral fractures. A study by Beerekamp et al., performed in the Netherlands, estimated the prevalence of extremity fractures in general [Citation59]. The most commonly reported fractures were hand and finger fractures (n = 34,144), wrist fractures (n = 25,432) and clavicle and shoulder fractures (n = 13,264) [Citation59]. Hand and finger fractures included carpal, metacarpal and phalangeal fractures together [Citation59]. In our sample the sum of these injuries was 20% (n = 57). These results were comparable to the results of the Beerekamp sample.

This study was conducted according to the Consensus-based Standards for the selection of health Measurement Instruments (COSMIN) Risk of Bias-checklist. Adequate study design for cross-sectional validity and construct validity were ensured [Citation51,Citation60]. Besides, the international PROMIS guidelines for instrument development and validation were followed [Citation25].

Weaknesses

Because the hospital in which the patients were recruited is a level 1 trauma center, patients with severe and multiple injuries (poly-trauma patients) are overrepresented. Community hospitals treat less severely injured patients, mainly mono-trauma patients. These differences can cause bias in the sample and affect the representativeness of the study sample. Severely injured patients often have multiple fractures and corresponding soft-tissue injuries, which has a negative effect on outcome. This might explain a higher number of severely injured patients with worse outcomes in level 1 trauma centers. The implementation of the Dutch-Flemish PROMIS UE v2.0 in an outpatient clinical environment might be challenging, due to the need for mobile devices and internet access. However, almost all patients own a mobile device and internet-based PROs have been implemented worldwide. We therefore think that once an internet-based questionnaire including the Dutch-Flemish PROMIS UE v2.0 has been implemented in an outpatient clinic, it can decrease the burden for the patient and improve interpretation of outcome considerably.

Clinical interpretation

For daily practice, the Dutch-Flemish PROMIS UE v2.0 item bank is a suitable instrument to measure rehabilitation progress, in comparison with the legacy instruments (DASH, PRWE and MHQ-ADL). The benefit of the Dutch-Flemish PROMIS UE v2.0 item bank is that it can be applied across patient populations, enabling comparison of scores, and it can be used as CAT, which reduces response burden for patients and increases the usability of the Dutch-Flemish PROMIS UE v2.0 item bank in daily practice. Therefore, we recommend future studies to consider using PROMIS instead of the legacy instruments.

Conclusion

This study showed that the Dutch-Flemish PROMIS UE v2.0 item bank measures a unidimensional trait and has sufficient structural validity, internal consistency and construct validity. Future studies should validate and calibrate the item bank further using IRT analysis and should assess other measurement properties, such as test-retest reliability, measurement error and responsiveness. After successful IRT analysis, the Dutch-Flemish PROMIS UE v2.0 CAT will be ready for use.

Disclosure statement

One of the authors is president of the PROMIS Health Organization, but receives no compensation for this position.

Correction Statement

This article has been republished with minor changes. These changes do not impact the academic content of the article.

References

  • van der Zee-Neuen A, Putrik P, Ramiro S, et al. Impact of chronic diseases and multimorbidity on health and health care costs: the additional role of musculoskeletal disorders. Arthritis Care Res (Hoboken). 2016;68:1823–1831.
  • Huisstede BM, Bierma-Zeinstra SM, Koes BW, et al. Incidence and prevalence of upper-extremity musculoskeletal disorders. A systematic appraisal of the literature. BMC Musculoskelet Disord. 2006;7:7.
  • El Moumni M, Van Eck ME, Wendt KW, et al. Structural validity of the Dutch version of the patient-rated wrist evaluation (PRWE-NL) in patients with hand and wrist injuries. Phys Ther. 2016;96:908–916.
  • Prinsen CAC, Vohra S, Rose MR, et al. How to select outcome measurement instruments for outcomes included in a “Core Outcome Set” - a practical guideline. Trials. 2016;17(1):449.
  • Snyder CF, Aaronson NK, Choucair AK, et al. Implementing patient-reported outcomes assessment in clinical practice: a review of the options and considerations. Qual Life Res. 2012;21:1305–1314.
  • Hudak PL, Amadio PC, Bombardier C. Development of an upper extremity outcome measure: The DASH (Disabilities of the Arm, Shoulder, and Hand) (vol 29, pg 602, 1996). Am J Ind Med. 1996;30:372.
  • MacDermid JC, Turgeon T, Richards RS, et al. Patient rating of wrist pain and disability: a reliable and valid measurement tool. J Orthop Trauma. 1998;12:577–586.
  • Chung KC, Pillsbury MS, Walters MR, et al. Reliability and validity testing of the Michigan hand outcomes questionnaire. J Hand Surg Am. 1998;23:575–587.
  • Huang H, Grant JA, Miller BS, et al. A systematic review of the psychometric properties of patient-reported outcome instruments for use in patients with rotator cuff disease. Am J Sports Med. 2015;43:2572–2582.
  • Thoomes-de Graaf M, Scholten-Peeters GGM, Schellingerhout JM, et al. Evaluation of measurement properties of self-administered PROMs aimed at patients with non-specific shoulder pain and “activity limitations”: a systematic review. Qual Life Res. 2016;25:2141–2160.
  • Bot SD, Terwee CB, van der Windt DA, et al. Clinimetric evaluation of shoulder disability questionnaires: a systematic review of the literature. Ann Rheum Dis. 2004;63:335–341.
  • Roy JS, MacDermid JC, Woodhouse LJ. Measuring shoulder function: a systematic review of four questionnaires. Arthritis Rheum. 2009;61:623–632.
  • Cella D, Riley W, Stone A, et al. The Patient-Reported Outcomes Measurement Information System (PROMIS) developed and tested its first wave of adult self-reported health outcome item banks: 2005-2008. J Clin Epidemiol. 2010;63:1179–1194.
  • Cella D, Yount S, Rothrock N, et al. The Patient-Reported Outcomes Measurement Information System (PROMIS) Progress of an NIH roadmap cooperative group during its first two years. Med Care. 2007;45:S3–S11.
  • Rose M, Bjorner JB, Becker J, et al. Evaluation of a preliminary physical function item bank supported the expected advantages of the Patient-Reported Outcomes Measurement Information System (PROMIS). J Clin Epidemiol. 2008;61:17–33.
  • Rose M, Bjorner JB, Gandek B, et al. The PROMIS Physical Function item bank was calibrated to a standardized metric and shown to improve measurement efficiency. J Clin Epidemiol. 2014;67:516–526.
  • Terwee CB, Roorda LD, de Vet HC. Dutch-Flemish translation of 17 item banks from the Patient-Reported Outcomes Measurement Information System (PROMIS). Qual Life Res. 2014;23:1733–1741.
  • Oude Voshaar MAH, ten Klooster PM, Glas CAW, et al. Calibration of the PROMIS physical function item bank in Dutch patients with rheumatoid arthritis. PLoS One. 2014;9:e92367.
  • Crins MHP, Terwee CB, Klausch T, et al. The Dutch-Flemish PROMIS Physical Function item bank exhibited strong psychometric properties in patients with chronic pain. J Clin Epidemiol. 2017;87:47–58.
  • Crins MHP, van der Wees PJ, Klausch T, et al. Psychometric properties of the PROMIS Physical Function item bank in patients receiving physical therapy. PLoS One. 2018;13:e0192187.
  • Beckmann JT, Hung M, Voss MW, et al. Evaluation of the patient-reported outcomes measurement information system upper extremity computer adaptive test. J Hand Surg Am. 2016;41:739–744.e4.
  • Hung M, Voss MW, Bounsanga J, et al. Examination of the PROMIS upper extremity item bank. J Hand Ther. 2017;30:485–490.
  • Reeve BB, et al. Psychometric evaluation and calibration of health-related quality of life item banks: plans for the Patient-Reported Outcomes Measurement Information System (PROMIS). Med Care. 2007;45(5):S22–S31.
  • [The Helsinki Declaration of the World Medical Association (WMA). Ethical principles of medical research involving human subjects]. Pol Merkur Lekarski. 2014;36:298–301.
  • PROMIS. PROMIS Instrument Development and Validation Scientific Standards Version 2.0 (revised May 2013); 2013.
  • Van Son MAC, De Vries J, Roukema JA, et al. Health status and (health-related) quality of life during the recovery of distal radius fractures: a systematic review. Qual Life Res. 2013;22:2399–2416.
  • Changulani M, Okonkwo U, Keswani T, et al. Outcome evaluation measures for wrist and hand: which one to choose? Int Orthop. 2008;32:1–6.
  • Hoang-Kim A, Pegreffi F, Moroni A, et al. Measuring wrist and hand function: common scales and checklists. Injury. 2011;42:253–258.
  • Schoneveld K, Wittink H, Takken T. Clinimetric evaluation of measurement tools used in hand therapy to assess activity and participation. J Hand Ther. 2009;22:221; quiz 236.
  • Veehof MM, Sleegers EJA, van Veldhoven NHMJ, et al. Psychometric qualities of the Dutch language version of the Disabilities of the Arm, Shoulder, and Hand questionnaire (DASH-DLV). J Hand Ther. 2002;15:347–354.
  • van Eck ME, Lameijer CM, El Moumni M. Structural validity of the Dutch version of the disability of arm, shoulder and hand questionnaire (DASH-DLV) in adult patients with hand and wrist injuries. BMC Musculoskelet Disord. 2018;19:207.
  • Brink SM, Voskamp EG, Houpt P, et al. Psychometric properties of the Patient Rated wrist/hand evaluation - Dutch Language Version (PRWH/E-DLV). J Hand Surg Eur Vol. 2009;34:556–557.
  • Chung KC, Hamill JB, Walters MR, et al. The Michigan Hand Outcomes Questionnaire (MHQ): assessment of responsiveness to clinical change. Ann Plast Surg. 1999;42:619–622.
  • Chung BT, Morris SF. Confirmatory factor analysis of the Michigan Hand Questionnaire. Ann Plast Surg. 2015;74:176–181.
  • van der Giesen FJ, Nelissen RG, Arendzen JH, et al. Responsiveness of the Michigan Hand Outcomes Questionnaire-Dutch language version in patients with rheumatoid arthritis. Arch Phys Med Rehabil. 2008;89:1121–1126.
  • Chung BT, Morris SF. Reliability and internal validity of the Michigan hand questionnaire. Ann Plast Surg. 2014;73:385–389.
  • Maia MVDP, de Moraes VY, dos Santos JBG, et al. Minimal important difference after hand surgery: a prospective assessment for DASH, MHQ, and SF-12. Sicot-J. 2016;2:32.
  • van de Ven-Stevens LA, Munneke M, Terwee CB, et al. Clinimetric properties of instruments to assess activities in patients with hand injury: a systematic review of the literature. Arch Phys Med Rehabil. 2009;90:151–169.
  • Hogarty KY, Hines CV, Kromrey JD, et al. The quality of factor solutions in exploratory factor analysis: the influence of sample size, communality, and overdetermination. Educ Psychol Meas. 2005;65:202–226.
  • Comrey AL, Lee HB. First course in factor analysis. Hillsdale, NJ: Lawrence Erlbaum Associates, Inc; 1992.
  • Mokkink LB, Terwee CB, Patrick DL, et al. The COSMIN study reached international consensus on taxonomy, terminology, and definitions of measurement properties for health-related patient-reported outcomes. J Clin Epidemiol. 2010;63:737–745.
  • Satorra A, Bentler PM. A scaled difference chi-square test statistic for moment structure analysis. Psychometrika. 2001;66:507.
  • Hu LT, Bentler PM. Cutoff criteria for fit indexes in covariance structure analysis: conventional criteria versus new alternatives. Struct Equ Modeling. 1999;6:1–55.
  • Rodriguez A, Reise SP, Haviland MG. Applying bifactor statistical indices in the evaluation of psychological measures. J Pers Assess. 2016;98:223–237.
  • Reise SP, Scheines R, Widaman KF, et al. Multidimensionality and structural coefficient bias in structural equation modeling: a bifactor perspective. Educ Psychol Meas. 2013;73:5–26.
  • Revelle W. Psych: procedures for personality and psychological research; 2017. Software
  • Bentler PM. Comparative fit indexes in structural models. Psychol Bull. 1990;107:238–246.
  • Stevens J. Applied multivariate statistics for the social sciences. 2nd ed. Hillsdale, NJ: L. Erlbaum Associates; 1992.
  • Peterson RA. A meta-analysis of variance accounted for and factor loadings in exploratory factor analysis. Market Lett. 2000;11:261–275.
  • Cortina JM. What is coefficient alpha? An examination of theory and applications. J Appl Psychol. 1993;78:98–104.
  • Terwee CB, Bot SDM, de Boer MR, et al. Quality criteria were proposed for measurement properties of health status questionnaires. J Clin Epidemiol. 2007;60:34–42.
  • Prinsen CAC, Mokkink LB, Bouter LM, et al. COSMIN guideline for systematic reviews of patient-reported outcome measures. Qual Life Res. 2018;27:1147–1157.
  • Paz SH, Spritzer KL, Morales LS, et al. Evaluation of the Patient-Reported Outcomes Information System (PROMIS®) Spanish-language physical functioning items. Qual Life Res. 2013;22:1819–1830.
  • Kaat AJ, Rothrock NE, Vrahas MS, et al. Longitudinal validation of the PROMIS physical function item bank in upper extremity trauma. J Orthop Trauma. 2017;31:e321–e326.
  • Gausden EB, Levack AE, Sin DN, et al. Validating the Patient Reported Outcomes Measurement Information System (PROMIS) computerized adaptive tests for upper extremity fracture care. J Shoulder Elbow Surg. 2018;27:1191–1197.
  • Cook KF, Kallen MA, Amtmann D. Having a fit: impact of number of items and distribution of data on traditional criteria for assessing IRT’s unidimensionality assumption. Qual Life Res. 2009;18:447–460.
  • Hays RD, Spritzer KL, Amtmann D, et al. Upper-extremity and mobility subdomains from the Patient-Reported Outcomes Measurement Information System (PROMIS) adult physical functioning item bank. Arch Phys Med Rehabil. 2013;94:2291–2296.
  • Döring A-C, Nota SPFT, Hageman MGJS, et al. Measurement of upper extremity disability using the patient-reported outcomes measurement information system. J Hand Surg Am. 2014;39:1160–1165.
  • Beerekamp MSH, de Muinck Keizer RJO, Schep NWL, et al. Epidemiology of extremity fractures in the Netherlands. Injury. 2017;48:1355–1362.
  • Mokkink LB, de Vet HCW, Prinsen CAC, et al. COSMIN Risk of Bias checklist for systematic reviews of patient-reported outcome measures. Qual Life Res. 2018;27:1171–1179.