Original Article

Validation of Fatigue Impact Scale with various item sets – a Rasch analysis

Pages 840-846 | Received 23 Sep 2016, Accepted 28 Nov 2017, Published online: 12 Dec 2017

Abstract

Purpose: Fatigue is a symptom in patients with chronic gastrointestinal (GI) and liver diseases. Different instruments have been developed to assess the severity of fatigue and the 40-item Fatigue Impact Scale (FIS) is among the most widely used. Shorter versions of FIS include the 21-item Modified Fatigue Impact Scale (MFIS), and an eight-item version for everyday use. The study aimed to assess construct validity, reliability, and sufficiency of the raw score of the original FIS with 40 items, and examine the sufficiency of the 21 items from the Modified scale and the eight items of the Daily Fatigue Impact Scale (D-FIS), all of which are embedded in the 40-item scale.

Methods: Patients with chronic GI or liver disease (n = 354) completed the FIS with 40 items. The majority (57%) was under the age of 55 years and approximately half were females (48%). Various item sets of FIS were derived which showed fit to the Rasch model.

Results: Local dependency and multidimensionality in FIS and the 21-item Modified scale were resolved with a testlet solution but the D-FIS showed local dependency and multidimensionality and differential item functioning (DIF) still remained. Two new item sets fulfilling unidimensionality and no DIF are suggested, one with 15 items and a six-item scale for daily use. The transformation table shows score-interval scale estimates for all these item sets.

Conclusions: Both the FIS and the Modified scale can be used to measure fatigue albeit requiring some adjustment for DIF. The eight-item D-FIS is more problematic, and its summed score is not valid. Alternative 15- and 6-item versions presented in this paper can offer valid summed scores, and the transformation table allows transformation of raw scores and comparisons across all versions.

    Implications for rehabilitation

  • The Fatigue Impact Scale and the Modified Fatigue Impact Scale can be used to measure fatigue after adjustments for differential item functioning.

  • Alternative 15- and 6-item versions of Fatigue Impact Scale offer valid summed scores. The summed score for the Daily Fatigue Impact Scale is not valid.

  • A transformation table with raw scores and Rasch transformed interval scale metric makes it possible to compare scores derived from the Fatigue Impact Scale, the Modified Fatigue Impact Scale and the proposed 15- and 6-item versions of Fatigue Impact Scale for research and/or clinical use.

Introduction

Fatigue is a common symptom in patients with long-term conditions, including chronic gastrointestinal (GI) [Citation1] and liver diseases [Citation2]. The understanding of fatigue in these disorders is incomplete, but many patients consider fatigue to be one of their most debilitating symptoms [Citation3–7]. Within gastroenterology, it is well known that fatigue is prominent in patients with irritable bowel syndrome (IBS) [Citation8,Citation9], inflammatory bowel disease (IBD) [Citation10] and celiac disease [Citation11], and within hepatology, in primary biliary cirrhosis [Citation4,Citation5], nonalcoholic fatty liver disease [Citation12,Citation13], and hepatitis C [Citation14].

Different instruments have been developed to assess the severity of fatigue and its impact on daily life [Citation15,Citation16], and in gastroenterology and hepatology, the Fatigue Impact Scale (FIS) [Citation17] is among the most widely used. It has several versions including the original 40-item scale, a 21-item Modified scale (MFIS), and a short eight-item version for everyday use (D-FIS) [Citation18].

Many studies have looked at the reliability and validity of the different versions of the FIS from a classical psychometric perspective, across a variety of diagnostic groups [Citation19–23]. Modern test theory with Rasch analysis is now widely applied as the quality standard in health sciences in order to get more detailed information about dimensionality of a scale, scale validity and reliability, and sufficiency of using a summed score [Citation24–27]. With Rasch analysis, it is also possible to transform ordinal scale scores into an interval metric [Citation28]. To date, few studies have examined the various scale properties from a modern psychometric perspective, and in a limited number of diagnostic groups [Citation29].

The current study aimed to assess construct validity, reliability, and sufficiency of the raw score of the original FIS with 40 items, and likewise for the 21 items of the MFIS and the eight items of the D-FIS, all of which are embedded in the 40-item scale, using a modern psychometric approach.

Methods

Participants and setting

Data were obtained from two study cohorts from the Gastroenterology and Hepatology Research Unit at Sahlgrenska University Hospital, Gothenburg, Sweden: (1) a cohort of patients fulfilling the Rome III criteria for IBS [Citation30], who participated in a study between 2011 and 2013 assessing the pathophysiology of IBS, including investigations of GI motility, sensitivity and low-grade inflammation of the GI tract [Citation31]; (2) a cohort of patients included between 2010 and 2013 with non-advanced chronic liver disease (Child-Pugh A), equally divided between five diagnoses: primary biliary cirrhosis, primary sclerosing cholangitis, autoimmune hepatitis, chronic hepatitis B and C. These subjects were included in a study to assess mechanisms behind fatigue in chronic liver disease [Citation32]. Subjects in both cohorts completed paper questionnaires during a visit to the research unit made specifically for participation in the studies. Characteristics of study participants are shown in Table 1.

Table 1. Characteristics of study population, n = 354.

Questionnaires

Fatigue Impact Scale

The FIS is a 40-item instrument assessing functional limitations attributed to fatigue in three domains of daily life: physical functioning (10 items), cognitive functioning (10 items), and psychosocial functioning (20 items) [Citation17]. Table 2 shows these domains and short item names. Complete item wordings can be found in Fisk et al. [Citation17]. Each item is rated on a five-step Likert scale, from 0 = no problems to 4 = extreme problems during the previous month. Ratings are summed to a total score ranging from 0 to 160, and domain scores range from 0 to 40 for physical and cognitive functioning and from 0 to 80 for psychosocial functioning. Higher scores indicate greater limitations in functioning. The Swedish version of the FIS has been psychometrically evaluated in patients with multiple sclerosis and in the general population [Citation33].
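The scoring rule described above can be sketched in a few lines. This is a minimal illustration only: the function name `score_fis` and the domain layout in `EXAMPLE_DOMAINS` are assumptions made for the example (the actual assignment of items to domains is given in Fisk et al. [Citation17]), and only the item counts and the 0 to 4 rating range come from the text.

```python
from typing import Dict, List, Sequence


def score_fis(responses: Sequence[int], domains: Dict[str, List[int]]) -> Dict[str, int]:
    """Sum 0-4 item ratings into per-domain scores and a total score.

    `responses` holds one rating per item in item order; `domains` maps a
    domain name to the 1-based item numbers that belong to it.
    """
    if any(r not in (0, 1, 2, 3, 4) for r in responses):
        raise ValueError("each rating must be an integer from 0 to 4")
    scores = {
        name: sum(responses[i - 1] for i in items)
        for name, items in domains.items()
    }
    scores["total"] = sum(responses)
    return scores


# Hypothetical layout (NOT the published item order): it only reproduces the
# item counts per domain, 10 physical, 10 cognitive and 20 psychosocial.
EXAMPLE_DOMAINS = {
    "physical": list(range(1, 11)),
    "cognitive": list(range(11, 21)),
    "psychosocial": list(range(21, 41)),
}

# A respondent rating every item as 4 reaches the maximum of each range.
example = score_fis([4] * 40, EXAMPLE_DOMAINS)
```

With every item rated 4, the sketch returns the upper end of each score range stated above (40, 40, 80 and 160).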

Table 2. Comparative item sets of FIS versions based upon the FIS item number.

Modified Fatigue Impact Scale (MFIS)

The MFIS was created during the development of the Multiple Sclerosis Quality of Life Inventory (MSQLI) [Citation34] by shortening the 40-item FIS. The items of the MFIS were selected by the National Multiple Sclerosis Society for use as a part of the assessment of health-related quality of life in patients with MS [Citation35]. This selection resulted in 21 of the original items (physical functioning (nine items), cognitive functioning (10 items), and psychosocial functioning (two items)) [Citation29,Citation36,Citation37] (Table 2). Rating scales and scoring procedures are the same as for the FIS. For unclear reasons, the response format on the five-step Likert scale was changed in the MFIS version from the MS Society from rating the intensity of the problems (no problem to extreme problem) to rating their frequency (never to almost always). In our study, the response format was identical to that of the original 40-item FIS.

Daily Fatigue Impact Scale (D-FIS)

The D-FIS is an eight-item version of the FIS designed for daily use in clinical practice. Items were selected from the original pool of FIS items by means of Rasch analysis. It comprises three items from the physical domain, four items from the cognitive domain, and one item from the psychosocial domain [Citation18] (Table 2). Rating scales and scoring procedures are the same as for the FIS; however, only a single score is computed.

Rasch analysis

Rasch analysis was used to test the requirements for fundamental measurement [Citation38,Citation39], including stochastic ordering of items (a probabilistic form of Guttman scaling), local independence of items (conditional upon the trait), unidimensionality [Citation40], ordering of thresholds in polytomous response options [Citation41], and scale invariance across groups (differential item functioning, DIF), which was tested for age, gender, and diagnosis [Citation42]. Full details of the process are given elsewhere [Citation43,Citation44]. Briefly, an item was considered to misfit if its χ2 probability was below the Bonferroni-adjusted significance level. Residual fit values were considered acceptable within the range ±2.5. The local independence assumption for items was tested by the magnitude of residual correlations, whereby a value of 0.2 above the average residual correlation was considered a breach of the assumption [Citation45]. Unidimensionality was tested post hoc by the procedure introduced by Smith, where the lower confidence limit for the proportion of significantly different person-specific estimates, derived by contrasting two independent sets of items, should be below 5% [Citation46].
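The residual correlation criterion for local dependency can be illustrated with a short sketch. It assumes the residual correlations have already been exported from the Rasch software as a mapping from item pairs to values; the function name `flag_local_dependence`, the item labels and the numbers are hypothetical and are not study data. Treating "0.2 above the average" as a strict inequality is also an assumption.

```python
from typing import Dict, List, Tuple

Pair = Tuple[str, str]


def flag_local_dependence(residual_corr: Dict[Pair, float], margin: float = 0.2) -> List[Pair]:
    """Return item pairs whose residual correlation exceeds the average
    residual correlation by more than `margin`."""
    average = sum(residual_corr.values()) / len(residual_corr)
    cutoff = average + margin
    return sorted(pair for pair, r in residual_corr.items() if r > cutoff)


# Hypothetical residual correlations for four items (illustration only).
toy_residuals = {
    ("item_a", "item_b"): 0.35,
    ("item_a", "item_c"): -0.10,
    ("item_a", "item_d"): -0.15,
    ("item_b", "item_c"): -0.05,
    ("item_b", "item_d"): -0.20,
    ("item_c", "item_d"): -0.15,
}

# The average is -0.05, so the cutoff is 0.15 and only one pair is flagged.
flagged = flag_local_dependence(toy_residuals)
```

Using a cutoff relative to the average, rather than a fixed value, reflects the fact that residual correlations in a Rasch analysis are negative on average.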

Finally, a strategy to deal with clusters of locally dependent items was to group them into testlets, thereby absorbing the impact of local dependency, and re-analyze on this basis [Citation47]. Testlets are thus, in practice, just “super items”, where a set of items are added together to make a new polytomous item with the score range of the summated constituent items. Thus, testlets do not affect the total score range of the scale.
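As a minimal sketch of the "super item" idea, the snippet below sums the ratings within each testlet. The function name `build_testlets` and the six-item grouping are hypothetical and do not reproduce the FIS domain structure.

```python
from typing import Dict, List, Sequence


def build_testlets(responses: Sequence[int], testlets: Dict[str, List[int]]) -> Dict[str, int]:
    """Collapse item ratings into testlet scores by summing the ratings of
    each testlet's constituent items (1-based item numbers)."""
    return {
        name: sum(responses[i - 1] for i in items)
        for name, items in testlets.items()
    }


# Hypothetical six-item scale, each item rated 0-4, grouped into two testlets.
ratings = [0, 1, 2, 3, 4, 2]
grouping = {"testlet_1": [1, 2, 3], "testlet_2": [4, 5, 6]}
testlet_scores = build_testlets(ratings, grouping)

# Each testlet here is a polytomous item scored 0-12, and the total score
# across testlets equals the total score across the original items.
total_from_testlets = sum(testlet_scores.values())
```

Because every item belongs to exactly one testlet in this example, the total score is the same whether it is computed from items or from testlets.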

Information from these analyses was used to identify alternative item sets that fulfill the requirements of the Rasch model. Reliability was tested with Cronbach’s alpha. RUMM 2030 (http://www.rummlab.com.au/) was used for Rasch analysis. Descriptive statistics were computed with SPSS (Statistical Package for the Social Sciences, version 22; SPSS Inc., Chicago, IL).
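For readers unfamiliar with the reliability index, Cronbach's alpha can be computed from a persons-by-items matrix with the standard formula, alpha = k/(k − 1) × (1 − sum of item variances / variance of the total score). The sketch below uses only the Python standard library; the function name `cronbach_alpha` and the toy ratings are assumptions for illustration and do not reproduce how the statistic was obtained in this study.

```python
from statistics import variance
from typing import Sequence


def cronbach_alpha(data: Sequence[Sequence[float]]) -> float:
    """Cronbach's alpha for a persons-by-items matrix of ratings
    (rows are persons, columns are items)."""
    n_items = len(data[0])
    if n_items < 2 or len(data) < 2:
        raise ValueError("need at least two persons and two items")
    # Sample variance of each item across persons.
    item_variances = [variance([row[j] for row in data]) for j in range(n_items)]
    # Sample variance of the summed score across persons.
    total_variance = variance([sum(row) for row in data])
    return n_items / (n_items - 1) * (1 - sum(item_variances) / total_variance)


# Hypothetical ratings from five persons on three items (illustration only).
toy_ratings = [
    [0, 1, 1],
    [1, 2, 2],
    [2, 2, 3],
    [3, 4, 4],
    [4, 4, 4],
]
alpha = cronbach_alpha(toy_ratings)
```

In this toy matrix the items rise and fall together, so alpha comes out high; as the Discussion notes, locally dependent items can inflate the index in the same way.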

Ethical approval

Ethical approval for data collection was obtained from the Regional Ethical Review Board in Gothenburg. All participants received verbal and written information, and signed an informed consent form before any study-related procedure took place.

Results

A total of 354 patients (132 with IBS; 222 with liver disease) completed the FIS 40-item questionnaire, from which data for MFIS and D-FIS items were derived. Gender, age, and diagnosis of the patient groups are shown in Table 1. Almost three-fifths (57.4%) were under the age of 55, and approximately half of them were female.

Initially, the original FIS data did not fit the Rasch model: extensive local dependency in the item set was observed, and some DIF by age and diagnosis was also found. The scale showed multidimensionality, and misfit across all indicators of fit (Table 3: FIS). However, the scoring categories of all individual items were ordered correctly, and reliability (Cronbach’s alpha) was high, consistent with individual (high stakes) use. The patterns of local dependency mapped to the underlying conceptual structure of the physical, cognitive and psychosocial domains. Consequently, the items were grouped into domain-based testlets and the analysis was re-run. This showed improvement in the indicators of fit with a marginal fall in the level of reliability (Table 3: FIS-3). This solution yielded a unidimensional scale, but some DIF by age and diagnosis remained. The domains were split by age to accommodate the DIF, and this resulted in a further improvement in fit (Table 3: FIS-3 (DIF)). DIF by diagnosis disappeared after this adjustment, suggesting an age-diagnosis interaction. However, the Cognitive and Physical domains were found to be marginally biased in opposite directions for both contextual factors, with low overall differences in mean estimates (Cognitive 0.13 logits; Physical 0.41 logits). After splitting the Physical domain for age, the artificial DIF for age in the Cognitive domain disappeared [Citation48]. Person estimates from the testlet analysis and the analysis with the Physical testlet split for age were compared using the unbiased Cognitive and Psychosocial testlets as anchors. A paired t-test between the person estimates was non-significant (p = 0.62), indicating that the effect of DIF canceled out at the test level.

Table 3. Fatigue Impact Scale (FIS): summary of Rasch analysis.

The 21 items of the MFIS, embedded within the FIS, showed acceptable fit and reliability, but high levels of local dependency and significant levels of multidimensionality were evident, together with some DIF (Table 3: MFIS). The same domain-based testlet solution as above was implemented, with similar results: local dependency disappeared and a strictly unidimensional scale was evident (Table 3: MFIS-3); nevertheless, some DIF remained. The D-FIS had adequate summary fit to the Rasch model; however, significant multidimensionality was observed and half of the items showed local dependency (Table 3: D-FIS). DIF by age and diagnosis was also found. This could not be resolved by a testlet solution.

Given the observed redundancy within the item set implied by the local dependency, and the presence of DIF, an attempt was made to identify an alternative version of the scale which satisfied the full expectations of the Rasch model, including unidimensionality and the absence of DIF. Items were eliminated iteratively; for example, where pairs of locally dependent items were observed, the item with the worst fit and/or the greatest DIF was removed. Subsequently, a 15-item scale did satisfy such expectations (Table 3: FIS15), and a six-item scale was also derived for daily clinical use (Table 3: FIS6). Figure 1 shows the graphical representation of the raw score ranges of each scale on the common metric, where the floor effect clearly shows for the smaller scales. Note that the percentage at the floor of each scale increases as the number of items decreases by 6% from the D-FIS to the FIS6. However, irrespective of scale, the majority of those at the floor were those with liver disease. One item (item 1, I feel less alert) had extensive DIF by age and diagnosis, and could not be used, despite its position reflecting the mildest level of fatigue. Figure 2 shows the person-item distribution of the FIS15, where 80% of those at the floor of the scale are those with liver disease. Table 2 shows the item numbers and short names for the various versions of the scales, based upon the ordering of the 40 items in the FIS.

Figure 1. Operational ranges of raw scores of various FIS versions.

Figure 2. Person-item threshold distribution for FIS15.

The results allow for a transformation table (Supplementary Table S1) showing, for each ordinal raw score, its interval scale Rasch estimate of fatigue, denoted _RM, for the FIS, MFIS, FIS15, and FIS6. These estimates have the same operational range as the raw score. In addition, each scale has another transformation to the same metric as the FIS (40 items), denoted _40M. This allows the basic _RM transformation of the FIS to be used as a means of comparing different versions. For example, a raw score of 2 on the FIS6, giving a FIS6_40M value of 51.6, can be seen to be equivalent to a raw score of 18 on the FIS (where its FIS_RM value is also 51.6). Where there is no exact match, the nearest _40M equivalent score should be taken. For example, a raw score of 11 on the MFIS (with a value of 52.4 on the MFIS_40M) would be equivalent to a score of 19 on the FIS (which has a value of 52.5); see Supplementary Table S1.
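The nearest-equivalent rule can be sketched as a simple lookup. The function name `nearest_fis_raw_score` is hypothetical, and the dictionary below contains only the two FIS_RM values quoted in the text; a real application would need the complete FIS_RM column from Supplementary Table S1.

```python
from typing import Dict


def nearest_fis_raw_score(value_40m: float, fis_rm: Dict[int, float]) -> int:
    """Return the FIS raw score whose FIS_RM estimate lies closest to a
    _40M value obtained from another version of the scale."""
    return min(fis_rm, key=lambda raw: abs(fis_rm[raw] - value_40m))


# Excerpt only: FIS raw score -> FIS_RM value, as quoted in the text above.
FIS_RM_EXCERPT = {18: 51.6, 19: 52.5}

# FIS6 raw score 2 has a FIS6_40M value of 51.6.
fis6_equivalent = nearest_fis_raw_score(51.6, FIS_RM_EXCERPT)
# MFIS raw score 11 has an MFIS_40M value of 52.4.
mfis_equivalent = nearest_fis_raw_score(52.4, FIS_RM_EXCERPT)
```

With these inputs the lookup reproduces the two worked examples in the paragraph above: FIS raw scores of 18 and 19, respectively.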

Discussion

Fatigue is one of the most disabling symptoms in patients with GI and liver diseases. In these diagnostic groups, valid and reliable yet short, easy-to-use instruments for assessing fatigue are useful [Citation1–5,Citation8–12,Citation14]. Using Rasch analysis, this study examined construct validity, reliability (Cronbach’s alpha), and sufficiency of the summed score for existing versions of the FIS with 40, 21 or eight items. New item sets with 15 and 6 items are suggested for use in clinical research and in clinical settings. The derived MFIS had the same response options as the FIS.

The FIS was heavily locally dependent, with only 11 out of 40 items not residually correlated with at least one other. That is, after the fatigue score was conditioned out, significant correlations between items still remained, a breach of the local independence assumption required for summative scales. Several items displayed residual correlations with 5–7 other items. These residual correlations were observed to reflect the underlying conceptual structure of the Cognitive, Physical and Psychosocial domains, suggesting an association between the apparent high level of multidimensionality and the local response dependence. While local dependency can be accommodated by the testlet approach, it does raise questions about the efficiency of the item set and the potential redundancy therein [Citation17]. Presumably, this was one of the reasons why the MFIS [Citation36] was developed. Unfortunately, the MFIS was also found to be heavily affected by local dependency and resultant multidimensionality, but again this could be resolved by the testlet approach. As such, both of these original scales have considerable redundancy within their respective item sets, and at the item level this would spuriously inflate their reliability. The small-magnitude DIF that was detected was associated with an interaction of age and diagnosis; once age was accounted for, the diagnosis-related DIF disappeared, and it should therefore not have a negative impact when the whole scale is used. The DIF for age disappeared at the test level.

The D-FIS was more problematic, showing local response dependency (half of items) and multidimensionality, suggesting that this may not be the most appropriate set of items for a short form of this nature. Thus two further scales (FIS15 and FIS6) were resolved in which the simple summed scale met all the requirements of the Rasch model, including unidimensionality and invariance by age, gender, and diagnosis.

Given that almost all items had ordered thresholds, there was no need for rescoring of items, and thus a straightforward transformation table could be made from the familiar total raw scores of the FIS and MFIS, together with the alternative FIS15 and the FIS6, to both Rasch metric (_RM) with the same range as the raw score, and a common metric based upon the FIS (_40M). After transformation of raw scores to Rasch interval level metric parametric statistical analysis can be used [Citation49,Citation50]. Under these circumstances, the choice of scale version should be led by clinical/epidemiological requirements. In the current cohorts, even the FIS showed an 11% floor effect, whereas it rose to 17% in the D-FIS and 20% for the FIS6. In a clinical sample which experiences considerable fatigue, this may not be an issue as most patients will be well above the floor of these scales. There is also the worrying aspect that the item which represents the entry point into fatigue for the FIS and MFIS (item 1, I feel less alert), and so reduces the floor effect, was found to have considerable bias. Furthermore, score steps at the extremes of the Rasch metric are larger than at the center, indicating that differences in raw scores must be interpreted with caution [Citation51]. This has also been demonstrated in other fatigue scales [Citation52].

The fact that previous psychometric evaluations of the scales have almost always indicated lack of fit, either to the Rasch model [Citation29] or in classical factor analytic approaches (as correlated errors), reflects a lack of attention to the issue of local dependency, a shortfall that has been observed with other scales [Citation53,Citation54]. By producing misfit and multidimensionality under the Rasch model, or bloated specific factors in classical approaches [Citation55], local dependency has a corrosive effect upon scale interpretation.

The presence of DIF made it necessary to split the items of the FIS and MFIS to obtain the best solution, suggesting that scale interpretation may be subject to bias under normal circumstances. The FIS15 is free of such bias and, because Supplementary Table S1 provides a transformation from it to the other scales, it may be the better choice while still allowing comparability across scale versions. Unfortunately, one item (item 1, I feel less alert), which marked the lower end of the scale (so reducing the floor effect), had extensive DIF by age and diagnosis, and could not be used for the versions which satisfied the Rasch model expectations.

There are several limitations to the study. Only the FIS was administered, rather than each of the other variations. Thus, the interpretation of the construct validity of the other versions is based upon their respective item sets drawn from the FIS. The derived MFIS therefore had the same response options as the FIS, and our data may not be adequate for the commonly used version of the MFIS. The D-FIS was not found to give a valid score, and we chose to exclude D-FIS scores from the transformation table. DIF could only be tested for age, gender, and diagnosis, but not for different duration or severity of illness. Also, DIF could exist between different cultural groups, but this was not tested in this study. For the purposes of this study, fatigue was assumed to be measurable and comparable between different groups of patients. The IBS study was performed in patients seeking health care in a secondary/tertiary care clinic, while the majority of IBS patients are managed in primary care, which limits the generalizability to this population. Likewise, the liver cohort consisted of patients with mild liver disease and no patients with severe liver disease were included, so generalizability of the findings to patients with severe liver disease is not possible. Furthermore, as we were dependent upon the available clinical information, we did not use data on disease duration, nor did we use any information about the non-responders.

Given these limitations, and for the frame of reference of these two conditions, having taken care of much of the disturbance to the Rasch model assumptions through a testlet design, all forms of the FIS, other than the original D-FIS, appear adequate, although the newly introduced FIS15 and FIS6 should avoid any problems with bias by age which may affect estimates of the original scales when item sets are not complete.

Conclusions

FIS and MFIS are widely used fatigue scales and despite some shortcomings, these scales can be used to measure fatigue in patients with GI and liver diseases. However, in this paper, we present and recommend alternative 15- and 6-item versions of FIS which can offer valid summed scores with a restricted number of items compared to the 40-item FIS. FIS15 is shorter than FIS with a slightly larger floor effect. FIS6 has a valid sum score and is suitable for daily use. The D-FIS is not valid and should not be used. The transformation table allows raw ordinal scores to be converted to interval equivalent scores for use in parametric statistical analyses, and for comparisons across versions.

Supplemental material

Anna_Dencker_et_al_supplemental_content.zip

Disclosure statement

The authors declare no conflicts of interest.

References

  • Simren M, Svedlund J, Posserud I, et al. Predictors of subjective fatigue in chronic gastrointestinal disease. Aliment Pharmacol Ther. 2008;28:638–647.
  • Jones EA. Fatigue complicating chronic liver disease. Metab Brain Dis. 2004;19:421–429.
  • Maxton DG, Morris JA, Whorwell PJ. Ranking of symptoms by patients with the irritable bowel syndrome. BMJ. 1989;299:1138.
  • Cauch-Dudek K, Abbey S, Stewart DE, et al. Fatigue in primary biliary cirrhosis. Gut. 1998;43:705–710.
  • Huet PM, Deslauriers J, Tran A, et al. Impact of fatigue on the quality of life of patients with primary biliary cirrhosis. Am J Gastroenterol. 2000;95:760–767.
  • Frandemark A, Jakobsson Ung E, Tornblom H, et al. Fatigue: a distressing symptom for patients with irritable bowel syndrome. Neurogastroenterol Motil. 2017;29:e12898.
  • Kalaitzakis E, Josefsson A, Castedal M, et al. Factors related to fatigue in patients with cirrhosis before and after liver transplantation. Clin Gastroenterol Hepatol. 2012;10:174–181, 181.e1.
  • Simren M, Abrahamsson H, Svedlund J, et al. Quality of life in patients with irritable bowel syndrome seen in referral centers versus primary care: the impact of gender and predominant bowel pattern. Scand J Gastroenterol. 2001;36:545–552.
  • Piche T, Huet PM, Gelsi E, et al. Fatigue in irritable bowel syndrome: characterization and putative role of leptin. Eur J Gastroenterol Hepatol. 2007;19:237–243.
  • Czuber-Dochan W, Ream E, Norton C. Review article: description and management of fatigue in inflammatory bowel disease. Aliment Pharmacol Ther. 2013;37:505–516.
  • Siniscalchi M, Iovino P, Tortora R, et al. Fatigue in adult coeliac disease. Aliment Pharmacol Ther. 2005;22:489–494.
  • Newton JL, Pairman J, Wilton K, et al. Fatigue and autonomic dysfunction in non-alcoholic fatty liver disease. Clin Auton Res. 2009;19:319–326.
  • Price JK, Srivastava R, Bai C, et al. Comparison of activity level among patients with chronic liver disease. Disabil Rehabil. 2013;35:907–912.
  • Kallman J, O'Neil MM, Larive B, et al. Fatigue and health-related quality of life (HRQL) in chronic hepatitis C virus infection. Dig Dis Sci. 2007;52:2531–2539.
  • Smets EM, Garssen B, Bonke B, et al. The Multidimensional Fatigue Inventory (MFI) psychometric qualities of an instrument to assess fatigue. J Psychosom Res. 1995;39:315–325.
  • Piper BF, Dibble SL, Dodd MJ, et al. The revised Piper Fatigue Scale: psychometric evaluation in women with breast cancer. Oncol Nurs Forum. 1998;25:677–684.
  • Fisk JD, Ritvo PG, Ross L, et al. Measuring the functional impact of fatigue: initial validation of the fatigue impact scale. Clin Infect Dis. 1994;18:S79–S83.
  • Fisk JD, Doble SE. Construction and validation of a fatigue impact scale for daily administration (D-FIS). Qual Life Res. 2002;11:263–272.
  • Amtmann D, Bamer AM, Noonan V, et al. Comparison of the psychometric properties of two fatigue scales in multiple sclerosis. Rehabil Psychol. 2012;57:159–166.
  • Learmonth YC, Dlugonski D, Pilutti LA, et al. Psychometric properties of the fatigue severity scale and the modified fatigue impact scale. J Neurol Sci. 2013;331:102–107.
  • Miranda-Pettersen K, Morais-de-Jesus M, Daltro-Oliveira R, et al. The fatigue impact scale for daily use in patients with hepatitis B virus and hepatitis C virus chronic infections. Ann Hepatol. 2015;14:310–316.
  • Schiehser DM, Ayers CR, Liu L, et al. Validation of the modified fatigue impact scale in Parkinson's disease. Parkinsonism Relat Disord. 2013;19:335–338.
  • Schiehser DM, Delano-Wood L, Jak AJ, et al. Validation of the modified fatigue impact scale in mild to moderate traumatic brain injury. J Head Trauma Rehabil. 2015;30:116–121.
  • Rasch G. Probabilistic models for some intelligence and attainment tests. Denmark: Danish Institute for Educational Research; 1960.
  • Perline R, Wright BD, Wainer H. The Rasch model as additive conjoint measurement. Appl Psychol Meas. 1979;3:237–255.
  • Tennant A, McKenna SP, Hagell P. Application of Rasch analysis in the development and application of quality of life instruments. Value Health. 2004;7:S22–S26.
  • Leung YY, Png ME, Conaghan P, et al. A systematic literature review on the application of Rasch analysis in musculoskeletal disease – a special interest group report of OMERACT 11. J Rheumatol. 2014;41:159–164.
  • Wright BD, Linacre JM. Observations are always ordinal; measurements, however, must be interval. Arch Phys Med Rehabil. 1989;70:857–860.
  • Mills RJ, Young CA, Pallant JF, et al. Rasch analysis of the modified fatigue impact scale (MFIS) in multiple sclerosis. J Neurol Neurosurg Psychiatry. 2010;81:1049–1051.
  • Longstreth GF, Thompson WG, Chey WD, et al. Functional bowel disorders. Gastroenterology. 2006;130:1480–1491.
  • Le Neve B, Brazeilles R, Derrien M, et al. Lactulose challenge determines visceral sensitivity and severity of symptoms in patients with irritable bowel syndrome. Clin Gastroenterol Hepatol. 2016;14:226–233.e1–3.
  • Ekerfors U, Sunnerhagen KS, Westin J, et al. Muscle performance and fatigue in non-advanced chronic liver disease. Gastroenterology. 2015;148:S991.
  • Flensner G, Ek AC, Soderhamn O. Reliability and validity of the Swedish version of the Fatigue Impact Scale (FIS). Scand J Occup Ther. 2005;12:170–180.
  • Ritvo PG, Fischer JS, Miller DM, et al. MSQLI. Multiple sclerosis quality of life inventory: a user's manual. New York (NY): The Consortium of Multiple Sclerosis Centers Health Services Research Subcommittee; 1997.
  • Fischer JS, LaRocca NG, Miller DM, et al. Recent developments in the assessment of quality of life in multiple sclerosis (MS). Mult Scler. 1999;5:251–259.
  • Flachenecker P, Kümpfel T, Kallmann B, et al. Fatigue in multiple sclerosis: a comparison of different rating scales and correlation to clinical parameters. Mult Scler. 2002;8:523–526.
  • Tellez N, Rio J, Tintore M, et al. Does the modified fatigue impact scale offer a more comprehensive assessment of fatigue in MS? Mult Scler. 2005;11:198–202.
  • Karabatsos G. The Rasch model, additive conjoint measurement, and new models of probabilistic measurement theory. J Appl Meas. 2001;2:389–423.
  • Newby VA, Conner GR, Grant CP, et al. The Rasch model and additive conjoint measurement. J Appl Meas. 2009;10:348–354.
  • Gustafsson JE. Testing and obtaining fit of data to the Rasch model. Br J Math Stat Psychol. 1980;33:205–233.
  • Andrich D. An expanded derivation of the threshold structure of the polytomous Rasch model that dispels any “Threshold Disorder Controversy”. Educ Psychol Meas. 2013;73:78–124.
  • Teresi JA, Kleinman M, Ocepek-Welikson K. Modern psychometric methods for detection of differential item functioning: application to cognitive assessment measures. Stat Med. 2000;19:1651–1683.
  • Pallant JF, Tennant A. An introduction to the Rasch measurement model: an example using the Hospital Anxiety and Depression Scale (HADS). Br J Clin Psychol/Br Psychol Soc. 2007;46(Pt 1):1–18.
  • da Rocha NS, Chachamovich E, Fleck MPD, et al. An introduction to Rasch analysis for Psychiatric practice and research. J Psychiatr Res. 2013;47:141–148.
  • Christensen KB, Makransky G, Horton M. Critical values for Yen’s Q3: identification of local dependence in the Rasch model using residual correlations. Appl Psychol Meas. 2017;41:178–194.
  • Smith EV Jr. Detecting and evaluating the impact of multidimensionality using item fit statistics and principal component analysis of residuals. J Appl Meas. 2002;3:205–231.
  • Wainer H, Kiely GL. Item clusters and computerized adaptive testing – a case for testlets. J Educ Meas. 1987;24:185–201.
  • Andrich D, Hagquist C. Real and artificial differential item functioning. J Educ Behav Stat. 2012;37:387–416.
  • Christensen KB, Kreiner S. Monte Carlo tests of the Rasch model based on scalability coefficients. Br J Math Stat Psychol. 2010;63:101–111.
  • Johansson S, Kottorp A, Lee KA, et al. Can the Fatigue Severity Scale 7-item version be used across different patient populations as a generic fatigue measure—a comparative study using a Rasch model approach. Health Qual Life Outcomes. 2014;12:24.
  • Tennant A. Goal attainment scaling: current methodological challenges. Disabil Rehabil. 2007;29:1583–1588.
  • Dencker A, Sunnerhagen KS, Taft C, et al. Multidimensional fatigue inventory and post-polio syndrome – a Rasch analysis. Health Qual Life Outcomes. 2015;13:20.
  • Lundgren NA, Tennant A. Past and present issues in Rasch analysis: the functional independence measure (FIM) revisited. J Rehabil Med. 2011;43:884–891.
  • Küçükdeveci AA, Kutlay Ş, Yıldızlar D, et al. The reliability and validity of the world health organization disability assessment schedule (WHODAS-II) in stroke. Disabil Rehabil. 2013;35:214–220.
  • Cattell RB. Matched determiners vs. factor invariance: a reply to Korth. Multivariate Behav Res. 1978;13:431–448.