Research Article

Improving the measurement properties of the Amyotrophic Lateral Sclerosis Functional Rating Scale-Revised (ALSFRS-R): deriving a valid measurement total for the calculation of change

Pages 400-409 | Received 29 Aug 2023, Accepted 19 Feb 2024, Published online: 01 Mar 2024

Abstract

Background

The Amyotrophic Lateral Sclerosis Functional Rating Scale-Revised (ALSFRS-R) total score is a widely used measure of functional status in Amyotrophic Lateral Sclerosis/Motor Neuron Disease (ALS), but recent evidence has raised doubts about its validity. The objective was to examine the measurement properties of the ALSFRS-R, aiming to produce valid measurement from all 12 scale items.

Method

Longitudinal ALSFRS-R data were collected between 2013 and 2020 from 1120 people with ALS recruited from 35 centers, together with other scales in the Trajectories of Outcomes in Neurological Conditions-ALS (TONiC-ALS) study. The ALSFRS-R was analyzed by confirmatory factor analysis (CFA), Rasch Analysis (RA) and Mokken scaling.

Results

No definite factor structure of the ALSFRS-R was confirmed by CFA. RA revealed the raw score total to be invalid even at the ordinal level because of multidimensionality; valid interval level subscale measures could be found for the Bulbar, Fine-Motor and Gross-Motor domains but the Respiratory domain was only valid at an ordinal level. All four domains resolved into a single valid, interval level measure by using a bifactor RA. The smallest detectable difference was 10.4% of the range of the interval scale.

Conclusion

A total ALSFRS-R ordinal raw score can lead to inferential bias in clinical trial results due to its non-linear nature. On the interval level transformation, more than 5 points difference is required before a statistically significant detectable difference can be observed. Transformation to interval level data should be mandatory in clinical trials.

Introduction

Amyotrophic lateral sclerosis (ALS), also known as Motor Neuron Disease (MND), is an incurable, neurodegenerative condition where inexorable progression leads to severe disability. In a meta-analysis of 115 studies involving 55,169 ALS patients over 24 years, change in functional status was found to predict survival (Citation1). It follows that the measurement of change in functioning is crucial not only to follow the progression of the disease, but also to inform clinical management. Furthermore, many trials employ change in functioning as a study endpoint.

In a review of 125 clinical trials, the revised Amyotrophic Lateral Sclerosis Functional Rating Scale (ALSFRS-R) was used in 47 of 51 studies employing a functional rating scale as the primary outcome measure (Citation2). The ALSFRS-R was designed to overcome the weakness of the original, which granted disproportionate weighting to limb and bulbar, as compared to respiratory, dysfunction; to correct this, the revision added items for dyspnea, orthopnea, and the need for ventilatory support (Citation3). The ALSFRS-R consists of 12 items, each with five levels of severity. The most disabled level is assigned a score of 0 and the least disabled a score of 4, so the total score ranges from 0 to 48, with a higher score representing better functioning.
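The scoring rule just described can be sketched in a few lines. This is a minimal illustration, not the authors' code: the function name is hypothetical, and the grouping of items 1-3, 4-6, 7-9 and 10-12 into the Bulbar, Fine-Motor, Gross-Motor and Respiratory domains is an assumption based on the usual questionnaire order, since the text names the domains without listing their items.

```python
from typing import Dict, Sequence

# ASSUMED item-to-domain grouping (items numbered 1-12 in questionnaire order).
# The article names the four domains but does not list their items.
DOMAINS: Dict[str, Sequence[int]] = {
    "Bulbar": (1, 2, 3),
    "Fine-Motor": (4, 5, 6),
    "Gross-Motor": (7, 8, 9),
    "Respiratory": (10, 11, 12),
}


def score_alsfrs_r(items: Sequence[int]) -> Dict[str, int]:
    """Return raw domain sums and the raw total (0-48) for one complete response set."""
    if len(items) != 12:
        raise ValueError("expected 12 item responses")
    if any(not isinstance(v, int) or not 0 <= v <= 4 for v in items):
        raise ValueError("each item must be an integer from 0 (most disabled) to 4")
    # Sum the items belonging to each domain, then add the overall raw total.
    scores = {name: sum(items[i - 1] for i in idx) for name, idx in DOMAINS.items()}
    scores["Total"] = sum(items)
    return scores
```

The totals produced this way are ordinal raw scores only; whether they can be treated as measurement is the question the rest of the article addresses.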

Despite its widespread use, concern has been raised about the structural validity of the ALSFRS-R and in particular the validity of the total score. One study using the Rasch model found that the ALSFRS-R failed to satisfy rigorous measurement standards and should be considered as a profile of mean scores from three different domains (Bulbar, Motor and Respiratory functions) rather than a total score (Citation4). Another study found that the interpretation of a total raw score of ALSFRS-R was hampered by ambiguities due to the different metric properties of the three domains aggregated in the scale (Citation5). Two studies found that confirmatory-factor analysis (CFA) supported a four-factor structure (Bulbar, Gross-Motor, Fine-Motor, and Respiratory domains) rather than a total score (Citation6, Citation7). Recently it has been argued that ignoring the multidimensional structure of the ALSFRS-R total score could have negative consequences for ALS clinical trials and that treatment benefit should be analyzed at the subscale level (Citation8). In addition, the use of change scores to assess the efficacy of a treatment to alter functioning, whether from a total or subscale score, can only be validly computed for interval level data (Citation9–12), hence the importance of being able to generate interval level measurement from the ALSFRS-R.

Consequently, this study seeks to examine the measurement structure of the ALSFRS-R in a large population within the framework of both classical (factor analytic) and Rasch Measurement Theory, to verify its measurement properties and determine if an interval scale transformation is viable (Citation13, Citation14).

Methods

Main sample and data collection

Participants with ALS were recruited into the Trajectories of Outcomes in Neurological Conditions-ALS (TONiC-ALS) study from 35 specialist clinics across the United Kingdom between 2013 and 2020 (Citation15). Patient reported outcome measures were collected for depression and anxiety using the Modified-Hospital Anxiety and Depression scale (M-HADS) (Citation16); health status using the EQ-5D-5L (Citation17); as well as a lay language self-administered ALSFRS-R based on earlier validation work (Citation18). The TONiC-ALS version corresponds with the original Cedarbaum wording and the European Network to Cure ALS (ENCALS) and Northeast Amyotrophic Lateral Sclerosis Consortium (NEALS) versions, apart from the use of lay language and some minor changes, described in the Supplementary File. Severity was graded using the King’s ALS staging system (Citation19). All participants were eligible for follow-up with repeat packs at least 4 months apart. Ethical approval was granted from research committees (reference 11/NW/0743).

Calibration, training and validation samples

A calibration sample of 1000 participants was randomized into ‘training’ and ‘validation’ samples of 500 participants for use in the CFA and Rasch analysis (details in Supplementary File).
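A random split of this kind could be sketched as follows. The actual randomization procedure is described only in the Supplementary File, so the function name, the participant identifiers and the seed below are all hypothetical.

```python
import random
from typing import List, Sequence, Tuple


def split_calibration(ids: Sequence[str], seed: int = 0) -> Tuple[List[str], List[str]]:
    """Randomly split participant IDs into equal training and validation halves."""
    if len(ids) % 2 != 0:
        raise ValueError("need an even number of participants for equal halves")
    shuffled = list(ids)
    # A seeded generator keeps the split reproducible across runs.
    random.Random(seed).shuffle(shuffled)
    half = len(shuffled) // 2
    return shuffled[:half], shuffled[half:]


ids = [f"P{i:04d}" for i in range(1000)]  # hypothetical participant IDs
training, validation = split_calibration(ids, seed=2024)
print(len(training), len(validation))  # 500 500
```

Fixing the seed is a design choice for reproducibility; it does not affect the independence of the two halves.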

Confirmatory factor analysis (CFA)

A CFA was applied to both the three- and four-factor solutions. The four-factor solution comprises Bulbar, Gross-Motor, Fine-Motor, and Respiratory domains, whereas the three-factor version combines both Motor domains into a single Limb domain (details in Supplementary File).

Rasch analysis

Data from the ALSFRS-R were also fit to the Rasch measurement model, to evaluate the scale’s construct validity and to test if it was possible to provide interval level latent estimates for parametric analysis (Citation13, Citation14). Due to the known multidimensional nature of the scale, a bi-factor approach was used (Citation20, Citation21). This approach seeks to identify the variance in the data that is common across subscales (i.e., Bulbar, Limb and Respiratory), discarding that which is unique to each subscale. Each subscale is referred to as a ‘testlet’, comprising the summed score of the subscale. Consequently, three testlets are fitted to the Rasch model. In RUMM2030, the software automatically produces a bi-factor solution (Citation14, Citation22). An interval scale latent estimate is then derived from this common variance, often thought of as the ‘first common factor’. The solution must satisfy the full requirements of the Rasch model as described in Supplementary File; each testlet is treated as an item and invariance (Differential Item Functioning) of the scale is tested for age, gender, onset type and duration. Levels of acceptable fit to the Rasch model are provided in the table of results. Where acceptable fit was achieved, a transformation of the raw score total to an interval level metric of 0.0–48.0 is performed. Should an interval level solution not be possible through Rasch analysis, then ordinal level validity would be tested by Mokken scaling (Citation13, Citation14, Citation23).
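To make the idea of treating a testlet as a single polytomous item concrete, the sketch below computes category probabilities under the partial credit form of the Rasch model. It illustrates the general model only, not the estimation carried out in RUMM2030; the function names are hypothetical and any threshold values passed in are the caller's own.

```python
import math
from typing import List, Sequence


def category_probabilities(theta: float, thresholds: Sequence[float]) -> List[float]:
    """Partial credit Rasch model for one polytomous item or testlet.

    theta       person location in logits
    thresholds  threshold locations in logits; m thresholds give m + 1 score categories
    """
    # Cumulative sums of (theta - threshold) give the log-numerator for each score.
    log_numerators = [0.0]
    for tau in thresholds:
        log_numerators.append(log_numerators[-1] + (theta - tau))
    # Subtract the maximum before exponentiating for numerical stability.
    peak = max(log_numerators)
    weights = [math.exp(v - peak) for v in log_numerators]
    total = sum(weights)
    return [w / total for w in weights]


def expected_score(theta: float, thresholds: Sequence[float]) -> float:
    """Model-expected raw score at a given person location."""
    probs = category_probabilities(theta, thresholds)
    return sum(k * p for k, p in enumerate(probs))
```

Under this model the expected raw score rises monotonically but non-linearly with the person location, which is why a raw total and its logit estimate are related by an ogive rather than a straight line.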

Table 1. Demographic and clinical characteristics of the different samples.

Additional analyses

Reliability was determined from Cronbach’s alpha and the Person Separation Index (PSI) (details in Supplementary File). Using interval level data, the Standard Error of Measurement (SEM) was calculated as SD*√(1-reliability). The Smallest Detectable Difference (SDD), which provides a value for the minimum difference that must be observed to be sure that any observed difference is real, rather than possibly due to measurement error, was calculated as ±1.96*√2*SEM.
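The two formulas above translate directly into code. The sketch below is illustrative: the function names are hypothetical, and the default scale range of 0 to 48 is taken from the interval metric described in the Rasch analysis section.

```python
import math


def standard_error_of_measurement(sd: float, reliability: float) -> float:
    """SEM = SD * sqrt(1 - reliability)."""
    if not 0.0 <= reliability <= 1.0:
        raise ValueError("reliability must lie between 0 and 1")
    return sd * math.sqrt(1.0 - reliability)


def smallest_detectable_difference(sem: float, z: float = 1.96) -> float:
    """SDD = z * sqrt(2) * SEM, with z = 1.96 for a 95% criterion."""
    return z * math.sqrt(2.0) * sem


def percent_of_range(value: float, scale_min: float = 0.0, scale_max: float = 48.0) -> float:
    """Express a difference as a percentage of the scale width."""
    return 100.0 * value / (scale_max - scale_min)


sdd = smallest_detectable_difference(1.80)  # SEM value reported in the Results
print(f"{sdd:.1f} points, {percent_of_range(sdd):.1f}% of range")  # 5.0 points, 10.4% of range
```

The factor √2 arises because a difference between two scores carries the measurement error of both.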

Results

Samples

The demographic and clinical characteristics of the samples are shown in Table 1.

Table 2. Fit of ALSFRS-R (sub)scale to the Rasch model. Each sample N = 500.

Confirmatory factor analysis

The three-domain version (Figure 1) failed a CFA in the training sample, even when most items had errors correlated (χ2 123.8 (df 42): p ≤ 0.001; RMSEA 0.062, CFI 0.977, TLI 0.964). This was replicated in the validation sample (χ2 103.4 (df 42): p ≤ 0.001; RMSEA 0.054, CFI 0.983, TLI 0.974).

Figure 1. Schematic diagram of 12 item confirmatory factor analysis.


The four-domain approach also failed a CFA in the training sample (Figure 2) (χ2 361.2 (df 48): p ≤ 0.001; RMSEA 0.114, CFI 0.914, TLI 0.882). Considerable item local dependency existed and, after allowing for correlated errors, exact model fit remained poor although the approximate fit statistics improved (χ2 102.6 (df 42): p ≤ 0.001; RMSEA 0.054, CFI 0.983, TLI 0.974). Cross-loading was present, suggesting inconsistency of the four-domain structure. The validation sample confirmed this (χ2 116.1 (df 42): p ≤ 0.001; RMSEA 0.059, CFI 0.980, TLI 0.968).
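As a rough consistency check on the fit statistics reported above, the RMSEA can be recomputed from each χ2, its degrees of freedom and the sample size of 500. The sketch assumes the common point-estimate formula with N − 1 in the denominator; the article does not state which estimator or software variant produced the published values, so agreement here is indicative only.

```python
import math


def rmsea(chi2: float, df: int, n: int) -> float:
    """Point estimate of RMSEA: sqrt(max(chi2 - df, 0) / (df * (n - 1)))."""
    return math.sqrt(max(chi2 - df, 0.0) / (df * (n - 1)))


# (chi-square, df) pairs as reported in the text; each sample has N = 500.
reported = {
    "three-domain, training": (123.8, 42),
    "three-domain, validation": (103.4, 42),
    "four-domain, training": (361.2, 48),
    "four-domain, training, correlated errors": (102.6, 42),
    "four-domain, validation": (116.1, 42),
}
for label, (chi2, df) in reported.items():
    print(f"{label}: RMSEA = {rmsea(chi2, df, 500):.3f}")
```

Under that assumption the recomputed values round to 0.062, 0.054, 0.114, 0.054 and 0.059, matching the figures in the text.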

Figure 2. The four domain confirmatory factor analysis.


Rasch analysis

The data from subscales and the total score were fit to the Rasch model (Table 2). A total score achieved satisfactory fit to the model under a bi-factor solution, based upon two testlets, one with the Bulbar and Respiratory subscales, and the other containing Limb (Fine-Motor and Gross-Motor) domains (Figure 3). There was intermittent lack of invariance (i.e. DIF) across different domains for different grouped factors, but for the total score, only DIF by onset type remained. However, comparing the split and unsplit solutions to control for this DIF gave an effect size of 0.019, so the lack of invariance by onset group was considered trivial and disregarded.

Figure 3. Representation of the bi-factor model as applied within the Rasch measurement framework.


The relationship between raw and interval scores is shown in Figure 4. This is based on equating each raw score to its logit estimate where data fit the model, which provides a simple transformation based on the raw score. Provided all 12 items of the ALSFRS-R are completed, the ordinal raw total score can be transformed to the interval level metric suitable for parametric analyses, such as change scores. Interval level equivalents can also be read for Bulbar, Gross-Motor, Fine-Motor or Limb subscales, the last combining Gross- and Fine-Motor. A transformation table of raw scores to interval level metrics, based on this solution and subscale specific solutions using the total sample of 1120 subjects, is given in Table 3. The Respiratory subscale cannot generate interval level measurement, but gave a Loevinger H Coefficient of 0.752 from Mokken scale analysis, suggesting the raw ordinal subscale score is valid.
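In practice the transformation amounts to a table lookup on the raw total, applied only when all 12 items are present. The sketch below shows the mechanics under that reading; the function name is hypothetical, and the placeholder table contains dummy identity values, not the published transformation, which must be supplied from the article's transformation table.

```python
from typing import Mapping, Optional, Sequence


def raw_to_interval(items: Sequence[Optional[int]], table: Mapping[int, float]) -> float:
    """Look up the interval-level equivalent of a complete 12-item raw total."""
    if len(items) != 12 or any(v is None for v in items):
        raise ValueError("all 12 ALSFRS-R items must be completed")
    if any(not 0 <= v <= 4 for v in items):
        raise ValueError("each item score must lie between 0 and 4")
    return table[sum(items)]


# PLACEHOLDER ONLY: dummy identity values so the example runs.
# Replace with the published raw-to-interval transformation values.
placeholder_table = {raw: float(raw) for raw in range(49)}
print(raw_to_interval([3] * 12, placeholder_table))  # looks up raw total 36
```

Rejecting incomplete responses outright mirrors the requirement that all 12 items be completed before the table is used.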

Figure 4. Raw score to metric ogive for ALSFRS-R Total.


Table 3. Transformation of raw scores to interval scale metric.

Descriptive analysis using interval level metric scores

The age-sex specific baseline estimates in the full sample for the ALSFRS-R interval level scores are given in Table 4, along with duration, health status (EQ-5D-5L), and the percentages with bulbar onset and with King’s Stage greater than 2. The overall mean of the ALSFRS-R metric was 25.0 (SD 5.7), equating to a raw score of 34 on the transformation table. The Standard Error of Measurement (SEM) was 1.80 and the Smallest Detectable Difference (SDD) was 5.0, which represents 10.4% of the operational scale width. There was little variation over age-sex specific groups.

Table 4. Age-sex specific age, duration, ALSFRS-R interval level (metric) total, and EQ-5D-5L utility value in full sample at baseline (N = 1107).

The interval level total and subscale measures, by onset type, are shown in Table 5. As expected, those with bulbar onset had the lowest Bulbar subscale (lower scores indicate worse functioning) and higher Limb function, while those with limb onset had higher Bulbar functioning and lower Limb function. All scales showed a significant difference for onset type, for example, limb onset showed a higher score than respiratory onset (ANOVA F 6.98 (df 2); p ≤ 0.001), although the effect size was just 0.144, considered trivial.

Table 5. Total and subscale interval level scores from ALSFRS-R by onset type (N = 1107).

The interval level ALSFRS-R total showed a strong significant gradient across King’s Staging (ANOVA p ≤ 0.001). There was a significant increase in disability (downward gradient) across grouped duration since diagnosis (ANOVA F 33.6 (df 3); p ≤ 0.001). The effect size of the difference in the ALSFRS-R metric across the shortest duration group (<7 months) and the longest duration group (30+ months) was 0.683, rated medium. This was slightly larger than the corresponding effect size for EQ-5D-5L, at 0.615.

Functioning was found to be associated with the level of depression. Those without any depression on the M-HADS-D at baseline had an interval level ALSFRS-R score of 26.26 (SD 5.60) compared with those with probable depression showing worse functioning at 21.73 (SD 5.39) (ANOVA F 58.87 (df 2); p ≤ 0.001). The effect size of this difference was 0.768. A similar, but less strong, effect (0.452) was shown for anxiety.

A multi-level mixed effects regression in the longitudinal data showed that lower ALSFRS-R scores, indicating greater disability, were associated with being female, compared to male, and both bulbar and respiratory onset compared to limb onset (Table 6). In addition, the longer the duration, the lower the ALSFRS-R score.

Table 6. Multi-level mixed regression.

Change analysis

Change was investigated for those who had completed their first three questionnaires (‘trilogy’ group) over a period of 18.3 months with an average baseline duration since diagnosis of 23.3 months, together with a subset of these followed up over 13.6 months with an average baseline duration since diagnosis of 1.4 months (‘inception’ group) (Table 7). For both groups, baseline levels of the ALSFRS-R differed significantly between the raw score totals and the interval measures, as did the magnitude of change. For the inception group, the average monthly reduction in ALSFRS-R was 0.41 on the interval level metric, and 0.60 on the ordinal raw score. This would equate to a traverse of about 20 points on the interval measure over 48 months (approaching 30 points on the ordinal). The rate of change was higher in the inception group than in the full trilogy group.
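The 48-month figures follow from simple extrapolation of the monthly rates. The sketch below assumes a constant linear decline, which is a simplification used only to reproduce the arithmetic; the function name is hypothetical.

```python
def projected_decline(monthly_rate: float, months: float) -> float:
    """Total decline under an assumed constant monthly rate."""
    return monthly_rate * months


interval_48 = projected_decline(0.41, 48)  # interval level metric
ordinal_48 = projected_decline(0.60, 48)   # ordinal raw score
print(f"interval: {interval_48:.1f} points, ordinal: {ordinal_48:.1f} points")
# interval: 19.7 points, ordinal: 28.8 points
```

Applied to the 13.6-month follow-up of the inception group, the same rates give roughly 5.6 and 8.2 points, in line with the mean changes of 5.6 (interval) and 8.1 (ordinal) quoted in the Discussion.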

Table 7. Changes in ALSFRS-R (ordinal and interval) for those completing baseline and first two follow-ups.

Discussion

Analyzing data from the ALSFRS-R in a large sample of those with ALS initially failed to support the total score from both classical (confirmatory factor analytic) and modern (Rasch analysis) psychometric perspectives. However, applying a bi-factor solution within the Rasch measurement framework generated an interval level total measure, though 18% of the unique variance attributed to the subscales had to be discarded. It was also possible to generate interval level measurement for the Bulbar, Fine- and Gross-Motor subscales, as well as the Limb subscale, which comprises both Fine- and Gross-Motor. The Respiratory subscale cannot generate interval level measurement, but its ordinal score is valid.

These results support previous studies which reported that the ALSFRS-R is multidimensional and that the raw score total should not be used as an endpoint in studies (Citation5, Citation7). Indeed, the raw score total cannot be recommended for decision making in a clinical setting, at either group or individual level, because inherent multidimensionality renders it invalid even at an ordinal level.

Evidence from the current study shows it is valid to use a total measure based on the bi-factor solution, using the transformed metric via the transformation table provided. The two-testlet solution to produce the total score may reflect that fine/gross motor tasks are under conscious control while respiratory and bulbar functions are involuntary.

In addition, several subscales can be used, specifically Bulbar, Fine- and Gross-Motor (the two Motor can be combined into Limb), either at the ordinal level with appropriate statistics, or at the interval level with parametric statistics following transformation as above.

Currently, the risk of bias from using a less-than-interval and multidimensional total ALSFRS-R score is unknown. However, it has been shown that misusing ordinal scales can produce biased outcomes (Citation24, Citation25). The current study demonstrates that for the ALSFRS-R, the (inappropriately) calculated raw score change will underestimate change at the margins of the scale, and overestimate change for the central part of the scale. Consequently, where the change largely occurs within the interquartile range of the scale (i.e. ALSFRS-R raw score range 36–12), then it will overestimate the true (metric) change. This is particularly pertinent as, in the inception cohort, the entry level on the ALSFRS-R at diagnosis was 38.3 on the ordinal. This suggests that any deterioration of functioning would occur over the center of the scale, and thus overestimate the decline in function if based on the ordinal raw score.

Figure 4 showed that relatively few raw score points are lost descending from the score of 48 compared to the metric, so leaving the raw score total much higher than the metric at an equivalent level of functioning. It follows that reporting change scores for total raw ALSFRS-R scores is subject to variability depending on the starting point (Citation26). This illustrates that a 1-point change on the ALSFRS-R raw score corresponds to different quantities of change in functioning depending on the starting point and the subsequent range of change; as a consequence, mathematical operations on the raw scores such as the calculation of means, change scores or effect sizes are invalid. This problem can be overcome by using the interval level transformations provided in this study, which permit parametric statistics such as means or change scores. To illustrate the impact of the incorrect use of means of ordinal measures, the invalid mean change on the ordinal from baseline to the second follow-up was 8.1 compared to 5.6 on the interval level metric.

Acknowledging the limitations of using ordinal measures, to provide some context against existing publications, the monthly average decline (0.60) in the raw score of the inception group is the same as that reported during the first year after diagnosis in a recent Italian population study (Citation27). It is lower than that reported for trial populations, as these seek to exclude slow progressors whose inclusion would increase the duration or sample size of the study (Citation28, Citation29).

At least two other measures of functioning with interval scale estimates have appeared recently, one an established generic scale (World Health Organization Disability Assessment Schedule 2.0: WHODAS 2.0) (Citation30) with published Rasch transformation (Citation31), and the other a disease-specific scale (Rasch-Built Overall Amyotrophic Lateral Sclerosis Disability Scale: ROADS) (Citation32). Several issues need to be addressed with respect to these scales. Are they measuring the same construct as each other and the ALSFRS-R? Can their scores be compared to antecedent ALSFRS-R scores for historical comparisons, as well as for comparisons of current study outcomes? Both aspects need detailed investigation, but if the scales are found to be measuring the same construct, Rasch-based strategies can be used to provide a cross-walk between scale estimates (Citation33).

This study has several strengths such as large sample size, use of both a calibration sample to remove time dependency in the estimates derived from the Rasch model, and of training and validation samples to facilitate cross-validation. Both the long period of data collection and the wide range of participating sites should contribute to the generalizability of the findings. Finally, the application of interval scale estimates allows for valid calculations of SEM and SDD, and the use of multi-level regression analysis.

The limitations include the relatively low proportion of those with respiratory onset, even in a large national cohort. The transformation tables should only be used with complete data. However, previous work has indicated that imputation has little effect on fit to the Rasch model, so imputation permits use of the transformation table if item responses are missing (Citation34). ALSFRS-R items such as ventilation partly reflect treatment availability, so healthcare setting might influence use of the scale. Finally, duration used is time since diagnosis.

Future work should include studies of the Minimum Clinically Important Difference (MCID) and comparison of effect sizes between levels of perceived change. In addition, the magnitude of change experienced by patients could be examined with respect to disease progression (Citation35).

The clinical implications are that the use of a raw ALSFRS-R total score in clinical trials, without transformation to interval scaling, may lead to an unknown level of bias (Citation24, Citation25). In routine clinical monitoring, interpretation of raw score total changes may give the wrong impression of a slow decline at the margins of the scale, and a faster decline across the central part of the scale.

Despite several publications demonstrating the substantial limitations to the measurement properties of the ALSFRS-R, it remains a widely used measure of functional status in ALS/MND. The current study has shown that a total ALSFRS-R ordinal raw score could lead to inferential bias in clinical trial results due to its non-linear nature. Following transformation to interval level metric data, a difference of 5 points is required before a statistically significant detectable difference can be observed. Use of the linear transformation should be mandatory in trials.

Data access

Data supporting this study are not openly available due to reasons of sensitivity and are available from the corresponding author upon reasonable request. Data are located in controlled access data storage at Walton Center NHS Trust. Please contact [email protected].


Acknowledgments

The authors thank the participants and their families for their invaluable contributions, the research and clinical staff for recruitment, and the TONiC team.

Declaration of interest

The authors report no conflicts of interest. The authors alone are responsible for the content and writing of this article.

Additional information

Funding

The authors thank Motor Neurone Disease Association UK under Grant: Young/Jan15/929-794, and Neurological Disability Fund 4530 for financial support for participant recruitment, data collection and processing. We thank Biogen Idec for financial support for data analysis. The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript. The authors also thank the NIHR CLRN for research support. For the purpose of Open Access, the author has applied a Creative Commons Attribution (CC BY) licence to any Author Accepted Manuscript version arising.

References

  • Su WM, Cheng YF, Jiang Z, Duan QQ, Yang TM, Shang HF, et al. Predictors of survival in patients with amyotrophic lateral sclerosis: A large meta-analysis. EBioMedicine 2021;74:103732.
  • Wong C, Stavrou M, Elliott E, Gregory JM, Leigh N, Pinto AA, et al. Clinical trials in amyotrophic lateral sclerosis: a systematic review and perspective. Brain Commun. 2021;3:fcab242.
  • Cedarbaum JM, Stambler N, Malta E, Fuller C, Hilt D, Thurmond B, et al. The ALSFRS-R: a revised ALS functional rating scale that incorporates assessments of respiratory function. BDNF ALS Study Group (Phase III). J Neurol Sci. 1999;169:13–21.
  • Franchignoni F, Mora G, Giordano A, Volanti P, Chiò A. Evidence of multidimensionality in the ALSFRS-R Scale: a critical appraisal on its measurement properties using Rasch analysis. J Neurol Neurosurg Psychiatry. 2013;84:1340–5.
  • Franchignoni F, Mandrioli J, Giordano A, Ferro S, ERRALS Group. A further Rasch study confirms that ALSFRS-R does not conform to fundamental measurement requirements. Amyotroph Lateral Scler Frontotemporal Degener. 2015;16:331–7.
  • Bacci ED, Staniewska D, Coyne KS, Boyer S, White LA, Zach N, Pooled Resource Open-Access ALS Clinical Trials Consortium, et al. Item response theory analysis of the Amyotrophic Lateral Sclerosis Functional Rating Scale-Revised in the Pooled Resource Open-Access ALS Clinical Trials Database. Amyotroph Lateral Scler Frontotemporal Degener. 2016;17:157–67.
  • Bakker LA, Schröder CD, van Es MA, Westers P, Visser-Meily JMA, van den Berg LH. Assessment of the factorial validity and reliability of the ALSFRS-R: a revision of its measurement model. J Neurol. 2017;264:1413–20.
  • van Eijk RPA, de Jongh AD, Nikolakopoulos S, McDermott CJ, Eijkemans MJC, Roes KCB, et al. An old friend who has overstayed their welcome: the ALSFRS-R total score as primary endpoint for ALS clinical trials. Amyotroph Lateral Scler Frontotemporal Degener. 2021;22:300–7.
  • Forrest M, Andersen B. Ordinal scale and statistics in medical research. Br Med J (Clin Res Ed). 1986;292:537–8.
  • Wright BD, Linacre JM. Observations are always ordinal; measurements, however, must be interval. Arch Phys Med Rehabil. 1989;70:857–60.
  • Merbitz C, Morris J, Grip JC. Ordinal scales and foundations of misinference. Arch Phys Med Rehabil. 1989;70:308–12.
  • Stucki G, Daltroy L, Katz JN, Johannesson M, Liang MH. Interpretation of change scores in ordinal clinical scales and health status measures: the whole may not equal the sum of the parts. J Clin Epidemiol. 1996;49:711–7.
  • Rasch G. Probabilistic models for some intelligence and attainment tests. Chicago: University of Chicago Press; 1960.
  • Andrich D. Components of variance of scales with a bifactor subscale structure from two calculations of alpha. Educ Measure. 2016;35:25–30.
  • Young CA, Ealing J, McDermott CJ, Williams TL, Al-Chalabi A, Majeed T, et al. Prevalence of depression in amyotrophic lateral sclerosis/motor neuron disease: multi-attribute ascertainment and trajectories over 30 months. Amyotroph Lateral Scler Frontotemporal Degener. 2022;24:82–90.
  • Gibbons CJ, Mills RJ, Thornton EW, Ealing J, Mitchell JD, Shaw PJ, et al. Rasch analysis of the hospital anxiety and depression scale (HADS) for use in motor neurone disease. Health Qual Life Outcomes. 2011;9:82–90.
  • EuroQol Research Foundation. EQ-5D-5L User Guide, 2019.
  • Montes J, Levy G, Albert S, Kaufmann P, Buchsbaum R, Gordon PH, et al. Development and evaluation of a self-administered version of the ALSFRS-R. Neurology 2006;67:1294–6.
  • Balendra R, Jones A, Jivraj N, Knights C, Ellis CM, Burman R, et al. Estimating clinical stage of amyotrophic lateral sclerosis from the ALS Functional Rating Scale. Amyotroph Lateral Scler Frontotemporal Degener. 2014;15:279–84.
  • Reise SP, Moore TM, Haviland MG. Bifactor models and rotations: exploring the extent to which multidimensional data yield univocal scale scores. J Pers Assess. 2010;92:544–59.
  • Reise SP, Morizot J, Hays RD. The role of the bifactor model in resolving dimensionality issues in health outcomes measures. Qual Life Res. 2007;16(Suppl 1):19–31.
  • Andrich D, Sheridan B, Luo G. RUMM2030: an MS Windows computer program for the analysis of data according to Rasch unidimensional models for measurement. Perth, Western Australia: RUMM Laboratory; 2015.
  • Mokken RJ. A theory and procedure of scale analysis with applications in political research. Berlin, New York: De Gruyter Mouton; 1971.
  • Doganay Erdogan B, Leung YY, Pohl C, Tennant A, Conaghan PG. Minimal clinically important difference as applied in rheumatology: an OMERACT Rasch working group systematic review and critique. J Rheumatol. 2016;43:194–202.
  • Ørnbjerg LM, Christensen KB, Tennant A, Hetland ML. Validation and assessment of minimally clinically important difference of the unadjusted Health Assessment Questionnaire in a Danish cohort: uncovering ordinal bias. Scand J Rheumatol. 2020;49:1–7.
  • Grimby G, Tennant A, Tesio L. The use of raw scores from ordinal scales: time to end malpractice? J Rehabil Med. 2012;44:97–8.
  • Mandrioli J, Biguzzi S, Guidi C, Sette E, Terlizzi E, Ravasio A, et al. Heterogeneity in ALSFRS-R decline and survival: a population-based study in Italy. Neurol Sci. 2015;36:2243–52.
  • Thakore NJ, Lapin BR, Pioro EP. Trajectories of impairment in amyotrophic lateral sclerosis: Insights from the Pooled Resource Open-Access ALS Clinical Trials cohort. Muscle Nerve. 2018;57:937–45.
  • van Eijk RPA, Nikolakopoulos S, Roes KCB, Kendall L, Han SS, Lavrov A, et al. Innovating clinical trials for amyotrophic lateral sclerosis: challenging the established order. Neurology 2021;97:528–36.
  • Ustün TB, Chatterji S, Kostanjsek N, Rehm J, Kennedy C, Epping-Jordan J, et al. Developing the world health organization disability assessment schedule 2.0. Bull World Health Organ. 2010;88:815–23.
  • Young CA, Ealing J, McDermott CJ, Williams TL, Al-Chalabi A, Majeed T, et al. Measuring disability in amyotrophic lateral sclerosis/motor neuron disease: the WHODAS 2.0-36, WHODAS 2.0-32, and WHODAS 2.0-12. Amyotroph Lateral Scler Frontotemporal Degener. 2022;24:63–70.
  • Fournier CN, Bedlack R, Quinn C, Russell J, Beckwith D, Kaminski KH, et al. Development and validation of the Rasch-built overall amyotrophic lateral sclerosis disability scale (ROADS). JAMA Neurol. 2020;77:480–8.
  • Prodinger B, Coenen M, Hammond A, Küçükdeveci AA, Tennant A. Scale banking for patient-reported outcome measures that measure functioning in rheumatoid arthritis: a daily activities metric. Arthritis Care Res (Hoboken). 2022;74:579–87.
  • Fellinghauer CS, Prodinger B, Tennant A. The impact of missing values and single imputation upon Rasch analysis outcomes: a simulation study. J Appl Meas. 2018;19:1–25.
  • Prell T, Gaur N, Steinbach R, Witte OW, Grosskreutz J. Modelling disease course in amyotrophic lateral Sclerosis: pseudo-longitudinal insights from cross-sectional health-related quality of life data. Health Qual Life Outcomes. 2020;18:117.