Original Research

Assessment set for evaluation of clinical outcomes in multiple sclerosis: psychometric properties

Pages 59-70 | Published online: 11 Oct 2012

Abstract

Purpose:

Multiple sclerosis (MS) manifests itself in a wide range of symptoms. Physiotherapy plays an important role in the treatment of those symptoms connected with mobility. For this therapy to be at its most effective it should be based on a systematic examination that is able to describe and classify damaged clinical functions meaningfully. The purpose of this study was to develop and validate a battery of tests and composite tests that can be used to systematically evaluate clinical features of MS treatable by physiotherapy.

Methods:

The authors assembled a proposed battery of tests comprising known, standard, and validated assessments (low-contrast letter acuity testing; the Motricity Index; the Modified Ashworth Scale; the Berg Balance Scale; scales of postural reactions, tremor, dysdiadochokinesia, and dysmetria; the Nine-Hole Peg Test; the Timed 25-Foot Walk; and the 3-minute version of the Paced Auditory Serial Addition Test) and one test (knee hyperextension) of the authors’ own. Normalization was calculated and six composite assessments were measured. Seventeen ambulatory subjects with MS were tested twice with the assessment set before undergoing physiotherapy, and 12 were also tested with the assessment set after the physiotherapy. The test–retest reliability, stability, internal consistency of composite measurements, sensitivity to changes after therapy, and correlation between measurements and the Kurtzke Expanded Disability Status Scale score were evaluated for all tests in the assessment set.

Results:

Good internal consistency was confirmed for all tests in the proposed battery, and most of the tests also showed good test–retest reliability. While no significant changes occurred without treatment, significant posttreatment improvement was observed in all tests except low-contrast letter acuity testing, where only a trend toward improvement was observed.

Conclusion:

The proposed assessment set is a good tool for the evaluation of clinical features of MS treatable by physiotherapy. This battery of tests is applicable in both clinical practice and research.

Introduction

Multiple sclerosis (MS) is a chronic autoimmune disease pathologically characterized by the presence of areas of demyelination and T-cell perivascular inflammation in the brain white matter, as well as by axonal degeneration. It clinically manifests itself by neurological abnormalities such as fatigue, numbness, paresthesia, muscular weakness and spasticity, double vision, optic neuritis, ataxia, bladder control problems, dysphagia, dysarthria, and cognitive dysfunction.Citation1,Citation2 Physiotherapy plays an important role in the maintenance and improvement of damaged clinical functions.Citation3 However, there is no consensus on what may be the most effective approach to achieve the best possible functionality, given the individual limitations. Contemporary rehabilitation research in MS lacks strict adherence to rigorous methodology and consistent use of a range of clinically appropriate and scientifically sound outcome measures.Citation4,Citation5

Haigh et alCitation6 conducted a survey on instruments commonly used in Europe to measure outcomes for MS patients. A questionnaire was sent to facilities providing rehabilitation (acute settings and rehabilitation units, both publicly and privately funded). Just over 100 outcome measures were reported as being used to assess patients with MS, although the majority of these measures were only used in a small number of centers. (A large number of measures – including the Environmental Status Scale, the Medical Outcomes Study Short-Form General Health Survey, or the Assessment of Motor and Process Skills – were being used in only one location, or a small number of locations, and with relatively few patients.) The Kurtzke Incapacity Status Scale, the Berg Balance Scale (BBS), and the Rivermead Mobility Index were the only measures that were used in more than five centers. The measures used most widely with MS patients were the Kurtzke Expanded Disability Status Scale (EDSS), the Functional Independence Measure, and the Ashworth Scale.

In a review by Khan et alCitation3 of multidisciplinary rehabilitation for MS patients, eight trials fulfilled the selection criteria, and a total of 42 outcome measures were used in these trials.

Based on the examples given, it is clear that the study and assessment of rehabilitation in MS has sparked the development of numerous outcome measures applicable to one or more of the disease’s many dimensions. Outcomes research requires a systematic approach to describe and classify the outcomes meaningfully. The purpose of this study was to develop and validate a battery of tests and composite tests that can be used to systematically evaluate clinical features of MS treatable by physiotherapy.

The aims of this study were:

  • to prepare standard tests for use in the Czech Republic (translation of standard tests and their validation);

  • to validate standard tests for MS (those tests not yet validated for MS);

  • to prepare a battery of tests and composite tests that is systematic, reliable, practical, acceptable to patients, capable of demonstrating rehabilitation effect, and predictive of clinically meaningful change.

Methods

Design of the study

An assessment set comprising 12 tests and six composite tests for the evaluation of clinical outcomes in MS was prepared. Seventeen patients with MS who met the inclusion criteria were selected. An independent neurologist determined the EDSSCitation7 score and duration of the disease. The assessment set was administered twice within 3–5 weeks by an independent physiotherapist. The patients did not change their habits during this time. After the second examination with the assessment set, a physiotherapy program, consisting of two 2-hour sessions each week for 2 months, was offered to the patients. Twelve patients finished the physiotherapy program, and these patients were also examined at the end of the program.

Selection and characteristics of the subjects

Seventeen outpatients with the diagnosis of MS according to the criteria of McDonald et alCitation8 (either gender; age range, 30–57 years; suffering from relapsing-remitting, primary progressive, or secondary progressive MS; stability of clinical status in the preceding 3 months; prevailing motor impairment; able to move independently; able to walk at least 200 m with two canes [EDSS score ≤ 5]; able to undergo ambulatory treatment; and right-handed)Citation9 were chosen randomly from MS centers in the Czech Republic. Persons with cognitive impairment that could hinder understanding of the tasks to be accomplished were excluded from enrollment. All patients were required to sign an informed consent document before inclusion in this study. Table 1 outlines the characteristics of the patients.

Table 1 Characteristics of patients in study

Preparation of assessment set and procedure

The assessment set was prepared from well-known, standard, and validated tests and one test of the authors’ own. The selection of tests was based on team experience and literature review.

The authors included the tests used most frequently in clinical trials of MS: low-contrast letter acuity (L-CLA) testing, the Nine-Hole Peg Test (NHPT), the Timed 25-Foot Walk (T25FW), and the 3-minute version of the Paced Auditory Serial Addition Test (PASAT 3). The authors also included frequently used tests that evaluate the leading problems (eg, spastic paresis, cerebellar symptoms) that therapists deal with in MS patients: the Motricity Index (MI), the Modified Ashworth Scale (MAS), the BBS, and tests for tremor (T), dysdiadochokinesia (DD), and dysmetria (DM). Finally, the authors included tests that evaluate clinical features of MS that, in the authors’ opinion, best react to physical therapy: scales of postural reactions (PRs) and the authors’ own test – knee hyperextension (KH).

The back-translation methodCitation10 was used for the translation of each test.

A trained physiotherapist experienced in performing ambulatory examination administered the scoring of patients. The amount of time required to complete the whole battery of tests was about 1 hour, and the whole assessment was videotaped.

The same examiner performed the consecutive assessments at the same time of day, and preferably at the same day of the week, with the measures administered in the same order each time (all subtests first in lying, then in sitting, then in standing, and, finally, in walking). The examiner used a detailed protocol with precise and standardized instructions. Participants received refreshments.

Tests included in assessment set

To evaluate visual function, a L-CLA scoreCitation11 was used to measure the total number of letters read correctly at three contrast levels (100%, 2.5%, and 1.25%). Visual function was determined as an average of the three contrast levels (each giving a minimum of 0 and a maximum of 60 correct answers).

To evaluate muscle power function (strength), the MICitation12 was used. The MI value for each extremity was determined – the extremity MI includes three actions, each scored between 0 and 33 (where 0 indicates worst muscle power function), which are added together to make a total possible score of 99 plus 1, giving a scale of 1–100.Citation13 The total MI includes 12 items (three items for each extremity), which, added together plus 4, give a scale of 4–400. The three actions for the left and right upper extremities are pinch grip, elbow flexion, and shoulder abduction; the three actions for the left and right lower extremities are ankle dorsiflexion, knee extension, and hip flexion.
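The Motricity Index arithmetic described above can be sketched as follows. This is only an illustration of the scoring rule stated in the text (three actions of 0–33 points each, plus 1 per extremity); the function names and the dictionary layout are hypothetical and are not part of the published test.

```python
def extremity_mi(action_a, action_b, action_c):
    """Motricity Index for one extremity: three actions scored 0-33, plus 1."""
    for score in (action_a, action_b, action_c):
        if not 0 <= score <= 33:
            raise ValueError("each action score must lie between 0 and 33")
    return action_a + action_b + action_c + 1  # scale 1-100


def total_mi(limbs):
    """Total MI: the 12 item scores plus 4, i.e. the sum of the four extremity MIs."""
    if len(limbs) != 4:
        raise ValueError("expected scores for four extremities")
    return sum(extremity_mi(*scores) for scores in limbs.values())  # scale 4-400


# Hypothetical patient: (pinch grip, elbow flexion, shoulder abduction) for the
# upper limbs and (ankle dorsiflexion, knee extension, hip flexion) for the lower.
patient = {
    "left_upper": (33, 25, 25),
    "right_upper": (33, 33, 33),
    "left_lower": (25, 25, 19),
    "right_lower": (33, 33, 25),
}
patient_total = total_mi(patient)
```

Adding 1 per extremity is what turns the raw 0–99 sum into the 1–100 scale, so the total over four extremities can never fall below 4.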

To evaluate muscle tone function (spasticity), the MASCitation14,Citation15 was used. The MAS is an 18-item five-point rating scale, each item ranging from 0 to 4 (where 0 indicates no increased tone and 4 indicates limb rigid in flexion or extension). The amount of tone felt as a limb was moved passively through its arc of motion was measured. The MAS score for each extremity was determined: the MAS for upper extremities covers elbow flexors, elbow pronators, elbow supinators, wrist flexors, and digital flexors; the MAS for lower extremities covers hip adductors, knee extensors, knee flexors, and plantar flexors.

To evaluate changing and maintaining a position (balance), the BBSCitation16,Citation17 was used. The BBS is a 14-item five-point rating scale, each item ranging from 0 to 4 (where 0 indicates the lowest level of function), that assesses the performance of functional tasks. Further, PRs (righting, equilibrium, and protective reactions)Citation18 were evaluated from videotape, using a rating scale from 0 to 3 (where 0 indicates only head righting reactions noted and 3 indicates normal reactions – all equilibrium and protective reactions are present),Citation19 in 12 actions (being drawn left and right by another person in a sitting position on a stationary supporting surface; tipped backwards, forwards, left, and right in standing; and steps to save forwards, backwards, left, and right).Citation20

Rest, postural, and intention tremor (T) on the upper and lower extremities was evaluated using a procedure described by Fahn et al.Citation21 This procedure comprises 12 items (three for each extremity) rated on a five-point scale (where 0 indicates none and 4 indicates severe amplitude).

To evaluate DD, a five-point rating scale (where 0 indicates no problem and 4 indicates the subject is unable to perform a repetitive sequential movement) described by Alusi et alCitation22 was used for each extremity. The scores for each extremity were added together for the total DD score on a scale of 0–16.

To evaluate DM, a five-point rating scale (where 0 indicates no impairment and 4 indicates the subject cannot use hands/legs) also described by Alusi et alCitation22 was used for each extremity. The scores for each extremity were added together for the total DM score on a scale of 0–16.

To evaluate the stability of joint function, the authors’ own scale was used. This scale rates genu recurvatum (KH test) ranging from 0 to 6 (where 0 indicates there is no hyperextension in the knee either in standing or in quick walking; 1 indicates there is hyperextension in the knee only during quick walking, and it is voluntarily influenced; 2 indicates there is hyperextension in the knee only during quick walking, but it cannot be influenced voluntarily; 3 indicates there is hyperextension in the knee also during slow walking, and it is voluntarily influenced; 4 indicates there is hyperextension in the knee also during slow walking, but it cannot be influenced voluntarily; 5 indicates there is hyperextension in the knee even in standing, and it is voluntarily influenced; and 6 indicates there is hyperextension in the knee in standing, but it is not influenced voluntarily). The function of the knee was evaluated for both left and right lower extremities and the total KH score was calculated as their average value.
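The seven-level KH scale above follows a regular pattern: the least demanding condition in which hyperextension appears sets a base score, and loss of voluntary control adds one point. A minimal sketch of that reading is given below; the condition labels and function names are hypothetical and were chosen for illustration only.

```python
# Base score for the least demanding condition in which knee
# hyperextension is observed (labels are illustrative assumptions).
BASE_SCORE = {"quick_walking": 1, "slow_walking": 3, "standing": 5}


def kh_score(condition, voluntary_control=True):
    """Return the 0-6 KH score for one knee.

    condition: None (no hyperextension), "quick_walking" (only during quick
    walking), "slow_walking" (also during slow walking), or "standing".
    voluntary_control: whether the patient can influence the hyperextension.
    """
    if condition is None:
        return 0
    base = BASE_SCORE[condition]
    return base if voluntary_control else base + 1


def total_kh(left, right):
    """Total KH score: the average of the left and right knee scores."""
    return (left + right) / 2


left_knee = kh_score("slow_walking", voluntary_control=False)
right_knee = kh_score("quick_walking", voluntary_control=True)
total = total_kh(left_knee, right_knee)
```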

To evaluate fine motor skills, the NHPT,Citation23 a quantitative measure of upper extremities (arm and hand), was used. The NHPT measures the time interval (in seconds) during which a patient places nine pegs into holes in a testing board as fast as possible and then picks them up with one hand, one peg after another, and puts them into a bowl. The duration of the test was limited to 60 seconds. The NHPT was performed twice for each upper extremity and then averaged – this average was calculated as the total NHPT score.

To evaluate walking, the T25FW test,Citation23 which measures maximal walking speed over a distance of 25 feet or 7.6 m from a standing start, was used. The duration of the test was limited to 20 seconds. The score was calculated as the average from two consecutive measurements.

To evaluate mental function, the PASAT 3Citation23 was used. It consists of 60 true/false items, where a total of 0 indicates the worst function.

Data preparation and normalization

Assessment set

The data were recorded on a Microsoft Excel® spreadsheet (Microsoft Corporation, Redmond, WA) by an independent person with MS and were checked by a second independent person with MS (paid by a project of European Social Fund Involving Training Workplaces for Disabled People).

Total scores obtained for tests in the assessment set were normalized to a scale from 0 to 1 (where 0 indicates the worst function and 1 indicates the best function). Normalization provides better orientation in scales, allows better comparison, and allows calculation of totals for all four extremity functions and total index of clinical functions (TICF). To calculate normalization, the minimum (min) possible value was subtracted from the measurement and this difference was then divided by the difference between the maximum (max) and min possible values:

$$y = \frac{x - x_{\min}}{x_{\max} - x_{\min}} \tag{1}$$

If necessary, this was subtracted from 1 in the case of opposite scoring – that is, if 0 stands for the best function, which is the case for the MAS, T, DD, DM, KH, the NHPT, and the T25FW.

The logarithms of time measurements (NHPT and T25FW) were used for normalization. Minimum and maximum values were set to be 10 and 60 seconds for NHPT and 3 and 20 seconds for T25FW, respectively.
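The normalization rule described above, including the reversal for scales where 0 is the best score and the log transform for timed tests, can be sketched as below. The function name and parameters are assumptions made for illustration. The text does not say how values outside the stated limits were handled, so the sketch does not clamp them.

```python
import math


def normalize(x, x_min, x_max, reverse=False, log_scale=False):
    """Map a raw total score onto 0-1, where 1 is the best function.

    reverse:   set for scales where 0 is the best raw score
               (MAS, T, DD, DM, KH, NHPT, T25FW).
    log_scale: set for time measurements (NHPT, T25FW), which are
               log-transformed before normalization.
    """
    if log_scale:
        x, x_min, x_max = math.log(x), math.log(x_min), math.log(x_max)
    y = (x - x_min) / (x_max - x_min)
    return 1.0 - y if reverse else y


# Berg Balance Scale: 14 items scored 0-4, higher is better.
nbbs = normalize(42, 0, 56)

# Timed 25-Foot Walk: limits of 3 and 20 seconds, lower is better.
nt25fw = normalize(6.0, 3, 20, reverse=True, log_scale=True)

# Total dysdiadochokinesia score: 0-16, lower is better.
ndd = normalize(4, 0, 16, reverse=True)
```

With these conventions a T25FW time of 3 seconds maps to 1 and a time of 20 seconds maps to 0, matching the direction of the other normalized scores.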

Six composite assessments

As well as the normalization of total scores, normalization of total scores for the extremities was calculated and averaged into the total extremity function. For normalized left (NLUEF) and right upper extremity function (NRUEF), the following normalized extremity total scores were averaged: normalized Modified Ashworth Scale (NMAS), normalized Motricity Index (NMI), normalized tremor (NT), normalized dysdiadochokinesia (NDD), normalized dysmetria (NDM), and normalized Nine-Hole Peg Test (NNHPT). For normalized left (NLLEF) and right lower extremity function (NRLEF), the following normalized extremity total scores were averaged: NMAS, NMI, NT, NDD, NDM, and normalized knee hyperextension (NKH). The balance index (BI) was calculated as an average of normalized BBS and normalized PR scores. For TICF, all normalized measurements (normalized low-contrast letter acuity [NL-CLA], NMI, NMAS, normalized Berg Balance Scale [NBBS], normalized postural reactions [NPRs], NT, NDD, NDM, NKH, NNHPT, normalized Timed 25-Foot Walk [NT25FW], and normalized 3-minute version of the Paced Auditory Serial Addition Test [NPASAT 3]) of one patient were averaged.
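Because every composite is an unweighted mean of normalized scores, the six composite assessments reduce to a few averaging helpers. The sketch below assumes the normalized scores are already available in dictionaries keyed by the abbreviations used in the text; the helper names and the example values are hypothetical.

```python
from statistics import mean

# Normalized scores that enter each extremity composite, as listed in the text.
UPPER_ITEMS = ("NMAS", "NMI", "NT", "NDD", "NDM", "NNHPT")
LOWER_ITEMS = ("NMAS", "NMI", "NT", "NDD", "NDM", "NKH")


def extremity_function(scores, items):
    """Average the normalized per-extremity scores named in `items`."""
    return mean(scores[name] for name in items)


def balance_index(nbbs, npr):
    """Balance index: mean of normalized BBS and normalized PR scores."""
    return mean([nbbs, npr])


def ticf(normalized_totals):
    """Total index of clinical functions: mean of all normalized totals."""
    return mean(normalized_totals.values())


# Made-up normalized scores for one left upper extremity.
left_arm = {"NMAS": 0.9, "NMI": 0.8, "NT": 1.0,
            "NDD": 0.75, "NDM": 0.75, "NNHPT": 0.6}
nluef = extremity_function(left_arm, UPPER_ITEMS)
bi = balance_index(0.75, 0.25)
```

The same `extremity_function` call with `LOWER_ITEMS` gives NLLEF and NRLEF, which accounts for four of the six composites. BI and TICF are the remaining two.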

Statistical analysis

The test–retest reliability was evaluated by intraclass correlation coefficient (ICC) (3,1), consistency version.Citation24 Stability of measurements (changes without treatment) and improvement after treatment were tested by paired t-test; P-values were corrected for multiple comparisons using false discovery rate correction.Citation25 Internal consistency of composite assessments (L-CLA, MI, MAS, BBS, T, DD, DM, PRs, KH, NHPT, and T25FW) was evaluated by Cronbach’s alpha, estimated from the second examination. Pearson correlations and a dendrogram of cluster analysis were used to assess connections between measurements. Spearman correlations were used to assess connections between clinical measures and EDSS scores. Statistical analyses were performed using the software RCitation26 and its library psych.Citation27
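For readers who want to see what these statistics compute, the following is a minimal standard-library sketch of ICC(3,1) in its consistency form, Cronbach's alpha, and a false discovery rate adjustment. It is not the authors' analysis, which used R and the psych library. The function names are hypothetical, and the adjustment assumes the Benjamini–Hochberg procedure, which the text does not specify.

```python
from statistics import variance


def icc_3_1(data):
    """ICC(3,1), consistency: rows are subjects, columns are sessions."""
    n, k = len(data), len(data[0])
    grand = sum(map(sum, data)) / (n * k)
    row_means = [sum(row) / k for row in data]
    col_means = [sum(row[j] for row in data) / n for j in range(k)]
    ss_rows = k * sum((m - grand) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)
    ss_total = sum((x - grand) ** 2 for row in data for x in row)
    ms_rows = ss_rows / (n - 1)                        # between-subjects mean square
    ms_err = (ss_total - ss_rows - ss_cols) / ((n - 1) * (k - 1))
    return (ms_rows - ms_err) / (ms_rows + (k - 1) * ms_err)


def cronbach_alpha(data):
    """Cronbach's alpha: rows are subjects, columns are items."""
    k = len(data[0])
    item_vars = [variance([row[j] for row in data]) for j in range(k)]
    total_var = variance([sum(row) for row in data])
    return k / (k - 1) * (1 - sum(item_vars) / total_var)


def bh_adjust(pvals):
    """Benjamini-Hochberg adjusted P-values (assumed FDR procedure)."""
    m = len(pvals)
    order = sorted(range(m), key=lambda i: pvals[i])
    adjusted = [0.0] * m
    running_min = 1.0
    for rank in range(m, 0, -1):                       # largest P-value first
        i = order[rank - 1]
        running_min = min(running_min, pvals[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Made-up test-retest scores: the second session is shifted by a constant,
# which the consistency form of the ICC ignores.
retest = [[1, 2], [2, 3], [3, 4], [4, 5]]
icc = icc_3_1(retest)
```

The consistency form ignores a constant shift between sessions, which is why the shifted example data still yield a coefficient of 1.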

Results

Seventeen patients were enrolled in the study. Descriptive statistics for all three measurements (Examinations 1, 2, and 3) are shown in Table 2. Descriptive statistics for normalized first measurements (Examination 1) are shown in Table 3. These statistics show that in many assessments, MS patients reach only a narrow band of possible values: MI, MAS, BBS, T, and DM scores are generally higher than 60% (no patient had low values in these functions). On the other hand, the function of the KH test was lower than 40% in all the patients.

Table 2 Assessment set: descriptive statistics

Table 3 Descriptive statistics of normalized measurements (Examination 1)

All of the composite tests showed good internal consistency (>0.75) (see Table 5).

Table 4 Assessment set: test–retest reliability, stability (changes without treatment), and changes after therapy

There were no significant changes without treatment in any of the tests or composite tests (see Table 4). Good test–retest reliability (>0.75) was obtained in seven of 12 tests (L-CLA, BBS, PRs, KH, NHPT, T25FW, and PASAT 3) and four composite tests (LLEF, RLEF, BI, TICF). The lowest ICC (0.39) was obtained for left upper extremity function.

Table 5 Internal consistency of composite assessments

All of the tests in the assessment set were sensitive to posttreatment changes: significant posttreatment improvement was observed in all tests in the battery except for L-CLA testing, where only a trend toward improvement was observed.

Correlations between normalized EDSS score and normalized clinical assessments are shown in Table 6. A greater number of patients would be needed to prove significance or to fit an optimal model of prediction of EDSS score (measured by a neurologist) by clinical assessment (measured by a therapist). The highest correlations were reached between normalized EDSS score and MI (0.60), NBBS (0.47), NHPT (0.46), and DD scores (0.44).

Table 6 Spearman correlations between normalized Expanded Disability Status Scale score and clinical assessments

Connections between assessments were estimated by Pearson correlations (see Figure 1). High correlations were found between NBBS and NMI scores (0.80) and NBBS and NT25FW scores (0.75), indicating that the battery could possibly be reduced. Generally, the correlation matrix and dendrogram of cluster analysis (see Figure 2) suggest that the proposed battery of tests is multidimensional and that it provides complex information on a patient’s clinical condition.
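As an illustration of how a correlation matrix feeds a cluster analysis, the sketch below computes Pearson correlations and converts them to pairwise dissimilarities. The text does not state which distance or linkage method was used for the dendrogram, so the 1 − r dissimilarity is an assumption, and the function names and data are hypothetical.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def dissimilarities(measures):
    """Pairwise 1 - r dissimilarities (assumed distance for clustering)."""
    names = list(measures)
    return {
        (a, b): 1 - pearson(measures[a], measures[b])
        for i, a in enumerate(names)
        for b in names[i + 1:]
    }


# Made-up normalized scores for four patients on three assessments.
scores = {
    "NBBS": [0.90, 0.80, 0.60, 0.70],
    "NMI": [0.95, 0.85, 0.70, 0.75],
    "NPASAT 3": [0.50, 0.90, 0.60, 0.40],
}
dist = dissimilarities(scores)
```

Pairs with a small dissimilarity, such as strongly correlated balance and strength scores, would merge early in a dendrogram and are the natural candidates when considering a shorter battery.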

Figure 1 Pearson correlations between assessments.

Abbreviations: NL-CLA, normalized low-contrast letter acuity; NMI, normalized Motricity Index; NMAS, normalized Modified Ashworth Scale; NBBS, normalized Berg Balance Scale; NT, normalized tremor; NDD, normalized dysdiadochokinesia; NDM, normalized dysmetria; NPRs, normalized postural reactions (righting, equilibrium, and protective reactions); NKH, normalized knee hyperextension; NNHPT, normalized Nine-Hole Peg Test; NT25FW, normalized Timed 25-Foot Walk; NPASAT 3, normalized 3-minute version of the Paced Auditory Serial Addition Test.

Figure 2 Dissimilarities between assessments (dendrogram of cluster analysis).

Abbreviations: NL-CLA, normalized low-contrast letter acuity; NMI, normalized Motricity Index; NMAS, normalized Modified Ashworth Scale; NBBS, normalized Berg Balance Scale; NT, normalized tremor; NDD, normalized dysdiadochokinesia; NDM, normalized dysmetria; NPRs, normalized postural reactions (righting, equilibrium, and protective reactions); NKH, normalized knee hyperextension; NNHPT, normalized Nine-Hole Peg Test; NT25FW, normalized Timed 25-Foot Walk; NPASAT 3, normalized 3-minute version of the Paced Auditory Serial Addition Test.

Discussion

Seventeen patients were enrolled at the beginning of the study. As the study took a relatively long period of time, only 12 subjects completed the study, with the rest dropping out for personal and/or health reasons. The sample size was relatively small but comparable with other studies assessing psychometric properties of clinical tests mentioned in the literature.Citation13,Citation15,Citation28–Citation31 Despite the small sample size, the results appear consistent. The gender distribution, type of MS, spectrum of disease duration, age, and range of EDSS score represent ambulatory MS patients in general.

The assessment set

The battery of tests was prepared with the aim of evaluating clinical functions connected with motor deficit in patients with MS and of sensitively detecting the types of changes connected with physiotherapy. Results of this study show that the chosen tests are sensitive to posttreatment changes. The authors are convinced that the assessment set may also be useful for detecting differences between therapies and their effects (Rasova K, unpublished data, 2012).

Recently it has been recommended that clinical practice in MS, including rehabilitation, should be based on the International Classification of Functioning, Disability and Health – a globally agreed upon framework and system for classifying the typical spectrum of problems in the functioning of people, given the environmental context in which they live.Citation32 Based on this model, there are many different domains that have to be measured and treated, and hence the authors assembled the proposed battery with 12 tests. Similarly, Paltamaa et al,Citation33 who also used the International Classification of Functioning, Disability and Health model, assembled a proposed battery with 12 tests, but of these 12 tests only the MAS and the BBS were the same as the tests selected by the present authors.

The most appropriate (standardized, quantitative, with minimal costs and special equipment, applicable in ambulatory practice, safe and feasible) generic and/or disease-specific measures were selected for inclusion in the assessment set from different domains of body function based on information available in the literature. The assessment set is multidimensional, in order to reflect the principal way in which MS affects clinical functions, and it provides mainly interval data. A skilled physiotherapist is able to perform the assessment set within 1 hour, which is the standard length of a physiotherapeutic examination paid for by health insurance. Patients were familiar with most of the tests other than the PASAT. The tests were not associated with an increase in negative events such as muscle pain or tiredness. The assessment set is no doubt demanding in its requirements for organization and time. In usual clinical practice the domains and measures are chosen according to what is considered important for the MS subject or what effect the therapy is intended to achieve (two to three primary and possibly two to three secondary outcomes are measured). On the other hand, the proposed assessment set provides the objective, systematic, and multidimensional information about clinical functions that is required for efficient physiotherapy.

Tests in the battery: how they were chosen and their psychometric properties

Among clinical measures evaluating visual functions, contrast letter acuity (Sloan charts) and contrast sensitivity (Pelli–Robson chart) demonstrate the greatest capacity to identify binocular visual dysfunction in MS. Sloan chart testing also captures unique aspects of neurologic dysfunction not captured by current EDSS or Multiple Sclerosis Functional Composite (MSFC) components.Citation34 For this reason, the authors selected L-CLA testing for inclusion in the assessment set. The authors conclude that the L-CLA score demonstrates good test–retest reliability (ICC: 0.82) and good internal consistency (Cronbach’s alpha: 0.90). However, from the proposed battery of tests, L-CLA testing was the assessment least sensitive to posttreatment changes. Baier et alCitation11 confirmed a very good concurrent and predictive validity in patients with relapsing-remitting and secondary progressive MS (correlated with the EDSS and the MSFC) that provides additional information relevant to the MS disease process.

Several tests have been developed to evaluate motor impairment: the Motor Club Assessment, the Northwick Park Motor Assessment, the Rivermead Motor Assessment, the Medical Research Council Scale, and the Motricity Index.Citation13 The authors selected the MI for inclusion in the assessment set because it is a simple and quick measure of the loss of voluntary motor power (general strength of movement at each joint of upper and lower extremities) that can also inform about general motor impairment of the extremities. For psychometric properties, a good to excellent criterion validity of lower extremities,Citation35 good upper extremity Pearson correlations with a handheld dynamometer, a good construct validity of upper extremities,Citation36 and a good interrater reliability and validity have been confirmed, although only in stroke patients.Citation13 In the present study, the MI demonstrated moderate test–retest reliability (ICC: 0.56) and good internal consistency (Cronbach’s alpha: 0.87).

Multiple biomechanical and electrophysiological methods for measuring muscle tone function have been developed (H-reflex testing, quantification of deep tendon reflexes and clonus, resonant frequency test, pendulum test, instrumented torque measurements during passive motion at present velocity, isokinetic dynamometry, and electromyography). Unfortunately, these methods have many limits – mainly that they need special equipment, differ in methodology, and are not accessible and administrable by clinicians.Citation37 It seems that for spasticity evaluation, clinical scales could be more useful. Twenty-four clinical scales that assess spasticity and/or related phenomena as well as ten scales for “active function” and three scales for “passive function” having an association with spasticity could be identified. For many scales, reliability data is missing.Citation38 However, the evaluation of spasticity is usually performed using the MAS, and this was the main reason why the authors selected this scale for inclusion in the assessment set. Nevertheless, there is not yet general accordance on the validity of this scale.Citation39 Some studies have reported the MAS to have a moderate to good interrater reliability,Citation15,Citation40 but most studies have reported poor reliability.Citation39,Citation40 Furthermore, poor intrarater agreement of the MAS has been confirmed.Citation33 In other studies, the intrarater reliability of the MAS was found to be either moderateCitation39 or good.Citation40 In the present study, the MAS demonstrated good internal consistency (Cronbach’s alpha: 0.78) but poor test–retest reliability (ICC: 0.49). Nevertheless, it is very sensitive to posttreatment changes (corrected P-value of <0.001).

Using the Ashworth Scale (AS) to evaluate spasticity is controversial because of its weak psychometric properties, as the relationship between spasticity and motor performance has not yet been confirmed. Furthermore, it is an ordinal scale that lacks sensitivity for detection of changes, and it uses constant speed for evaluation of spasticity; however, spasticity was defined as a velocity-dependent response to stretch. The AS evaluates not only spasticity but also passive resistance, that is, the intrinsic properties of muscle, tendon, and connective tissue.Citation37 Finally, the AS is not sensitive enough to detect changes in quality of life or functional outcomes.Citation38,Citation40

A variety of laboratory techniques and clinical scales have been proposed to evaluate balance,Citation16 but the instruments most commonly used in the clinical setting are clinical scales. Clinical scales provide insight for the planning of rehabilitation, are less expensive than laboratory techniques, do not require specific training of raters, and are easily applicable in the clinical setting. Of the clinical scales, the BBS, the Dynamic Gait Index, the Dizziness Handicap Inventory, the Timed Up and Go Test, the Ambulation Index, the Activities-Specific Balance Confidence Scale, the Functional Reach Test, and the Postural Stability Test have gained popularity within the clinical and scientific community for MS.Citation33,Citation42 The authors selected the BBS, which is the most frequently used test, for inclusion in the assessment set. The BBS was developed to measure balance among older people with impairment in balance function by assessing the performance of functional tasks.Citation16 The BBS was found to be a valid and reliable instrument in the elderly and post-stroke.Citation41 Psychometric properties of the BBS have also been evaluated for MS. The BBS shows a good concurrent validity (high specificity), poor discriminant validity (low sensitivity) that does not distinguish well between fallers and nonfallers,Citation42,Citation43 and good interrater (ICC: 0.96) and test–retest reliability (ICC: 0.96).Citation44 Results of the present study confirmed good test–retest reliability (ICC: 0.78) and also demonstrated very good internal consistency (Cronbach’s alpha: 0.94).

For evaluation of righting, equilibrium, and protective reactions, the scale for evaluation of PR described by Corriveau et alCitation20 has been used previously. The present authors selected this evaluation for inclusion in the proposed battery, although this protocol was specially prepared to evaluate therapeutic modality developed by BobathCitation44 and quantifiable patient progress in connection with this concept. The present authors are convinced that this protocol is well prepared to evaluate PRs (righting, equilibrium, and protective reactions). This evaluation was validated with the Brunnstrom Scale, the Fugl-Meyer Test, the Upper Extremity Functional Test, and the present pain intensity scale of the McGill Pain Questionnaire. The protocol is sensitive to motor recovery over time. Results of the present study confirmed very good psychometric properties: test–retest reliability (ICC: 0.96) and internal consistency (Cronbach’s alpha: 0.92).

To evaluate tremor, accelerometry has been used as the objective method of measurement, and clinical rating systems and patient self-assessments have been used as the subjective methods of measurement.Citation45 In MS, the Fahn’s Tremor Rating ScaleCitation46,Citation47 and the Tremor Rating Scale are most frequently used.Citation48,Citation49 The Fahn’s Tremor Rating Scale was used for evaluation of upper and lower extremities in the present study, as it has good psychometric properties: high interrater reliability for intention tremor (kappa: 0.65–0.74)Citation45,Citation48 and very good intrarater reliability.Citation49,Citation50 Results of the present study confirm only moderate test–retest reliability (ICC: 0.61) and internal consistency (Cronbach’s alpha: 0.74).

Alusi et alCitation22 described fair to moderate psychometric properties in assessing dysmetria (intrarater reliability: kappa, 0.35–0.45; interrater reliability: 0.40–0.59) and dysdiadochokinesia (intrarater reliability: kappa, 0.47–0.59; interrater reliability: 0.33–0.58). Results of the present study showed poor test–retest reliability (ICC: 0.40) but very good internal consistency (Cronbach’s alpha: 0.92) for DD, and likewise poor test–retest reliability (ICC: 0.47) but very good internal consistency (Cronbach’s alpha: 0.82) for DM.
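The two statistics reported throughout this section answer different questions, which is why a test can score well on one and poorly on the other. The following Python sketch is illustrative only. It computes Cronbach’s alpha across the items of a single session and an intraclass correlation across two sessions. This section does not state which ICC form was used, so the two-way random-effects, absolute-agreement, single-measurement form, ICC(2,1) in the notation of Shrout and Fleiss, is an assumption here, and all scores are invented.

```python
from statistics import mean, variance


def cronbach_alpha(scores):
    """Cronbach's alpha; rows are subjects, columns are items of one test."""
    k = len(scores[0])
    item_vars = [variance(col) for col in zip(*scores)]
    total_var = variance([sum(row) for row in scores])
    return k / (k - 1) * (1 - sum(item_vars) / total_var)


def icc_2_1(scores):
    """ICC(2,1); rows are subjects, columns are sessions (assumed ICC form)."""
    n, k = len(scores), len(scores[0])
    grand = mean(x for row in scores for x in row)
    row_means = [mean(row) for row in scores]
    col_means = [mean(col) for col in zip(*scores)]
    # Mean squares from a two-way ANOVA without replication.
    ms_rows = k * sum((m - grand) ** 2 for m in row_means) / (n - 1)
    ms_cols = n * sum((m - grand) ** 2 for m in col_means) / (k - 1)
    ss_err = sum(
        (scores[i][j] - row_means[i] - col_means[j] + grand) ** 2
        for i in range(n)
        for j in range(k)
    )
    ms_err = ss_err / ((n - 1) * (k - 1))
    return (ms_rows - ms_err) / (
        ms_rows + (k - 1) * ms_err + k * (ms_cols - ms_err) / n
    )


# Hypothetical data: three subjects, two items scored in one session.
items_one_session = [[1, 1], [2, 2], [3, 3]]
# Hypothetical data: the same three subjects tested twice, each scoring
# one point higher at retest.
test_retest = [[1, 2], [2, 3], [3, 4]]

alpha = cronbach_alpha(items_one_session)
icc = icc_2_1(test_retest)
```

With these invented scores the items agree perfectly within the session, so alpha is 1.0, while the uniform one-point shift between sessions lowers the absolute-agreement ICC to about 0.67 even though the ranking of subjects is unchanged.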

Genu recurvatum (knee extension greater than 5 degrees) is commonly found in clinical practice. It is a consequence of poor control over the knee joint due to muscle weakness, impaired muscle tone, and deficits in joint proprioception. Uncontrolled locking of the knee during ambulation causes recurrent microtrauma, which leads to degenerative changes and instability.Citation51 However, this is a problem of neurological diseases in general, and research has predominantly involved stroke patients.Citation52,Citation53 Knee extension can be evaluated using different kinds of instruments: a handheld goniometer, an electrogoniometer,Citation54 a gravity-based goniometer,Citation55 a fluid-based inclinometer,Citation56 a three-dimensional motion analysis system,Citation57 or a goniometer based on gait analysis.Citation57 The authors did not have a validated electrogoniometer or a three-dimensional motion analysis system (which would be able to accurately locate the center of knee joint rotation), but a handheld goniometer was available. The authors therefore created a KH test that is easy, quick, and targeted to knee function in standing and walking, the function that the treatment targets. Results confirmed very good test–retest reliability (ICC: 0.98) and good internal consistency (Cronbach’s alpha: 0.85).

To evaluate fine motor skills in MS, the NHPT, the Box and Blocks Test, and the Purdue Pegboard Test are used. The NHPT is the most frequently used, mainly as part of the MSFC. For this reason, the authors selected the NHPT for inclusion in the assessment set. The interrater reliability of the NHPT is high (ICC: 0.84–0.96) and so is its intrarater reliability (ICC: 0.91–0.99).Citation58 Cutter et alCitation59 described modest correlation between the 1-year change in the NHPT results and change in the EDSS score (r = 0.27). Also, in the present study, results for the NHPT demonstrated very high test–retest reliability (ICC: 0.88) and very high internal consistency (Cronbach’s alpha: 0.93).

Many tests that evaluate walking can be found in the literature. Some of these tests are aimed at the measurement of velocity (10-Meter Walk Test, T25FW test, and Timed Tandem Gait), some of them at walking distance (2- or 6-Minute Walk Test), and some of them at the quality of walking – these tests assess walking as part of complex movement with the aim to change body position (Timed Up and Go Test, Functional Gait Assessment, Dynamic Gait Index, Ambulation Index, Tinetti Assessment Tool – Gait, and Kela Coordination Test).Citation33 The authors selected the T25FW test for inclusion in the assessment set because it is the most frequently used of the tests in MS, mainly as part of the MSFC. Its psychometric properties are also very good, having high inter- and intrarater reliability.Citation60,Citation61 The change in the timed walk and the change in EDSS score showed a correlation of r = 0.41.Citation60 In the present study, the results indicated very high test–retest reliability (ICC: 0.95) and very high internal consistency (Cronbach’s alpha: 0.96).

To evaluate mental function, the PASAT (2- and 3-minute versions), the Symbol Digit Modalities Test, the Controlled Oral Word Association Test, and the Mental Fatigue Scale are used in MS. The most frequently used is the PASAT 3, as part of the MSFC. This is why the authors selected this test for the assessment set. Solari et alCitation58 reported high inter- (ICC: 0.9–0.97) and intrarater reliability (ICC: 0.94–0.98) for the PASAT. Similarly, Rosti-Otajärvi et alCitation60 confirmed very good intra- (0.75–0.96) and interrater reliability (0.68–0.95). The internal consistency of the PASAT is excellent (split-half reliability: 0.96).Citation5 Also, the results of the present study showed very high test–retest reliability (ICC: 0.92). In addition to the significant improvement in NPASAT after treatment (P = 0.04), there was also some improvement without therapy (mean: 0.07), but this was not significant after correction for multiple comparisons. This improvement is likely the result of a practice effect (patients’ increasing familiarity with the test). Cutter et alCitation59 also described this practice effect.
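A short sketch may help show how a correction for multiple comparisons can turn a nominally small P value into a nonsignificant one. This section does not name the correction that was applied. The reference list includes the false discovery rate procedure of Benjamini and Hochberg, so that procedure is assumed here. The P values below are invented for illustration and are not results of the study.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted P values in the original order."""
    m = len(p_values)
    # Indices of the P values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest P value down so adjusted values stay monotone.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Hypothetical raw P values from four comparisons (not study data).
raw = [0.01, 0.04, 0.03, 0.50]
adjusted = benjamini_hochberg(raw)
```

In this invented example the raw values 0.04 and 0.03 both adjust to about 0.053, so neither would pass a 0.05 threshold after correction, while the smallest value adjusts to 0.04 and still would.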

Six composite assessments

Many neurological rating scales have been suggested to assess the impact of MS on patients, but none has been universally accepted. The EDSS is based on neurological examination of eight functional systems, usually performed by a neurologist. While problems of standardization, sensitivity (mainly to arm and cognitive changes), reliability, and rater-to-rater variability have been documented, the EDSS remains a useful tool for classifying MS patients by disease severity and has been used extensively to assess disability and its changes in MS.Citation60

Whitaker et alCitation61 emphasized the necessity of developing a new clinical rating scale that would be multidimensional, to reflect the varied clinical expression of MS across patients and over time, and would be able to register changes over time. Based on analyses of pooled data from natural history studies and from placebo groups in clinical trials, the National Multiple Sclerosis Society’s Clinical Outcomes Assessment Task Force has recently proposed a new multidimensional clinical outcome measure, the MSFC.Citation62 The MSFC comprises the T25FW test, the NHPT, and the PASAT 3 as a multidimensional test. Scores on component measures are converted to standard scores (z-scores), which are averaged to form a single MSFC score. The MSFC (z-score) shows excellent intra- (0.97, 0.97, and 0.99 for the T25FW test, NHPT, and PASAT 3, respectively) and interrater (0.95, 0.96, and 1.0 for the T25FW test, NHPT, and PASAT 3, respectively) reliability,Citation58,Citation60,Citation62 and it also shows strong evidence of face validity as well as convergent and divergent validity with the EDSS. Further, changes in the MSFC correlate with change in the EDSS (the MSFC change predicted subsequent change in the EDSS).Citation58,Citation62 Even with the increased variability in the early testing sessions due to the practice effect, the MSFC demonstrated excellent reliability.Citation62 The MSFC is a very good composite for MS, but it is not optimal – for example, when there are too many variables of which only a few exhibit change, the average shows little change.Citation59,Citation63,Citation64
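The z-score averaging described above, and the dilution problem that follows from it, can be sketched in a few lines of Python. This is a simplified illustration and not the official MSFC scoring procedure: the reference means and standard deviations are invented, and flipping the sign of timed tests so that a higher z-score always means better function is an assumption of this sketch.

```python
def z_score(value, ref_mean, ref_sd, higher_is_better=True):
    """Standardize a raw score against a reference population."""
    z = (value - ref_mean) / ref_sd
    # For timed tests a longer time is worse, so the sign is flipped (assumed).
    return z if higher_is_better else -z


def composite(z_scores):
    """Average component z-scores into a single composite score."""
    return sum(z_scores) / len(z_scores)


# Hypothetical patient values and reference statistics (not study data).
z_walk = z_score(8.0, ref_mean=6.0, ref_sd=2.0, higher_is_better=False)
z_peg = z_score(25.0, ref_mean=20.0, ref_sd=5.0, higher_is_better=False)
z_pasat = z_score(45.0, ref_mean=40.0, ref_sd=10.0)
msfc_like = composite([z_walk, z_peg, z_pasat])

# Dilution: the same 1.5 SD change in one component moves a
# three-component composite twice as far as a six-component one.
change_three = composite([1.5, 0.0, 0.0])
change_six = composite([1.5, 0.0, 0.0, 0.0, 0.0, 0.0])
```

With these invented numbers the composite is -0.5, and the single-component change of 1.5 SD appears as 0.5 in the three-component average but only 0.25 in the six-component average.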

In this study, the authors prepared six composite tests that characterize clinical functions that are important in physiotherapy: normalized left (NLUEF) and right upper extremity function (NRUEF), normalized left (NLLEF) and right lower extremity function (NRLEF), BI, and TICF. The authors found weak test–retest reliability (ICC: 0.39) in NLUEF and moderate test–retest reliability (ICC: 0.68) in NRUEF. In other composite measures, the authors found good test–retest reliability. These composite tests evaluate clinical functions in a complex way. These indexes document well the function of each extremity (muscle tone, strength, coordination, functional ability) and balance (proactive and reactive balance reactions); the TICF is a mathematical expression of the actual status of the MS patient from the therapist’s point of view.

The EDSS is based on neurological examination of eight functional systems, usually performed by a neurologist. The proposed assessment set was created for the clinical practice of a physiotherapist. The power of this assessment set to predict the EDSS score should be verified in a further study on a larger sample of patients.

Conclusion

In this study, the following achievements were made:

  • Standard outcome measures were prepared for use and validated in the Czech Republic. Sensitivity to posttreatment changes, good test–retest reliability and internal consistency were confirmed.

  • The normalization of standard outcome measures was introduced and its importance for orientation in examination results was shown.

  • A battery of tests was proposed, comprising standard outcome measures and one test of the authors’ own, that objectively and systematically evaluates clinical features of MS treatable by physiotherapy.

  • Six composite tests that evaluate function of left and right upper and lower extremities (NLUEF, NRUEF, NLLEF, and NRLEF), balance (BI), and total function (TICF) were introduced.

Based on experience from clinical practice and research, the authors can conclude that this battery of tests and six composite tests is practical to use, is acceptable to patients, is capable of demonstrating effects of rehabilitation, and can be used with confidence to evaluate effects of physiotherapy in MS.

Acknowledgements

The authors would like to thank the Ministry of Health of the Czech Republic (1A/8628-5), the Ministry of Education, Youth and Sports of the Czech Republic (project 1M06014), RVO:67985807, a European Social Fund project (Involving Training Workplaces for Disabled People), the state budget of the Czech Republic, and the budget of the city of Prague for supporting this study.

Disclosure

The authors report no conflicts of interest in this work.

References

  • Thompson AJ. Symptomatic management and rehabilitation in multiple sclerosis. J Neurol Neurosurg Psychiatry. 2001;71 Suppl 2:ii22–ii27. PMID 11701781.
  • Henze T, Rieckmann P, Toyka KV; for Multiple Sclerosis Therapy Consensus Group of the German Multiple Sclerosis Society. Symptomatic treatment of multiple sclerosis. Eur Neurol. 2006;56(2):78–105. PMID 16966832.
  • Khan F, Turner-Stokes L, Ng L, Kilpatrick T. Multidisciplinary rehabilitation for adults with multiple sclerosis. Cochrane Database Syst Rev. 2007;(2):CD006036. PMID 17443610.
  • Thompson AJ. Neurorehabilitation in multiple sclerosis: foundations, facts and fiction. Curr Opin Neurol. 2005;18(3):267–271. PMID 15891410.
  • Thompson AJ. The effectiveness of neurological rehabilitation in multiple sclerosis. J Rehabil Res Dev. 2000;37(4):455–461. PMID 11028701.
  • Haigh R, Tennant A, Biering-Sørensen F. The use of outcome measures in physical medicine and rehabilitation within Europe. J Rehabil Med. 2001;33(6):273–278. PMID 11766957.
  • Kurtzke JF. Rating neurologic impairment in multiple sclerosis: an expanded disability status scale (EDSS). Neurology. 1983;33(11):1444–1452. PMID 6685237.
  • McDonald WI, Compston A, Edan G. Recommended diagnostic criteria for multiple sclerosis: guidelines from the International Panel on the Diagnosis of Multiple Sclerosis. Ann Neurol. 2001;50(1):121–127. PMID 11456302.
  • Oldfield RC. The assessment and analysis of handedness: the Edinburgh inventory. Neuropsychologia. 1971;9(1):97–113. PMID 5146491.
  • Management of substance abuse: process of translation and adaptation of instruments [web page on the Internet]. Geneva, Switzerland: World Health Organization; 2012. Available from: http://www.who.int/substance_abuse/research_tools/translation/en/index.html. Accessed March 15, 2011.
  • Baier ML, Cutter GR, Rudick RA. Low-contrast letter acuity testing captures visual dysfunction in patients with multiple sclerosis. Neurology. 2005;64(6):992–995. PMID 15781814.
  • Demeurisse G, Demol O, Robaye E. Motor evaluation in vascular hemiplegia. Eur Neurol. 1980;19(6):382–389. PMID 7439211.
  • Collin C, Wade D. Assessing motor impairment after stroke: a pilot reliability study. J Neurol Neurosurg Psychiatry. 1990;53(7):576–579. PMID 2391521.
  • Elovic E, Baerga E. Associated topics in physical medicine and rehabilitation: spasticity. In: Cuccurullo SJ, editor. Physical Medicine and Rehabilitation Board Review. New York: Demos Medical Publishing; 2004:743–750.
  • Bohannon RW, Smith MB. Interrater reliability of a modified Ashworth Scale of muscle spasticity. Phys Ther. 1987;67(2):206–207. PMID 3809245.
  • Berg K, Wood-Dauphinee S, Williams JI. The Balance Scale: reliability assessment with elderly residents and patients with an acute stroke. Scand J Rehabil Med. 1995;27(1):27–36.
  • Berg Balance Scale. American Academy of Health and Fitness. Available from: http://www.aahf.info/pdf/Berg_Balance_Scale.pdf. Accessed February 19, 2009.
  • Guarna F, Corriveau H, Chamberland J, Arsenault AB, Dutil E, Drouin G. An evaluation of the hemiplegic subject based on Bobath approach: Part I. The model. Scand J Rehabil Med. 1988;20(1):1–4.
  • Davies PM. Steps to Follow: A Guide to the Treatment of Adult Hemiplegia; Based on the Concept of K. and B. Bobath. Berlin: Springer-Verlag; 1993.
  • Corriveau H, Guarna F, Dutil E, Riley E, Arsenault AB, Drouin G. An evaluation of the hemiplegic subject based on the Bobath approach: Part II. The evaluation protocol. Scand J Rehabil Med. 1988;20(1):5–11. PMID 3413455.
  • Fahn S, Tolosa E, Marin C. Clinical rating scale for tremor. In: Jankovic J, Tolosa E, editors. Parkinson’s Disease and Movement Disorders. 2nd ed. Baltimore (MD): Williams and Wilkins; 1993:271–280.
  • Alusi SH, Worthington J, Glickman S, Findley LJ, Bain PG. Evaluation of three different ways of assessing tremor in multiple sclerosis. J Neurol Neurosurg Psychiatry. 2000;68(6):756–760. PMID 10811700.
  • Coulthard-Morris L. Clinical and rehabilitation outcome measures. In: Burks JS, Johnson KP, editors. Multiple Sclerosis: Diagnosis, Medical Management, and Rehabilitation. New York: Demos Medical Publishing; 2000:236–290.
  • Shrout PE, Fleiss JL. Intraclass correlations: uses in assessing rater reliability. Psychol Bull. 1979;86(2):420–428. PMID 18839484.
  • Benjamini Y, Hochberg Y. Controlling the false discovery rate: a practical and powerful approach to multiple testing. J R Stat Soc Ser B. 1995;57(1):289–300.
  • R Development Core Team. R: A Language and Environment for Statistical Computing. Vienna, Austria: R Foundation for Statistical Computing; 2012. Version 2.14.2 (2012-02-29). Available from: http://www.R-project.org/. Accessed March 4, 2012.
  • Revelle W. Package Psych: Procedures for Personality and Psychological Research. 1.0–92 ed. Evanston: Northwestern University; 2010. Available from: http://www.personality-project.org/r/book/psych.pdf. Accessed August 9, 2012.
  • Cameron D, Bohannon RW. Criterion validity of lower extremity Motricity Index scores. Clin Rehabil. 2000;14(2):208–211. PMID 10763800.
  • Gregson JM, Leathley MJ, Moore AP, Smith TL, Sharma AK, Watkins CL. Reliability of measurements of muscle tone and muscle power in stroke patients. Age Ageing. 2000;29(3):223–228. PMID 10855904.
  • Nuyens G, De Weerdt W, Ketelaer P. Inter-rater reliability of the Ashworth Scale in multiple sclerosis. Clin Rehabil. 1994;8(4):286–292.
  • Blum L, Korner-Bitensky N. Usefulness of the Berg Balance Scale in stroke rehabilitation: a systematic review. Phys Ther. 2008;88(5):559–566. PMID 18292215.
  • World Health Organization (WHO). International Classification of Functioning, Disability and Health: ICF. Geneva, Switzerland: WHO; 2001.
  • Paltamaa J, West H, Sarasoja T, Wikström J, Mälkiä E. Reliability of physical functioning measures in ambulatory subjects with MS. Physiother Res Int. 2005;10(2):93–109. PMID 16146327.
  • Balcer LJ, Baier ML, Pelak VS. New low-contrast vision charts: reliability and test characteristics in patients with multiple sclerosis. Mult Scler. 2000;6(3):163–171. PMID 10871827.
  • Bohannon RW, Andrews AW. Standards for judgments of unilateral impairments in muscle strength. Percept Mot Skills. 1999;89(3 Pt 1):878–880. PMID 10665019.
  • Damiano DL, Quinlivan JM, Owen BF, Payne P, Nelson KC, Abel MF. What does the Ashworth Scale really measure and are instrumented measures more valid and precise? Dev Med Child Neurol. 2002;44(2):112–118. PMID 11848107.
  • Platz T, Eickhof C, Nuyens G, Vuadens P. Clinical scales for the assessment of spasticity, associated phenomena, and function: a systematic review of the literature. Disabil Rehabil. 2005;27(1–2):7–18. PMID 15799141.
  • Dario A, Tomei G. Management of spasticity in multiple sclerosis by intrathecal baclofen. Acta Neurochir Suppl. 2007;97(Pt 1):189–192. PMID 17691376.
  • Ansari NN, Naghdi S, Moammeri H, Jalaie S. Ashworth scales are unreliable for the assessment of muscle spasticity. Physiother Theory Pract. 2006;22(3):119–125. PMID 16848350.
  • Ansari NN, Naghdi S, Arab TK, Jalaie S. The interrater and intra-rater reliability of the Modified Ashworth Scale in the assessment of muscle spasticity: limb and muscle group effect. NeuroRehabilitation. 2008;23(3):231–237. PMID 18560139.
  • Hobart JC, Riazi A, Thompson AJ. Getting the measure of spasticity in multiple sclerosis: the Multiple Sclerosis Spasticity Scale (MSSS-88). Brain. 2006;129(Pt 1):224–234. PMID 16280352.
  • Cattaneo D, Jonsdottir J, Repetti S. Reliability of four scales on balance disorders in persons with multiple sclerosis. Disabil Rehabil. 2007;29(24):1920–1925. PMID 17852286.
  • Cattaneo D, Regola A, Meotti M. Validity of six balance disorders scales in persons with multiple sclerosis. Disabil Rehabil. 2006;28(12):789–795. PMID 16754576.
  • Arsenault AB, Dutil E, Lambert J, Corriveau H, Guarna F, Drouin G. An evaluation of the hemiplegic subject based on the Bobath approach: Part III. A validation study. Scand J Rehabil Med. 1988;20(1):13–16. PMID 3413450.
  • Bain PG, Findley LJ, Atchison P. Assessing tremor severity. J Neurol Neurosurg Psychiatry. 1993;56(8):868–873. PMID 8350102.
  • Feys P, D’hooghe M, Nagels G, Helsen WF. The effect of levetiracetam on tremor severity and functionality in patients with multiple sclerosis. Mult Scler. 2009;15(3):371–378. PMID 19168602.
  • Feys PG, Davies-Smith A, Jones R. Intention tremor rated according to different finger-to-nose test protocols: a survey. Arch Phys Med Rehabil. 2003;84(1):79–82. PMID 12589625.
  • Plaha P, Khan S, Gill SS. Bilateral stimulation of the caudal zona incerta nucleus for tremor control. J Neurol Neurosurg Psychiatry. 2008;79(5):504–513. PMID 18037630.
  • Bryant JA, De Salles A, Cabatan C, Frysinger R, Behnke E, Bronstein J. The impact of thalamic stimulation on activities of daily living for essential tremor. Surg Neurol. 2003;59(6):479–484. PMID 12826348.
  • Hooper J, Taylor R, Pentland B, Whittle IR. Rater reliability of Fahn’s tremor rating scale in patients with multiple sclerosis. Arch Phys Med Rehabil. 1998;79(9):1076–1079. PMID 9749687.
  • Loudon JK, Goist HL, Loudon KL. Genu recurvatum syndrome. J Orthop Sports Phys Ther. 1998;27(5):361–367. PMID 9580896.
  • Basaglia N, Mazzini N, Boldrini P, Bacciglieri P, Contenti E, Ferraresi G. Biofeedback treatment of genu-recurvatum using an electrogoniometric device with an acoustic signal: one-year follow-up. Scand J Rehabil Med. 1989;21(3):125–130. PMID 2799310.
  • Trueblood PR, Walker JM, Perry J, Gronley JK. Pelvic exercise and gait in hemiplegia. Phys Ther. 1989;69(1):18–26. PMID 2911613.
  • Chao EY, Laughman RK, Schneider SS, Stauffer RN. Normative data of knee joint motion and ground reaction forces in adult level walking. J Biomech. 1983;16(3):219–233. PMID 6863337.
  • Ekstrand J, Wiktorsson M, Oberg B, Gillquist J. Lower extremity goniometric measurements: a study to determine their reliability. Arch Phys Med Rehabil. 1982;63(4):171–175. PMID 7082141.
  • Rheault W, Miller M, Nothnagel P, Straessle J, Urban D. Intertester reliability and concurrent validity of fluid-based and universal goniometers for active knee flexion. Phys Ther. 1988;68(11):1676–1678. PMID 3186793.
  • Pomeroy VM, Evans E, Richards JD. Agreement between an electrogoniometer and motion analysis system measuring angular velocity of the knee during walking after stroke. Physiotherapy. 2006;92(3):159–165.
  • Solari A, Radice D, Manneschi L, Motti L, Montanari E. The multiple sclerosis functional composite: different practice effects in the three test components. J Neurol Sci. 2005;228(1):71–74. PMID 15607213.
  • Cutter GR, Baier ML, Rudick RA. Development of a multiple sclerosis functional composite as a clinical trial outcome measure. Brain. 1999;122(Pt 5):871–882. PMID 10355672.
  • Rosti-Otajärvi E, Hämäläinen P, Koivisto K, Hokkanen L. The reliability of the MSFC and its components. Acta Neurol Scand. 2008;117(6):421–427. PMID 18081910.
  • Whitaker JN, McFarland HF, Rudge P, Reingold SC. Outcomes assessment in multiple sclerosis clinical trials: a critical analysis. Mult Scler. 1995;1(1):37–47. PMID 9345468.
  • Rudick RA, Cutter G, Baier M. Use of the Multiple Sclerosis Functional Composite to predict disability in relapsing MS. Neurology. 2001;56(10):1324–1330. PMID 11376182.
  • Cohen JA, Fischer JS, Bolibrush DM. Intrarater and interrater reliability of the MS functional composite outcome measure. Neurology. 2000;54(4):802–806. PMID 10690966.
  • Syndulko K, Ke D, Ellison GW, Baumhefner RW, Myers LW, Tourtellotte WW; for Multiple Sclerosis Study Group. Comparative evaluations of neuroperformance and clinical outcome assessments in chronic progressive multiple sclerosis: I. Reliability, validity and sensitivity to disease progression. Mult Scler. 1996;2(3):142–156. PMID 9345379.