Musculoskeletal: Original article

Assessing the burden of illness from cervical dystonia using the Toronto Western Spasmodic Torticollis Rating Scale scores and health utility: a meta-analysis of baseline patient-level clinical trial data

Pages 803-809 | Accepted 07 Aug 2014, Published online: 29 Aug 2014

Abstract

Purpose:

This study aimed to explore the burden of illness associated with cervical dystonia (CD), including possible demographic and humanistic correlates of baseline disease severity.

Methods:

The analysis involved the five multinational randomized, placebo-controlled clinical trials that had evaluated the efficacy and safety of Dysport® in patients with CD, including assessment using the Toronto Western Spasmodic Torticollis Rating Scale (TWSTRS). Patient-level TWSTRS scores from the individual studies were meta-analysed to estimate disease severity at baseline. One of the studies had reported Short Form-36 (SF-36) Health Survey quality-of-life measures, and these data were used to investigate whether the severity of CD was associated with humanistic outcomes, as measured by health utility. A generalized regression model was then applied to explore potential correlation between TWSTRS scores and utilities.

Results:

The estimated pooled mean baseline severity of CD in clinical trial entrants, as measured by TWSTRS score, was 43.23 (95% CI = 39.31–47.15). In general, disease severity was significantly greater in patients aged over 40 years (compared to the reference group aged 18–30 years). However, there was no correlation between disease severity and other demographic characteristics (e.g., weight, height, gender). Higher TWSTRS scores correlated with worse health-related quality of life as perceived by patients, and this was reflected in health utility (R² = 0.133).

Conclusions:

This study was able to define TWSTRS scores in patients with CD in terms of associated utility. This approach could help in capturing the disease’s burden through measures that are more tangible than TWSTRS scores to patients, carers, clinicians, and healthcare payers.

Introduction

Estimates suggest that ∼89 per million people worldwide have cervical dystonia (CD)Citation1, a condition characterized by sustained involuntary hyperkinetic movements of skeletal muscles in the neck. The problem often results in repetitious awkward and irregular posturesCitation2, and individuals who are left untreated may go on to develop permanent disability. So while CD is uncommon, its consequences represent a considerable burden to patients and wider society.

Of note, there is significant variability in the clinical presentation of CD, and poor recognition of this spectrum within the medical communityCitation3. Also, the functional consequences of the condition are poorly understood and difficult to assess clinically. Therefore, identification and grading of different types of CD depend primarily on the diagnostic skills of the assessing physician. Specifically, there are neither standard diagnostic algorithms nor objective functional parameters for CD. Moreover, none of the currently used rating scales has been rigorously tested for responsiveness to significant treatment-related changes in the clinical status of patients with the condition and, in any case, might be too complex for use in routine practiceCitation4,Citation5. This includes the Toronto Western Spasmodic Torticollis Rating Scale (TWSTRS)—a validated scaleCitation6–8 (range = 0–85) that takes account of both the functional features of CD (severity sub-scale range = 0–35) and features important to patients (e.g., disability and pain sub-scales, range = 0–30 and 0–20, respectively), but for which there is no agreed definition of how the scores relate to specific, clinically recognizable, levels of physical dysfunction (beyond higher scores generally representing poorer function).
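The structure of the scale described above can be illustrated with a short sketch. It assumes that the TWSTRS total is the plain sum of the three sub-scales, which is consistent with the stated ranges (35 + 30 + 20 = 85); the class and field names are hypothetical and are not taken from any trial database.

```python
from dataclasses import dataclass

# Sub-scale upper bounds as stated in the text
# (severity 0-35, disability 0-30, pain 0-20).
SUBSCALE_MAX = {"severity": 35, "disability": 30, "pain": 20}


@dataclass(frozen=True)
class TwstrsScore:
    """Hypothetical container for one patient's TWSTRS sub-scale scores."""

    severity: float
    disability: float
    pain: float

    def __post_init__(self):
        # Reject values outside the published sub-scale ranges.
        for name, upper in SUBSCALE_MAX.items():
            value = getattr(self, name)
            if not 0 <= value <= upper:
                raise ValueError(f"{name} must lie in 0-{upper}, got {value}")

    @property
    def total(self):
        # Assumption: total score is the sum of the sub-scales (range 0-85).
        return self.severity + self.disability + self.pain


# Illustrative values only, not patient data.
example = TwstrsScore(severity=20, disability=12, pain=9)
print(example.total)  # 41
```

A higher total indicates poorer function, but, as noted above, the sum by itself does not say which level of clinically recognizable dysfunction a given score represents.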

Collectively, such factors mean that there are key unknowns about the functional burden of illness in patients with CD and its inter-relationship with humanistic outcomes. Insights into these areas might come from analysis of baseline clinical and health-related quality-of-life (HRQoL) data for entrants of clinical trials of interventions for CD, particularly if such individuals were naïve to, or had not recently received, treatment that could have altered their disease severity. Such investigation also offers the potential advantage of embracing a significant sample of the total population with CD, given the condition’s rarity.

Against this background, the current study aimed to investigate the baseline burden of illness associated with CD, by (i) using meta-analysis of patient-level data on TWSTRS in entrants of clinical trials of abobotulinumtoxinA (Dysport®); and (ii) exploring possible demographic and humanistic correlates of baseline disease severity.

Methods

Study selection and participants

The analysis included five (out of a total of six) multi-national, multi-center, randomized, double-blind clinical trials at Phases II and III that had evaluated the efficacy and safety of abobotulinumtoxinA in patients with CD (n = 87–404), including assessment using the TWSTRS. The six trials were: Y-47-52120-051 (clinical registration number NCT00257660)Citation9; Y-97-52120-045Citation10; Y-97-52120-045bCitation8; Y-79-52120-131 (NCT00833196)Citation11; Y-47-52120-731 (NCT00288509)Citation7; and Y-97-52120-020Citation12. The sixth trial (Y-97-52120-020) did not use the TWSTRS. The included studies were considered particularly suitable for assessing baseline burden of illness associated with CD as they included only patients who were either naïve to botulinum toxin-A (BoNT-A) or had last been injected with it at least 12 weeks previously (and so had returned, at best, to their usual pre-treatment status by the time of enrolment).

Baseline disease severity and other patient characteristics

Patients in all five clinical trials had been assessed at baseline using TWSTRS scores as a measure of the severity of CD. Other key characteristics recorded for each participant included demographic factors (e.g., gender, age, and ethnicity); duration of disease; types of previous BoNT-A treatment; and relevant clinical patient profiles (including history of CD, other medical history, and previous use of BoNT-A).

One of the five studies (NCT00257660) had also reported Short Form-36 (SF-36) Health Survey HRQoL measures at baseline and week 8. These data were used to investigate potential correlations between humanistic measures and disease severity in CD, by applying the indirect method of health utility valuation. Specifically, this involved mapping the trial’s results for SF-36 (a general HRQoL instrument) to the utility algorithm of EuroQol-5 dimensions (EQ-5D) (a generic preference instrument)Citation13. The reliability and accuracy of any algorithm are influenced by how well it takes account of various characteristics of the patients whose data are being mapped, including setting (e.g., inpatients vs outpatients); medical conditions (e.g., different International Classification of Diseases version 10 (ICD-10) classifications); and the population norm (e.g., for the country, or countries, of interest). Mapping models described in the literature have tended to over-predict the extent to which particularly severe impairment of SF-36 HRQoL is reflected by the EQ-5DCitation14–18. Therefore, in this study we used an adapted version of the algorithm of Rowen et al.Citation15 for the mapping function because of the following: it was based on a large patient data-set; it had been generated from a random effects model; and, in comparison to other algorithms, it offered the most accurate predictions. The utility score in this algorithm is presented on a scale on which the upper boundary, 1.00, represents full HRQoL, and death is represented by 0.00.

Statistical analysis

Two analytical strategies were used in the study. First, meta-analytical models were used to summarize the baseline severity of CD as reflected by TWSTRS score; second, a generalized regression model was applied to explore possible demographic and humanistic correlates of baseline disease severity.

Meta-analytic approach

The meta-analytic approach involved applying the formalized quantitative synthesis method of the Cochrane CollaborationCitation19 to generate summary statistics for baseline TWSTRS total scores and scores for the disability, pain, and severity sub-scales, from the individual studies. Of note, this analysis had to address variation (or heterogeneity) in the data (within and between the individual trials) that could have influenced the results of these studies. This variation had several potential sources.

Within each of these trials, the data had been collected from multiple centers and, as a result, participants at any one center had more in common than those recruited at other centers. Therefore, any variation in the results within a treatment group reflects not only variation between individual patients, but also an overlying variation between centers (i.e., the data are hierarchical or ‘clustered’). This means that any differences in the results between centers cannot be assumed to relate solely to between-patient differencesCitation20. Similarly, there is clustering of data at the trial, as well as the center, level. This is because differences between the studies with regard to design (e.g., trial protocols) and execution (e.g., how rigidly protocols have been followed) will have influenced how representative their patient samples are of the target population for the planned analysis.

In view of this inevitable heterogeneity within and between the five trials (relating to differences between patients, centers, and trials), the DerSimonian and LairdCitation21 random effects model (D+L) was applied in conducting the meta-analysis. This model weighted the baseline mean TWSTRS score (total or sub-scale) from each trial by the inverse of the variance in the scores from that study; the weighted scores from the different studies were then combined by meta-analysis. Consequently, the more heterogeneous the TWSTRS scores within a particular study, the lower their weighting in generating the summary statistics. It is important to note, however, that the variance used to calculate the weighted TWSTRS score for a particular trial was determined wholly by the mean score, the variability of study results around this mean, and the study size. Therefore, the weight a study was given in the meta-analysis was determined solely by the mean TWSTRS score (i.e., the effect size) and the precision with which the study had estimated this variable. This means that the weighting had no direct relationship to what a particular study might have to offer in terms of, for example, methodological quality or clinical validity.
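The weighting scheme described above can be sketched as follows. This is a minimal illustration of the standard DerSimonian and Laird estimator applied to study-level means, not the authors' actual analysis code; the function name and the input values are hypothetical, and the variances supplied are assumed to be the squared standard errors of each trial's mean.

```python
import math


def dersimonian_laird(means, variances):
    """Pool study means with the DerSimonian-Laird random effects method.

    means     -- one mean score per study
    variances -- squared standard error of each study mean
    """
    k = len(means)
    if k < 2 or k != len(variances):
        raise ValueError("need at least two studies with matching inputs")

    # Fixed-effect (inverse-variance) weights and pooled mean.
    w = [1.0 / v for v in variances]
    sum_w = sum(w)
    fixed = sum(wi * yi for wi, yi in zip(w, means)) / sum_w

    # Cochran's Q and the method-of-moments between-study variance tau^2.
    q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, means))
    c = sum_w - sum(wi ** 2 for wi in w) / sum_w
    tau2 = max(0.0, (q - (k - 1)) / c)

    # Random-effects weights add tau^2 to each within-study variance.
    w_star = [1.0 / (v + tau2) for v in variances]
    sum_w_star = sum(w_star)
    pooled = sum(wi * yi for wi, yi in zip(w_star, means)) / sum_w_star
    se = math.sqrt(1.0 / sum_w_star)

    # I^2: share of total variation attributed to between-study heterogeneity.
    i2 = 100.0 * max(0.0, (q - (k - 1)) / q) if q > 0 else 0.0

    return {
        "pooled": pooled,
        "se": se,
        "ci95": (pooled - 1.96 * se, pooled + 1.96 * se),
        "tau2": tau2,
        "i2": i2,
    }


# Made-up inputs for illustration only; these are NOT the trial results.
result = dersimonian_laird(means=[41.0, 45.5, 43.0, 39.5, 46.0],
                           variances=[0.8, 1.1, 0.6, 0.5, 1.4])
print(result)
```

Because the between-study variance is added to every study's own variance, large heterogeneity pulls the weights toward equality, which is one reason the weighting reflects precision rather than methodological quality.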

In addition to the mean TWSTRS scores, individual-level data were also available for each of the trials, and this allowed the application of meta-regressionCitation22 to investigate whether there was an inter-relationship between baseline total TWSTRS score and various patient characteristics. Meta-regression can recognize the fact that patients are from different trials, and the ability to take account of trial-level covariates distinguishes this methodology from standard regression techniques.

Generalized linear regression

The planned analysis required a model to explore potential correlation between baseline TWSTRS scores and utilities. However, the data were not suitable for analysis using a standard linear model for various reasons. First, such a model would have assumed that the differences between patients’ directly measured utility and the utilities predicted by the model (the so-called ‘residuals’ from the model for utility) would follow a normal distribution, which is unlikely given that the distribution of utility in this study was left-skewed. A standard linear model would also have assumed that utility was ‘unbounded’ (potentially lying outside the range of 0–1). Also, it seemed unlikely that a linear relationship between TWSTRS score and utility score would accurately reflect the fact that, at the more severe end of the disease spectrum, progressive increases in TWSTRS score are likely to be associated with increasingly marginal changes in utility (a so-called ‘ceiling effect’). Another key issue is that a simple linear model would have assumed that any between-patient variance in the data was entirely a random phenomenon. In reality, however, there is inherent, unavoidable, non-random variance due to the way the data were collected. This is because there were repeated measurements for individual patients (i.e., TWSTRS and SF-36 scores were measured at both baseline and week 8 for each patient), and clearly the variation between data for an individual is unlikely to be the same as that between individuals (i.e., the data exhibit heterogeneity or non-constant variance). A standard linear regression model would have overlooked this complexity and so led to errors in describing the inter-relationship between TWSTRS and utility scores.

To address all of these issues, the current analysis used a generalized linear regression model (GLM), which has various characteristics that make it more generally applicable than a traditional linear model. In particular, this type of model can take account of the non-constant variance in the data described above. To do this, the GLM used a logit (logistic) function (which is based on a binomial distribution) to describe the relationship between the utility and TWSTRS. This allowed the relationship to be represented graphically as an S-shaped curve that, crucially, could not go beyond the limits of 0 and 1, the range for the utility scores.

Results

Demographic characteristics of patients with CD and descriptive analysis of TWSTRS total scores at baseline in each trial are presented in Table 1. Most of the patients were females aged 41–70 years. The forest plot in Figure 1 shows a pooled mean TWSTRS total score of 43.23 (95% confidence interval [CI] = 39.31–47.15) as estimated by the DerSimonian and Laird random effects model. The estimated pooled mean scores for the TWSTRS sub-scales of disability, pain, and severity were 12.37 (95% CI = 11.35–13.40), 9.37 (95% CI = 7.71–11.02), and 19.39 (95% CI = 18.77–20.01), respectively (see Table 2). Since the trial data had been collected from multiple centers within each study, they were, as expected, hierarchical in nature, as indicated by statistically significant variation of TWSTRS measures both between patients and between trials (attributed heterogeneity 97.4%, p < 0.00001).

Figure 1. Forest plot: burden of disease by means of TWSTRS in patients with CD. Note: D + L: DerSimonian and Laird random effects model and IPD: meta-regression using individual-patient data. Clinical trial study ID (registration number): Y-47-52120-051 (NCT00257660), Y-97-52120-045 (NA), Y-97-52120-045b (NA), Y-79-52120-131 (NCT00833196), Y-47-52120-731 (NCT00288509) and Y-97-52120-020 (NA); ClinicalTrials.gov. ES: estimate; CI: confidence interval.

Table 1. Summary statistics of TWSTRS total and sub-scale scores and demographic characteristics of each trial.

Table 2. Disease severities in terms of mean TWSTRS total and sub-scale scores as measured by DerSimonian and Laird method in patients with CD.

Meta-regression showed that disease severity varied significantly according to patients’ age, with those aged 41–50 years and 51–60 years having significantly higher TWSTRS scores (5.89 [p = 0.001] and 3.90 [p = 0.025], respectively) than did those aged 18–30 years (the reference group; Table 3). There was, however, no association between other demographic characteristics (e.g., weight, height, and gender) and TWSTRS score. The estimated TWSTRS score was 44.64 (95% CI = 44.52–44.75) for male patients aged 18–30 years of average weight (71.2 kg) and height (169 cm).

Table 3. Disease severities in terms of mean TWSTRS total associated with age groups measured by meta-regression in patients with CD.

A sensitivity analysis was carried out to explore the effect of excluding one of the trials (NCT00833196), as in this study (unlike the other four), entry requirements included having a TWSTRS severity sub-scale score of more than 15 (with no requirements being specified with regard to the other two sub-scales); so entrants may have had better physical function at baseline than those in the other trials. This analysis showed that the significant between-trial variation disappeared and the estimated TWSTRS score for the remaining four trials increased from 43.23 to 44.88 (95% CI = 44.1–45.6).

The results from the GLM showed that higher TWSTRS total scores were correlated with worse HRQoL as perceived by patients in terms of health utility (coefficient of determination [R²] = 0.133, p = 0.0001; see Figure 2). A 10-point improvement in TWSTRS score was linked to a 0.04-point increase in the strength of an individual’s preferences for HRQoL in CD on a utility scale of 0–1.

Figure 2. Mapping of TWSTRS score to observed health utility. Note: R², r-squared (coefficient of determination), a statistical measure of how well a regression line approximates real data points; an r-squared of 1.0 (100%) indicates a perfect fit. TWSTRS, Toronto Western Spasmodic Torticollis Rating Scale.

Conclusions

According to our study, the estimated pooled mean baseline severity of illness associated with CD in entrants of clinical trials of abobotulinumtoxinA was a TWSTRS score of 43.23 (95% CI = 39.31–47.15). Age was the main demographic factor related to the estimated mean TWSTRS score, with older age being associated with greater baseline disease severity. Also, the higher the TWSTRS score, the worse HRQoL as perceived by patients. Of note, sensitivity analysis showed that one of the five trials assessed (NCT00833196) contributed all of the between-trial variation in terms of the difference of mean TWSTRS score and that it was a very different study from the others included in the quantitative analysis (with its participants possibly having comparatively less physical dysfunction at baseline).

Our results need to be considered in light of our previous understanding of CD and its effects on patients. Other studies have also demonstrated that the condition has a considerable negative impact on HRQoLCitation23. Our demonstration of a correlation between age and baseline disease severity fits with previous suggestions that CD may be under-diagnosed or misdiagnosed (for more than 1 year)Citation4,Citation5,Citation24, in that patients might only be recognized when the disease has progressed and, thereby, become more clinically distinct. However, we found no evidence of a correlation between disease severity and gender.

Key strengths of our study include its use of patient-level data, which allowed quantification of the functional burden of CD at baseline across clinical trials using meta-analytical methods, controlling for between-study variation. Also, the combination of results from multiple studies increased the power of the analysis to detect small but clinically significant effects, as well as yielding more precise estimates of any effects identified.

Against the advantages of the current study must be set some limitations. One of these is the fact that our analysis was necessarily limited to studies for which we had access to patient-level data (i.e., trials of abobotulinumtoxinA) rather than all trials of botulinum toxin for cervical dystonia. Another possible concern is the reliability of the mapping approach employed in this study for SF-36 onto the EQ-5D. While this has proven to be consistent across healthcare settings and medical conditionsCitation15, it is possible that the technique could over-predict the severity of CD for more severe EQ-5D states. This possibility could not have been eliminated, however, regardless of the mapping functions chosen for the analysisCitation12. Another potential weakness of our study is that the relationship between utility and TWSTRS described by a GLM was based on data from a single trial, which limits the robustness of the correlation defined. In particular, the limited information in the trial report means that our analysis could not take account of all potentially important aspects of disease burden associated with CD. Finally, the trial populations may have been unrepresentative of patients with CD in general, so biasing the synthesis of the empirical data. On the other hand, however, the rarity of the disease would make it difficult to define and gather data on a more typical group of patients.

Overall, we believe that our findings could help in capturing the humanistic effects of the illness associated with CD in terms that are meaningful to patients, carers, and clinical professionals. TWSTRS score is widely used to measure disease severity, but, in reality, has limited usefulness for evaluating the burden of illness because, while a higher score reflects poorer physical function, there is no explicit definition of how specific scores relate to specific levels of dysfunctionCitation24. In this study, we were able to define TWSTRS scores in terms of associated measurements of utility. Our findings may, therefore, help healthcare professionals in assessing the severity of CD and being aware of how this correlates with HRQoL, thereby facilitating better targeting of therapeutic interventions to lessen the burden associated with the disease.

Transparency

Declaration of funding

The funding was provided by Ipsen Pharma SAS, Boulogne-Billancourt, France.

Declaration of financial/other relationships

HK, JD, and SG are employees of Ipsen, and WJ is a consultant for Ipsen and acted in this capacity during the course of the study. (He is also a consultant for Allergan and Merz.) RW and II are employees of Evidera, as was M-HJ when the study was conducted. Evidera was commissioned by Ipsen to undertake the research, including preparation of this manuscript. JME peer reviewers on this manuscript have no relevant financial or other relationships to disclose.

Notes

*Dysport® is a registered trademark of Ipsen Biopharm Ltd.

References

  • Nutt JG, Muenter MD, Aronson A, et al. Epidemiology of focal and generalized dystonia in Rochester, Minnesota. Mov Disord Official J Mov Disord Soc 1988;3:188-94
  • Steeves TD, Day L, Dykeman J, et al. The prevalence of primary dystonia: a systematic review and meta-analysis. Mov Disord Official J Mov Disord Soc 2012;27:1789-96
  • Van Zandijcke M. Cervical dystonia (spasmodic torticollis). Some aspects of the natural history. Acta Neurol Belg 1995;95:210-15
  • Jost WH, Hefter H, Stenner A, et al. Rating scales for cervical dystonia: a critical evaluation of tools for outcome assessment of botulinum toxin therapy. J Neural Transm 2013;120:487-96
  • Logroscino G, Livrea P, Anaclerio D, et al. Agreement among neurologists on the clinical diagnosis of dystonia at different body sites. J Neurol Neurosurg Psychiatry 2003;74:348-50
  • Consky ES, Lang AE. Clinical assessments of patients with cervical dystonia. In: Jankovic J, Hallett M, eds. Therapy with botulinum toxin. New York, NY: Marcel Dekker, 1994. pp. 211-38
  • Comella CL, Stebbins GT, Goetz CG, et al. Teaching tape for the motor section of the Toronto Western Spasmodic Torticollis Scale. Mov Disord Official J Mov Disord Soc 1997;12:570-5
  • Novak I, Campbell L, Boyce M, et al. Botulinum toxin assessment, intervention and aftercare for cervical dystonia and other causes of hypertonia of the neck: international consensus statement. Eur J Neurol Off J Eur Feder Neurol Soc 2010;17(2 Suppl):94-108
  • Truong D, Brodsky M, Lew M, et al. Long-term efficacy and safety of botulinum toxin type A (Dysport) in cervical dystonia. Parkinsonism Relat Disord 2010;16:316-23
  • Truong D, Duane DD, Jankovic J, et al. Efficacy and safety of botulinum type A toxin (Dysport) in cervical dystonia: results of the first US randomized, double-blind, placebo-controlled study. Mov Disord Official J Mov Disord Soc 2005;20:783-91
  • Misra VP, Ehler E, Zakine B, et al; group IIC. Factors influencing response to Botulinum toxin type A in patients with idiopathic cervical dystonia: results from an international observational study. BMJ Open 2012;2:e000881
  • Poewe W, Deuschl G, Nebe A, et al. What is the optimal dose of botulinum toxin A in the treatment of cervical dystonia? Results of a double blind, placebo controlled, dose ranging study using Dysport. German Dystonia Study Group. J Neurol Neurosurg Psychiatry 1998;64:13-17
  • National Institute for Health and Clinical Excellence (NICE). Guide to the methods of technology appraisal. London, UK: 2008. www.nice.org.uk/media/B52/A7/TAMethodsGuideUpdatedJune2008.pdf. Accessed May 23, 2013
  • Franks P, Lubetkin EI, Gold MR, et al. Mapping the SF-12 to the EuroQol EQ-5D Index in a national US sample. Med Decis Making Int J Soc Med Decis Making 2004;24:247-54
  • Rowen D, Brazier J, Roberts J. Mapping SF-36 onto the EQ-5D index: how reliable is the relationship? Health Qual Life Outcomes 2009;7:27
  • Sullivan PW, Ghushchyan V. Mapping the EQ-5D index from the SF-12: US general population preferences in a nationally representative sample. Med Decis Making Int J Soc Med Decis Making 2006;26:401-9
  • Powell JL. Least absolute deviations estimation for the censored regression model. J Econometr 1984;25:303-25
  • Gray AM, Rivero-Arias O, Clarke PM. Estimating the association between SF-12 responses and EQ-5D utility values by response mapping. Med Decis Making Int J Soc Med Decis Making 2006;26:18-29
  • Spiegelhalter DJ, Abrams KR, Myles JP. Bayesian approaches to clinical trials and health care evaluation. The Atrium, Southern Gate, Chichester, West Sussex, England: John Wiley & Sons Ltd, 2004
  • Rice N, Jones A. Multilevel models and health economics. Health Econ 1997;6:561-75
  • DerSimonian R, Kacker R. Random-effects model for meta-analysis of clinical trials: an update. Contemp Clin Trials 2007;28:105-14
  • Thompson SG, Higgins JP. How should meta-regression analyses be undertaken and interpreted? Stat Med 2002;21:1559-73
  • Coelho M, Valadas AF, Mestre T, et al. Pain and quality of life in the treatment of cervical dystonia. Eur Neurol Rev 2009;4:74-8
  • Molho ES, Agarwal N, Regan K, et al. Effect of cervical dystonia on employment: a retrospective analysis of the ability of treatment to restore premorbid employment status. Mov Disord Official J Mov Disord Soc 2009;24:1384-7
