Rheumatology

Cost-utility analyses of targeted immunomodulators in rheumatoid arthritis: systematic review

Pages 610-623 | Received 16 Oct 2019, Accepted 17 Jan 2020, Published online: 07 Feb 2020

Abstract

Aims: Cost-utility (CU) modeling is a common technique used to determine whether new treatments represent good value for money. As with any modeling exercise, findings are a direct result of methodology choices, which may vary widely. Several targeted immuno-modulators have been launched in recent years to treat moderate-to-severe rheumatoid arthritis (RA) which have been evaluated using CU methods. Our objectives were to identify common and innovative modeling choices in moderate-to-severe RA and to highlight their implications for future models in RA.

Materials and methods: A systematic literature search was conducted to identify CU models in moderate-to-severe RA published from January 2013 to June 2019. Studies must have included an active comparator and used quality-adjusted life-years (QALYs) as the common measure of effectiveness. Modeling methods were characterized by stakeholder perspective, simulation type, mapping between parameters, and data sources.

Results: Thirty-one published modeling studies were reviewed spanning 13 countries and 9 drugs, with common methodological choices and innovations observed among them. Over the evaluated time period, we observed common methods and assumptions that are becoming more prominent in the RA CU modeling landscape, including patient-level simulations, two-stage models combining trial results and real-world evidence, real-world treatment durations, long-term health consequences, and Health Assessment Questionnaire (HAQ)-related hospitalization costs. Models that consider the societal perspective are increasingly being developed as well.

Limitations: This review did not consider studies that did not report QALYs as a utility measure, models published only as conference abstracts, or cost-consequence models that did not report an incremental CU ratio.

Conclusions: CU modeling for RA increasingly reflects real-world conditions and patient experiences which are anticipated to provide better information in the assessment of health technologies. Future CU models in RA should consider applying the observed advances in modeling choices to optimize their CU predictions and simulation of real-world outcomes.


Introduction

The British statistician George Box is famous for his quote “All models are wrong, some are useful.”Citation1 His point was that a model’s applicability to the end user is more important than whether its exact answer is correct in every situation. Adherents to Box’s view might agree that the field of health economic modeling is no exception. Models that most accurately reflect the disease’s natural history, expected outcomes in the absence of intervention, and effects of alternative treatments are likely to be most reliable for decision making. This is particularly the case where the health status, treatments, and associated costs of a disease such as rheumatoid arthritis (RA) are systematically variable but quantifiable.

RA is a chronic, inflammatory, autoimmune disease that affects approximately 0.5% of the United States (US) population,Citation2 with wide variations in population characteristics, disease outcomes, and treatment response.Citation3,Citation4 The goal of treating individuals with RA is to induce clinical remission, thereby preventing further joint damage, minimizing disability, and improving quality of life. Treatments for RA include the use of disease-modifying anti-rheumatic drugs (DMARDs), classified as either conventional synthetic DMARDs (csDMARD) (methotrexate [MTX], leflunomide, sulfasalazine, azathioprine, cyclosporin and hydroxychloroquine) or targeted immuno-modulators (TIMs) (abatacept, adalimumab, baricitinib, certolizumab, etanercept, golimumab, infliximab, rituximab, tocilizumab, tofacitinib, and upadacitinib).Citation5 TIMs can be further classified as tumor necrosis factor (TNF) inhibitors, CD20-directed cytolytic antibodies, T-cell inhibitors, Interleukin-1 (IL-1) inhibitors, IL-6 inhibitors, or Janus Kinase (JAK) inhibitors. TIMs have previously been shown to significantly increase the quality of life in an RA population as compared to csDMARDs.Citation6

Upon regulatory approval, new drug treatments are commonly reviewed for their cost-utility (CU) in comparison to the standard of care in their therapeutic area. The ultimate goal of a CU review is to determine whether new treatments represent good value by comparing the clinical benefit of a new treatment to its cost relative to the standard of care, often via development and implementation of a quantitative CU model. Although CU models share fundamental similarities in their overall approach, important details can differ substantially in terms of their clinical and economic inputs, as well as their methods to arrive at a measure of value, usually expressed as the incremental cost to gain a quality adjusted life-year (QALY), which is typically compared against a country-specific value threshold or continuum of benchmarks. As a result, attempts to compare different CU analyses, even within the same therapeutic area, can be difficult due to these differences in methods.
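As a minimal sketch of the ratio described above, the incremental cost per QALY gained can be computed from each strategy's total costs and QALYs. The function name `icer`, the cost and QALY figures, and the threshold are hypothetical values chosen for illustration; they are not results from any reviewed model.

```python
def icer(cost_new, qaly_new, cost_ref, qaly_ref):
    """Incremental cost per QALY gained for a new treatment versus a reference."""
    delta_cost = cost_new - cost_ref
    delta_qaly = qaly_new - qaly_ref
    if delta_qaly == 0:
        raise ValueError("QALYs are equal; the ratio is undefined")
    return delta_cost / delta_qaly


# Hypothetical lifetime totals per patient (illustrative numbers only).
ratio = icer(cost_new=150_000.0, qaly_new=8.0, cost_ref=100_000.0, qaly_ref=7.5)

# Hypothetical value threshold per QALY; real thresholds are country-specific.
threshold = 150_000.0
print(ratio, ratio <= threshold)
```

A ratio at or below the threshold would conventionally be read as acceptable value. Cases in which the new treatment yields fewer QALYs than the reference need separate interpretation and are not handled in this sketch.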

Rheumatoid arthritis is a particularly challenging condition to model for a number of reasons. It affects a heterogeneous population with varying demographic and clinical characteristics, which are associated with distinct variability in treatment responses.Citation4 Achieving a treatment response tends to become more difficult with each successive treatment failure. Although clinical guidelines do exist for treatment sequence recommendations, real-world treatment switching tends to vary considerably across patients,Citation7 making it difficult to apply simplifying assumptions that are both realistic and generalizable to an overall RA population that is made up of the distinctly different subpopulations of csDMARD inadequate responders and TIM inadequate responders. Finally, most efficacy data from clinical trials are limited in relevance to specific lines of therapy and durations of active treatment, requiring methodological modifiers and assumptions to more accurately model treatment outcomes beyond a limited timeframe.

Joensuu et al. conducted a systematic review of CU models of biologics for RA in 2015 which covered 38 models published between 2002 and 2013.Citation8 A key finding from this review was that incremental CU ratios varied widely across studies of the same treatments and sequence (e.g. €73,772 to €230,698/QALY for csDMARD-IR (inadequate response) patients switching to infliximab and €46,132 to €174,388/QALY for csDMARD-IR patients switching to etanercept). The authors of this study noted that there were significant methodological differences across studies that made it difficult to compare their results. Since the focus of their review was on summarizing model outcomes, the methodologies of individual models were only described at a high level; however, the wide variation in results suggests that careful attention be given to the model methodologies in order for results to be appropriately interpreted. In 2018, Alemao et al. reviewed RA modeling publications to inform development of their own conceptual model, which was intended to influence future modeling techniques in RA.Citation9 Although they provided general findings from their literature review, the authors did not conduct a rigorous assessment of previous modeling practices in RA, and instead focused their paper on a description of their conceptual model.

With the launch of new RA treatments such as tofacitinib, baricitinib, upadacitinib, and sarilumab,Citation10 economic models have been developed and published to analyze their CU in the context of existing and other novel treatments,Citation11–17 with the majority of these developed by either manufacturers or health technology assessment (HTA) agencies. The introduction of new RA treatments, along with their CU models, warrants an updated literature review to identify advances in economic models which could be used to inform future CU modeling in RA. The goal of this study was two-fold: (1) to conduct a systematic review to describe the structures and approaches of published models in moderate-to-severe RA, and (2) based on the findings from the systematic literature review, to highlight their implications for future models in RA.

Methods

A systematic literature search was conducted using MEDLINE (PubMed) to identify CU models in moderate-to-severe RA published from January 2013 (i.e. the time since the Joensuu 2015 analysis) to June 2019 (i.e. the time of this manuscript’s development). The strategy for identifying publications was based on the search conducted by Joensuu 2015, and included search terms describing DMARD therapies, RA disease, and study type (CU analysis). The search terms implemented by Joensuu 2015 were updated to include interventions launched since the last study included in that review, as well as keywords relating to currently-available biosimilars. A comprehensive description of the full set of search terms used for this study can be found in Appendix 1. Additionally, targeted searches of the gray literature were conducted to identify the most prominent and relevant HTAs from agencies such as the National Institute for Health and Care Excellence (NICE) (UK), Canadian Agency for Drugs and Technologies in Health (CADTH) (Canada), Medical Services Advisory Committee (MSAC) (Australia), and Institute for Clinical and Economic Review (ICER) (US), as well as any additional models that were recommended by industry experts.

The literature search and review were conducted in accordance with PRISMA, MOOSE, and CHEERS guidelines for systematic literature reviews.Citation18–20 Studies were limited to those written in the English language (or with an available English translation) and with the availability of full text. Additional inclusion criteria required studies to have an active comparison (csDMARDs or other biologics) and QALYs as the measure of effectiveness. For each study meeting our selection criteria, detailed data were extracted regarding model structures, parameter sources, and calculation techniques, such as simulation type (Markov cohort, decision tree, patient level simulation), time horizon, cycle length, methods of calculating treatment efficacy, and how treatment efficacy translates to clinical measures, utility and economic outcomes. These findings were organized and assessed to determine commonalities in methodological choices and innovations as well as model assumptions and inputs.

A total of 1,437 abstracts were reviewed for inclusion in the analysis, of which 63 abstracts were reviewed in more detail, and 41 full texts were extracted. Studies excluded from the full text summarization included CU analyses of clinical trials or registry data that did not include a modeling component, treatment-specific or otherwise non-systematic reviews of CU models in RA, a CU model exploring dosing in biosimilar etanercept,Citation21 and a conceptual model published by Alemao et al.Citation9 In addition, 10 gray literature HTAs and a standalone model were reviewed and cross-referenced with the full texts to preclude overlap. Figure 1 provides a complete flowchart of the study selection methodology.

Figure 1. Description of article selection process. Abbreviation. HTA: health technology assessment.


The time period under evaluation was segmented into two periods, January 2013–December 2016 and January 2017–June 2019, to identify modeling methods in the post-Joensuu literature identification period. The end of 2016/beginning of 2017 was used as the threshold time point because: (1) it represents an approximate time midpoint, and (2) a number of therapies were approved, or were expected to be approved, in the post-January 2017 time period (e.g. sarilumab, approved May 2017; baricitinib, approved May 2018; upadacitinib, approved August 2019).

Results

Overview

Of the 31 models included in this review, 25 studies were published in peer-reviewed journalsCitation5,Citation14–17,Citation22–39, five of them were NICE technology appraisalsCitation11–13,Citation40,Citation41, and two models were identified from the gray literature.Citation42,Citation43 Table 1 provides an overview of each profiled publication. Although a review of one of the NICE appraisalsCitation11 was published in a peer-reviewed journal, for the purposes of this analysis, it is counted among the NICE HTAs rather than as a standalone modeling study.

Table 1. Summary table of profiled CU models in RA.

With regard to geography, the majority of models were conducted from the US or UK perspectives, including nine UK modelsCitation5,Citation11–13,Citation22–24,Citation40,Citation41 (includes five NICE HTAsCitation11–13,Citation40,Citation41) and nine US models.Citation14,Citation15,Citation25–29,Citation42,Citation43 Of the US models, five of them were sponsored by pharmaceutical companies, two of them by academic consortia, and one each was sponsored by an HTA body or an analytics company. Among the other countries, Greece,Citation16,Citation30 and South KoreaCitation31,Citation32 had two models each, while Colombia,Citation33 GermanyCitation34, Iran,Citation35 Italy,Citation36 the Netherlands,Citation37 Norway,Citation38 Serbia,Citation39 Spain,Citation17 and TaiwanCitation44 had one model each.

All models focused on evaluating the CU of one or more new or previously-introduced interventions, with four models of tocilizumab,Citation22,Citation29,Citation34,Citation38 four of tofacitinib,Citation13,Citation14,Citation16,Citation17 three each of sarilumab,Citation12,Citation15,Citation26 and certolizumab,Citation30,Citation39,Citation41 two of adalimumabCitation5,Citation33 and etanercept,Citation27,Citation31 and one each of abatacept,Citation41 baricitinib,Citation11 and rituximab.Citation35 Six studies evaluated specific sequences of TIMs,Citation23,Citation27,Citation28,Citation36,Citation41,Citation43 while four studies each evaluated multiple drugs within a class.

Among the studies reviewed, we observed common modeling methods, common model assumptions and input parameters, and other methodological innovations across published CU models, each of which is described in greater detail in separate sections below. Figure 2 describes changes over time for two of the common categories of methodological choices, namely model structure types and data sources for long-term treatment durations.

Figure 2. Observations in CU modeling for RA, 2013–2019. Abbreviations. CU: cost-utility; RA: rheumatoid arthritis; RWE: real-world evidence.


Common methodological choices

Model structure: simulation type

Cohort, patient-level simulation, and two-stage models are three frameworks available to the modeler, each using different methods to account for patients’ transitions between health states. Framework selection is therefore a key factor in estimating the effectiveness of a given treatment over time. Of the 31 models included in this systematic review, 20 (65%) included a patient-level simulation component, with 11 (35%) built with a single-stage structureCitation5,Citation14,Citation17,Citation22,Citation25,Citation29,Citation34–36,Citation41,Citation43 and 9 (29%) structured as two-stage simulation models.Citation11–13,Citation15,Citation23,Citation24,Citation26,Citation27,Citation40 The remaining 11 (35%) models were set up as Markov cohort models.Citation16,Citation27,Citation30–34,Citation37–39,Citation41,Citation42 A higher proportion of CU models published from January 2017 to June 2019 were patient-level simulation models (80%) compared to models published from January 2013 to December 2016 (50%), suggesting an increasing preference for this approach. One commonly-referenced patient-level simulation was the model used by NICE in its 2016 review of all RA therapies available at the time of analysis,Citation40 which since its publication has been adopted by researchers both affiliatedCitation24 and unaffiliatedCitation15,Citation27 with the authors of the NICE review. This model took a discrete event simulation approach without a fixed cycle length after the initial 6-month evaluation of treatment response, and applied real-world treatment durations and sequences to evaluate the treatment pathways experienced by patients.

One variation on the patient-level simulation has been the two-stage simulation model, which has been increasingly adopted in more recent modeling studies (8 of 15 from January 2017 to June 2019Citation11–13,Citation15,Citation23,Citation24,Citation26,Citation27 vs. 1 of 16 from January 2013 to December 2016Citation40). In these models, a short-term structure was used to estimate initial response to treatment, typically for the first 6 months of the model, with response rates sourced from clinical trial data. Subsequent transitions such as treatment switching were simulated via transition probabilities based on analyses from long-term registries such as the British Society for Rheumatology Biologics Registers (BSRBR) database,Citation11,Citation22,Citation24,Citation41 the Canadian RHUMADATA registry,Citation12,Citation15,Citation26 the Swiss Clinical Quality Management in Rheumatic Diseases (SCQM-RA) cohort,Citation22,Citation31 the NOR-DMARD database,Citation37 the Consortium of Rheumatology Researchers of North America (CORRONA) database,Citation43 or combined BSRBR and CORRONA data.Citation27
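The two-stage idea can be sketched in a few lines: a short-term stage draws the 6-month response from a trial-style response probability, and a long-term stage draws time on treatment from a registry-style discontinuation rate. This is an illustrative sketch only. The function name `simulate_patient`, the response probability, the discontinuation rate, the exponential time-to-discontinuation assumption, and the 30-year horizon are all hypothetical and are not taken from any reviewed model.

```python
import random


def simulate_patient(rng, p_response, annual_discontinuation_rate, horizon_years=30.0):
    """Return years on treatment for one simulated patient.

    Stage 1: response at 6 months is drawn from a trial-style probability.
    Stage 2: responders stay on treatment for an exponentially distributed
    time whose rate stands in for a registry-based estimate (an assumption).
    """
    initial_period = 0.5  # years; the 6-month response assessment
    if rng.random() >= p_response:
        return initial_period  # non-responder: switches after the assessment
    extra_years = rng.expovariate(annual_discontinuation_rate)
    return min(initial_period + extra_years, horizon_years)


rng = random.Random(2020)  # fixed seed so the run is reproducible
durations = [
    simulate_patient(rng, p_response=0.6, annual_discontinuation_rate=0.2)
    for _ in range(10_000)
]
mean_duration = sum(durations) / len(durations)
print(round(mean_duration, 2))
```

Keeping the two stages as separate inputs means the trial-derived response probability and the registry-derived discontinuation rate can be updated independently as new evidence becomes available.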

Long-term treatment duration

Seventeen published CU models applied clinical trial-derived treatment durations to all model cycles beyond the clinical trial duration (i.e. typically 6 months), thereby assuming that treatment response behaved the same whether in the short or long term.Citation5,Citation13,Citation14,Citation16,Citation17,Citation24,Citation25,Citation29,Citation30,Citation32–35,Citation39,Citation41,Citation42 A higher proportion of CU models published from January 2017 to June 2019 (53%) used real-world data, either alone or in combination with clinical trial data, to estimate transition probabilities compared to models published from January 2013 to December 2016 (25%), suggesting an increasing preference for this approach.

Common model assumptions and input parameters

Long-term treatment efficacy assumptions

As with long-term treatment duration, long-term treatment efficacy is not reliably sourced from short-term clinical trials. Because of the central role that the HAQ scale plays in translating patient outcomes and treatment efficacy into costs and quality of life, the majority of models simulated long-term treatment effect via improvements (decreases) or deterioration (increases) in HAQ score. The majority of models separated long-term HAQ changes by treatment category (separated out as biologics including JAK inhibitors, cDMARDs, and palliative care), as long-term registry studies have not shown significant differences between individual treatments.

Twenty-five models applied separate long-term HAQ adjustments for biologics and cDMARDs, with seven of them further separating palliative care from intensive cDMARD treatment. Of these 25 models, seventeen held HAQ constant for patients on biologics while applying a HAQ deterioration factor for cDMARDs and palliative care. Three models applied a HAQ improvement for biologics along with HAQ deterioration for cDMARDs, and two models assumed that all patients experienced HAQ deterioration, but that patients on biologics deteriorated more slowly than patients on cDMARDs. Three models were built with structures that linked HAQ transitions to short-term efficacy calculations. Of the six remaining models that did not apply long-term HAQ adjustments based on treatment choice, two assumed no improvement or deterioration (and therefore no change in HAQ) in the long term for all patients regardless of treatment, and four did not address long-term efficacy in any form. Over time, newer models tended to hold HAQ constant for bDMARDs while applying a HAQ deterioration factor for cDMARDs and palliative care, with 73% of the 15 CU models published from January 2017 to June 2019 taking this approach compared to 38% of the 16 models published from January 2013 to December 2016.
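The most common pattern described above (HAQ held constant on biologics, deterioration on cDMARDs and palliative care) can be illustrated with a simple linear trajectory. This is a sketch under stated assumptions: the annual change values, the linear form, and the name `haq_trajectory` are hypothetical placeholders rather than inputs from any reviewed model. The only fixed fact used is that the HAQ disability index ranges from 0 to 3.

```python
# Hypothetical annual HAQ changes by treatment category; the values are
# placeholders, not estimates from any reviewed model or registry.
ANNUAL_HAQ_CHANGE = {
    "biologic": 0.0,     # HAQ held constant on biologics
    "cDMARD": 0.045,     # assumed slow deterioration on cDMARDs
    "palliative": 0.06,  # assumed larger deterioration, for illustration only
}

HAQ_MIN, HAQ_MAX = 0.0, 3.0  # bounds of the HAQ disability index


def haq_trajectory(baseline_haq, category, years):
    """Return the HAQ score at the end of each year on a treatment category."""
    rate = ANNUAL_HAQ_CHANGE[category]
    return [
        min(HAQ_MAX, max(HAQ_MIN, baseline_haq + rate * year))
        for year in range(1, years + 1)
    ]


print(haq_trajectory(1.5, "biologic", 3))
print(haq_trajectory(1.5, "cDMARD", 3))
```

Clamping to the 0 to 3 range matters over lifetime horizons, where an unbounded linear deterioration would eventually exceed the scale.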

Costs of hospitalizations

All 31 models incorporated direct costs of treatment, which includes drug acquisition costs, administration costs for infused or injected treatments, and regular monitoring costs for treatments that required them. In addition to treatment-related costs, the majority of models incorporated the costs of hospitalizations due to RA-related complications. Twenty models explicitly modeled the cost of hospitalization due to RA as a distinct category of care cost, using calculations that incorporated HAQ score to link disease severity and need for hospitalization (and its associated costs). Thirteen of these models used HAQ-linked hospitalization days from real-world evidence (RWE) analyses, while six others applied regression models based on other registries that also incorporated patient demographics, and one study did not specify methodology. Of the remaining eleven models, six of them instead applied aggregate care costs linked to HAQ that did not explicitly include hospitalization, while five models did not include hospitalization costs beyond those linked to treatment administration. Over time, newer models tended to increasingly include HAQ-linked hospitalization costs, with 80% of the 15 CU models published from January 2017 to June 2019 including HAQ-linked hospitalization costs compared to 50% of the 16 models published from January 2013 to December 2016.
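One way to picture HAQ-linked hospitalization costs is a lookup from HAQ band to expected inpatient days per year, multiplied by a cost per day. The sketch below assumes six HAQ bands, and every number in it (band edges, days per year, cost per day) is a hypothetical placeholder rather than a value from the RWE analyses cited by the reviewed models.

```python
from bisect import bisect_right

# Band edges split the 0-3 HAQ scale into six lower-inclusive bands:
# [0, 0.5), [0.5, 1.0), [1.0, 1.5), [1.5, 2.0), [2.0, 2.5), [2.5, 3.0].
HAQ_BAND_EDGES = [0.5, 1.0, 1.5, 2.0, 2.5]

# Hypothetical expected hospital days per year in each band.
HOSPITAL_DAYS_PER_YEAR = [0.25, 0.5, 1.0, 2.0, 3.5, 5.0]

# Hypothetical cost of one inpatient day, in arbitrary currency units.
COST_PER_HOSPITAL_DAY = 1000.0


def annual_hospital_cost(haq):
    """Expected annual hospitalization cost for a patient with the given HAQ."""
    if not 0.0 <= haq <= 3.0:
        raise ValueError("HAQ must lie between 0 and 3")
    band = bisect_right(HAQ_BAND_EDGES, haq)
    return HOSPITAL_DAYS_PER_YEAR[band] * COST_PER_HOSPITAL_DAY


for score in (0.3, 1.2, 2.8):
    print(score, annual_hospital_cost(score))
```

A regression-based approach, as used by some of the models, would replace the lookup table with a fitted function of HAQ and patient demographics.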

Excess mortality due to RA

Although most studies associated RA with an increased mortality risk relative to general population mortality, there were differences in how such an increased risk was implemented. The most common approach, used by thirteen models, applied a multiplier of a constant value raised to the power of a patient’s HAQ score, most commonly 1.33^HAQ, which was used by twelve of these models, with the other one applying 1.46^HAQ. Seven models applied mortality multipliers linked to specific HAQ ranges; this approach was made popular by the NICE evaluation from Stevenson et al. in 2016 and was subsequently used by the later NICE evaluation and other studies inspired by them. Four models applied a constant risk ratio for all RA relative to background mortality, but did not differentiate between patients of differing disease severity. Of the remaining seven models, two used a combined approach to separately determine HAQ-related mortality at baseline and after treatment, one applied an exponential function of HAQ and age, three did not apply any disease-related mortality, and one study did not describe how it handled mortality. Over time, the most common approach of a constant HAQ multiplier remained common in both time periods evaluated, making up 40% of the 15 CU models published from January 2017 to June 2019 compared to 44% of the 16 models published from January 2013 to December 2016. In that earlier time period, only one study applied mortality multipliers linked to specific baseline HAQ ranges, namely the Stevenson et al. 2016 NICE evaluation, and the six other models that used this approach were published between January 2017 and June 2019.
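The constant-base multiplier can be sketched as follows. The review reports the multiplier itself (a constant raised to the power of HAQ, most commonly 1.33) but not how each model applied it. Applying it to the background hazard rate and converting back to an annual probability is therefore an assumption of this sketch, as are the function name and the example background probability.

```python
import math


def ra_mortality_probability(background_annual_prob, haq, base=1.33):
    """Annual death probability after scaling the background hazard by base**HAQ.

    Assumption: the multiplier acts on the hazard rate rather than directly
    on the probability, which keeps the result below 1 for any HAQ score.
    """
    if not 0.0 <= background_annual_prob < 1.0:
        raise ValueError("background probability must be in [0, 1)")
    hazard = -math.log(1.0 - background_annual_prob)
    return 1.0 - math.exp(-hazard * base ** haq)


# Hypothetical background annual mortality of 1% at three HAQ levels.
for score in (0.0, 1.5, 3.0):
    print(score, ra_mortality_probability(0.01, score))
```

At a HAQ of 0 the multiplier equals 1, so the function returns the background probability unchanged.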

Adverse events

There was a roughly even split between models that did and did not include adverse events (AEs) in their cost and utility calculations, with a total of seventeen models excluding AEs from the analysis. Of the remainder, 13 models included infections, either unspecified or focusing on pneumonia or tuberculosis, while one included treatment-related toxicity and stratified risks by age. The inclusion of AEs has been fairly constant over time, with 47% of the 15 CU models published from January 2017 to June 2019 including AEs compared to 44% of the 16 models published from January 2013 to December 2016.

Other methodological innovations

Mixture model utility calculations

In addition to the observations mentioned above, a number of new analytical approaches have been introduced in recent years. For instance, new methods have arisen to estimate health utilities in the RA patient population. CU models in RA typically estimate QALYs via mapping from the Health Assessment Questionnaire (HAQ) score, a functional status measure that is commonly used in RA treatment and can in turn be calculated from treatment response. Since HAQ results are the single endpoint used to define clinical effectiveness to calculate QALYs, complete and reliable methods to collect these data are crucial. Ideally, HAQ scores as measured directly from the populations of interest would be used for this calculation. However, the absence of HAQ data in some clinical trials has left modelers to estimate HAQ results from other available data such as ACR or EULAR scores. This introduces a source of potential error and variability. Fortunately, HAQ scores are being reported more consistently in recent clinical trials and in disease registries, which should minimize the need to estimate HAQ data from other endpoints.

Historically, the next step of mapping HAQ to QALYs used a linear or quadratic equation that, although straightforward to calculate, was simplistic in taking only HAQ as an input to generate a QALY value that, if measured directly via a patient survey, would require addressing multiple dimensions of wellbeing.Citation45 Although some studies also used patient demographics in their HAQ to QALY calculations,Citation14,Citation17,Citation31 they still did not bring in additional measures of patient well-being. Calculated QALYs were found to more accurately match reality if they incorporated pain measurements as well as HAQ scores, and using such an approach would improve the predictive value of a CU model that included it.Citation40
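To make the historical approach concrete, the sketch below maps HAQ to a utility with a linear and a quadratic equation and sums the yearly utilities into QALYs. All coefficients are hypothetical placeholders chosen only so that utility falls as HAQ rises; they are not the published mapping equations. The names `utility_linear`, `utility_quadratic`, and `total_qalys` are likewise illustrative.

```python
def utility_linear(haq, intercept=0.80, slope=-0.25):
    """Linear HAQ-to-utility mapping with hypothetical coefficients."""
    return intercept + slope * haq


def utility_quadratic(haq, a=0.80, b=-0.20, c=-0.02):
    """Quadratic HAQ-to-utility mapping with hypothetical coefficients."""
    return a + b * haq + c * haq ** 2


def total_qalys(haq_by_year, mapping, discount_rate=0.0):
    """Sum yearly utilities into QALYs, discounting year t by (1 + r)**t."""
    return sum(
        mapping(haq) / (1.0 + discount_rate) ** year
        for year, haq in enumerate(haq_by_year)
    )


haq_path = [1.0, 1.0, 1.5, 2.0]  # hypothetical HAQ score in each model year
print(total_qalys(haq_path, utility_linear))
print(total_qalys(haq_path, utility_quadratic, discount_rate=0.03))
```

Both mappings take HAQ as their only input, which is the limitation the paragraph above describes: two patients with the same HAQ but different pain levels receive identical utilities.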

Initially published in 2012, the Hernandez Alava et al. mixture model is designed with a goal of better handling the heterogeneity of EQ-5D health utility components typically used to calculate QALYs.Citation46 The Hernandez Alava et al. mixture model, which was piloted on a clinical trial data set of 467 patients randomized across four RA treatments and later validated with a large American databank with over 100,000 observations from over 16,000 patients,Citation47 has become commonly used in UK HTAs and other published models. Three of the five NICE HTAsCitation11,Citation13,Citation40 and six of the 26 other modelsCitation22,Citation23,Citation28,Citation29,Citation36,Citation43 that were identified in our review have applied the Hernandez Alava et al. model to calculate health utilities.

Societal perspective

Although most of the models (19) took a payer-only perspective,Citation11–17,Citation22,Citation23,Citation25,Citation26,Citation29–31,Citation34,Citation39–41 12 models took a societal perspective into account,Citation5,Citation24,Citation27,Citation28,Citation32,Citation33,Citation35–38,Citation42,Citation43 with eight of these 12 focusing only on productivity losses and no other indirect costs.Citation5,Citation24,Citation27,Citation28,Citation36,Citation37,Citation42,Citation43 One model each considered a combination of productivity losses and transportation burden,Citation38 assistive services and transportation burden, devices,Citation35 or an unspecified basket of indirect costs.Citation32

Discussion

Implications for future CU modeling in RA

We performed a systematic literature search of CU models in moderate-to-severe RA published from January 2013 to June 2019, and included 31 original articles in the current review. This paper observed a set of common methods and assumptions that are becoming more prominent in the RA CU modeling landscape, as well as some innovative approaches that stand out as possible elements for future models. Based on observations from the 31 models published in the past 6 years, it is our position that CU models of the future should incorporate the increasingly-common methodological choices observed in recent years, such as patient-level simulations, two-stage models combining trial results and RWE, real-world treatment durations, long-term HAQ degradation for cDMARD patients, HAQ-related hospitalization costs, and HAQ-stratified mortality calculations, as well as newer innovations in more accurate utility models and the societal perspective.

Although Markov models have been historically common in CU modeling thanks to their relative ease of implementation and calculation, the limitations of their simplifying assumptions have provided an opportunity for more advanced modeling methods to supersede them, especially in light of modern advances in data analysis and quantitative simulation. Because health states in Markov models can only represent homogeneous cohorts, there are limits to how accurately these models can simulate heterogeneous populations, which can bias the analysis.Citation4 This is an especially important consideration for RA, a disease affecting large and heterogeneous populations that experience periods of disease flare and relative quiescence, and that respond differently to individual treatments.Citation4 In practice, cohort models cannot fully account for this heterogeneity, even with separate subgroup or scenario analyses, and therefore take a crude approach to estimating the CU of RA treatments.

Patient-level simulations, as the name suggests, simulate diseases and treatments at the level of individual patients rather than at the cohort level. This ability to focus on the individual unit of simulation allows for greater flexibility in handling population heterogeneity, more accurately aligning a simulated population to a real-world population rather than applying simplifying assumptions to an average cohort as in a Markov model. With this increased flexibility comes a need for advanced computing techniques to ultimately implement the simulation.
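One mechanism by which averaging over a cohort can bias results is nonlinearity: when an outcome is a curved function of HAQ, evaluating it at the cohort's mean HAQ differs from averaging it over individual patients. The sketch below illustrates this with a mortality multiplier of the form 1.33 raised to the power of HAQ. The five patient HAQ scores are hypothetical, and the example shows only this single mechanism, not a full comparison of model types.

```python
# Hypothetical HAQ scores for five heterogeneous patients.
patient_haqs = [0.0, 0.5, 1.0, 2.0, 3.0]

MULTIPLIER_BASE = 1.33  # constant raised to the power of HAQ

# Cohort-style calculation: apply the function once, at the mean HAQ.
mean_haq = sum(patient_haqs) / len(patient_haqs)
cohort_multiplier = MULTIPLIER_BASE ** mean_haq

# Patient-level calculation: apply the function per patient, then average.
patient_level_multiplier = sum(
    MULTIPLIER_BASE ** haq for haq in patient_haqs
) / len(patient_haqs)

print(cohort_multiplier, patient_level_multiplier)
```

Because the function is convex in HAQ, the patient-level average is the larger of the two, and the gap widens as the spread of HAQ scores in the population increases.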

RA models that incorporated patient-level simulations noted several advantages to this approach. They include the ability to capture heterogeneous characteristics of the RA population, which manifest in demographic and treatment history differences across patients and lead to variability in treatment outcomes.Citation48 Their increasing adoption by NICE and other modeling entities has been considered a validation of their approach.Citation12,Citation27,Citation41 Finally, they have shown a superior ability to incorporate treatment history in calculations of outcomes, which is an important consideration given that treatment switching is a common practice in managing RA.Citation49 For the two-stage structure in particular, the primary advantage is the dual use of clinical trial results and RWE, such that long-term treatment effectiveness and treatment switching are not based on short-term clinical trial results derived from a controlled environment but rather appropriately sourced from longer-term RWE.

Given that most models tend to simulate the remaining lifetime of a patient or patient cohort, the assumption that patients will continue to respond to and stay on treatment based on short-term clinical trial results is an unrealistic one. Assigning treatment discontinuation probabilities from trials, which are likely to be lower than in the real world, allows simulated patients to remain on treatment and continue experiencing a treatment response for longer than is realistic; the impact of such an approach is to inflate avoided costs (i.e. via resources avoided) and health-related quality of life. As shown by the more frequent use of real-world treatment durations in newer models, modelers are becoming increasingly aware of the need to use long-term data to extrapolate long-term behavior.

Along with the move to separating short- and long-term treatment duration, there is a move to incorporate separate long-term treatment efficacy using long-term evidence rather than extrapolating from short-term clinical trial data. The RWE studies referenced by newer models have acknowledged differences in HAQ progression across treatment categories in the long run, with change in HAQ tapering off and eventually holding constant. Models in RA have reflected this understanding with a large proportion of models in the past 3 years applying different HAQ deterioration factors for biologics and cDMARDs, and centering around an approach where HAQ is held constant for patients on biologics, while patients on cDMARDs slowly deteriorate.
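
A minimal sketch of this class-specific progression assumption follows. HAQ is held constant for patients on biologics and increased by a fixed annual amount for patients on cDMARDs; the annual rate of 0.03 and all names are placeholders assumed for illustration, since the reviewed models differ in the exact factors they apply.

```python
HAQ_MIN, HAQ_MAX = 0.0, 3.0

# Assumed annual change in HAQ by treatment class (placeholder values).
ANNUAL_HAQ_CHANGE = {"biologic": 0.0, "cdmard": 0.03}


def haq_trajectory(baseline_haq, treatment_class, years):
    """Return yearly HAQ values, starting at baseline and bounded to [0, 3]."""
    rate = ANNUAL_HAQ_CHANGE[treatment_class]
    trajectory = [baseline_haq]
    for _ in range(years):
        next_haq = trajectory[-1] + rate
        trajectory.append(min(HAQ_MAX, max(HAQ_MIN, next_haq)))
    return trajectory
```

Under these assumptions, `haq_trajectory(1.5, "biologic", 10)` stays flat at 1.5, while `haq_trajectory(1.5, "cdmard", 10)` rises each year until it reaches the upper bound of the scale.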

Improvements in long-term tracking of treatment duration and efficacy have also been translated into outcome-linked tracking of costs and outcomes, as seen with increased tracking of hospitalization costs and mortality risks linked to HAQ score. In more recent models, the most common approach has been to apply RWE-sourced hospitalization costs based on HAQ ranges. Similarly, newer models have almost exclusively incorporated increased RA mortality as a function of HAQ score, with older approaches based on flat RA-based mortality risks or a lack of additional RA mortality risks falling by the wayside. With regard to hospitalization and mortality, although there is some variation in the specific algorithm chosen for each model, the underlying premise of linking costs and risks to HAQ has become the standard approach.
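
The general premise can be sketched as follows, assuming a lookup of annual hospitalization cost by HAQ band and a mortality hazard ratio applied per unit of HAQ. The band boundaries, cost values, and hazard ratio are hypothetical placeholders, and the specific algorithms in the reviewed models vary.

```python
import bisect
import math

# Upper bound of each HAQ band and an assumed annual cost for that band.
HAQ_BAND_UPPER = [0.5, 1.0, 1.5, 2.0, 2.5, 3.0]
ANNUAL_HOSP_COST = [200.0, 400.0, 800.0, 1500.0, 2500.0, 4000.0]  # hypothetical

# Assumed multiplicative increase in mortality hazard per HAQ unit.
HAZARD_RATIO_PER_HAQ_UNIT = 1.3  # hypothetical


def hospitalization_cost(haq):
    """Look up the annual hospitalization cost for a HAQ score in [0, 3]."""
    if not 0.0 <= haq <= 3.0:
        raise ValueError("HAQ must lie in [0, 3]")
    # bisect_left places a score equal to a band's upper bound inside that band.
    band = bisect.bisect_left(HAQ_BAND_UPPER, haq)
    return ANNUAL_HOSP_COST[band]


def annual_death_probability(baseline_prob, haq):
    """Scale a general-population death probability by a HAQ-based hazard ratio.

    The probability is converted to a rate, multiplied by the hazard
    ratio raised to the HAQ score, and converted back to a probability.
    """
    rate = -math.log(1.0 - baseline_prob)
    adjusted_rate = rate * HAZARD_RATIO_PER_HAQ_UNIT ** haq
    return 1.0 - math.exp(-adjusted_rate)
```

Converting to a rate before applying the hazard ratio keeps the adjusted probability below 1 even for high HAQ scores, which a direct multiplication of probabilities would not guarantee.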

The Hernandez Alava et al. mixture model has become increasingly common as a method to calculate patient utilities from HAQ measures, patient characteristics, and additional wellness indicators in RA models. Its authors argue that the HAQ-to-utility mapping equations used in previous models are too simplistic for accurate mapping, and that their model addresses those shortcomings by incorporating more patient characteristics, such as pain scores, that are major contributors to accurate QALY calculation. Studies that applied this mixture model note a number of additional factors that make it superior to previously published utility models. First, it is based on a large RWE data set drawn from real-world clinical practice rather than a more limited clinical trial data set restricted by its inclusion criteria. Second, it shows improved predictive value by allowing more of the EQ-5D components to be linked to mixture model inputs, which improves the model’s ability to accurately handle patient heterogeneity. Finally, it incorporates a pain score component, which has been shown to correlate significantly with EQ-5D.
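
To make the idea of a mixture-based mapping concrete, the sketch below shows a heavily simplified two-class version: each latent class has its own linear predictor of utility from HAQ and pain, and class membership probabilities also depend on HAQ and pain. This is not the published Hernandez Alava et al. specification. The number of classes, the covariates, and every coefficient are hypothetical, and the sketch ignores the error distributions and other features of the full model.

```python
import math

# Hypothetical per-class coefficients: (intercept, HAQ slope, pain slope).
# Pain is assumed to be measured on a 0-100 visual analogue scale.
CLASS_COEFS = [
    (0.90, -0.15, -0.002),  # class 0: milder impact on utility
    (0.60, -0.25, -0.004),  # class 1: more severe impact on utility
]

# Hypothetical logit coefficients for membership in class 1 versus class 0.
MEMBERSHIP_COEFS = (-2.0, 0.8, 0.02)


def class1_probability(haq, pain):
    """Logistic probability that a patient belongs to the more severe class."""
    intercept, b_haq, b_pain = MEMBERSHIP_COEFS
    z = intercept + b_haq * haq + b_pain * pain
    return 1.0 / (1.0 + math.exp(-z))


def expected_utility(haq, pain):
    """Probability-weighted average of the class-specific predicted utilities."""
    p1 = class1_probability(haq, pain)
    weights = (1.0 - p1, p1)
    total = 0.0
    for weight, (intercept, b_haq, b_pain) in zip(weights, CLASS_COEFS):
        class_mean = intercept + b_haq * haq + b_pain * pain
        total += weight * min(1.0, class_mean)  # utility cannot exceed full health
    return total
```

In this simplified form, two patients with the same HAQ but different pain scores, such as `expected_utility(1.5, 20)` and `expected_utility(1.5, 80)`, receive different utilities, which a HAQ-only mapping cannot distinguish.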

To illustrate the importance of selecting the correct health utility model, we used the IVI modelCitation43 to compare the Hernandez Alava et al. mixture model against the Wailoo et al. logistic regression model, an older mapping algorithmCitation50 that has also been used in some recent RA models.Citation42,Citation43 Keeping all other model assumptions and inputs at their default values, and comparing a targeted DMARD sequence against a csDMARD sequence, we found a 17% difference in incremental QALYs (3.19 with Hernandez Alava et al. vs 2.73 with Wailoo et al.). This results in incremental CU ratios of approximately $131,000/QALY with Hernandez Alava et al. and $152,000/QALY with Wailoo et al., suggesting that the choice of utility model can have a significant impact on CU outcomes. Our observations and recommendations on effectiveness are centered on health utility estimation, and thus CU analysis, because of the widespread use of QALYs among RA models and the ease with which methods could be compared across this model type. However, it is important to note that QALYs are not a clinical concept, but rather a health economic one, and that alternative measures of effectiveness and cost-effectiveness should be considered for future implementation based on the targeted stakeholder (e.g. alternate clinical outcomes relative to cost differences).
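
The arithmetic behind these figures can be checked with a short sketch. The incremental CU ratio is incremental cost divided by incremental QALYs; the incremental costs themselves are not reported in the text, so the values computed below are back-calculated from the rounded ratios and should be read as approximations. The function names are chosen for illustration.

```python
def relative_difference(value, reference):
    """Difference between two values expressed relative to the reference."""
    return (value - reference) / reference


def icer(incremental_cost, incremental_qalys):
    """Incremental cost-utility ratio: incremental cost per QALY gained."""
    if incremental_qalys == 0:
        raise ZeroDivisionError("incremental QALYs must be non-zero")
    return incremental_cost / incremental_qalys


def implied_incremental_cost(ratio, incremental_qalys):
    """Back-calculate incremental cost from a reported ratio and QALY gain."""
    return ratio * incremental_qalys


# Reported incremental QALYs for the two utility mappings.
qalys_mixture = 3.19   # Hernandez Alava et al.
qalys_logistic = 2.73  # Wailoo et al.

qaly_gap = relative_difference(qalys_mixture, qalys_logistic)

# Incremental costs implied by the rounded ratios (not reported in the text).
cost_mixture = implied_incremental_cost(131_000, qalys_mixture)
cost_logistic = implied_incremental_cost(152_000, qalys_logistic)
```

The two implied incremental costs agree to within about 1%, which is consistent with a change in utility mapping altering the QALY denominator while leaving the cost numerator essentially unchanged.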

The choice to include indirect costs is aligned with the recommendations of the Second Panel on Cost-Effectiveness in Health and Medicine, a group of experts in CU research who published a set of guidelines in 2016 for best practices in CU modeling.Citation51 In particular, the Second Panel stressed the importance of including a dual reference case incorporating the societal perspective that goes beyond the direct costs relevant to payers. The Second Panel also noted that indirect costs of caregiving and transportation should be included to provide a more complete picture of the burden of illness, in addition to the usual productivity losses that are commonly included in models with societal perspectives.

Comparisons to previous observations and future perspectives

The observations noted in this study provide a contrast to the findings of Joensuu et al.’s prior review of RA modeling, as a number of methodological aspects mentioned in their findings have been further refined in the years since then. More recently, a forward-looking modeling framework for RA from Alemao et al. described aspirational considerations for future RA models, and a number of their recommendations are aligned with our findings regarding methodological choices in recent models.

Although the Joensuu review was primarily focused on population choices and model results, the authors noted that methodological differences were a key obstacle to their efforts to compare results across CU analyses. In particular, they note an unmet need for a validated and standardized method of deriving utilities from HAQ and other disease-specific measures. In addition, the authors note that reliance on RCTs as the sole source of treatment efficacy is not realistic when applied to regular clinical practice, and that efficacy estimates based on observational data are more relevant to models that aim to provide a realistic simulation of real-world outcomes, which is similar to our observations. They also note that indirect costs, although valuable for providing a broader perspective on the potential functional benefits of biologics, are tied to the biases of the country whose perspective they are applied to, and therefore should be excluded when comparing CU analyses across countries, which is contrary to the recommendations of the Second Panel.

Alemao et al.Citation9 developed a conceptual framework for CU modeling in RA based on a systematic literature review of existing decision-analytic models as well as an analysis of an RA registry. Due to its conceptual nature, the Alemao model is described on a more abstract level than the fully implemented models, but it is closely aligned with common methodological choices in recent RA models. The proposed framework of the conceptual model centers on a patient-level simulation, as is increasingly common in RA models, while extending this approach to incorporate multiple subpopulations. It would rely on real-world evidence sources for treatment switching, much like many recent RA models, and also include switching between multiple doses of the same treatment, as occurs in real-world clinical practice. Finally, it would map utilities to multiple measures of disease activity and patient characteristics for maximum predictive accuracy in the vein of the Hernandez Alava et al. mixture model and beyond. Overall, the Alemao conceptual model is well-rooted in recent methodological choices in modeling, even if, as the authors acknowledge, it may be some time before data availability makes it possible for its more ambitious aspirations to be realized.

Study limitations

There were certain limitations with regard to our literature search and study identification. Our study did not consider CU models that did not report QALYs as an effectiveness measure, CU models that were published only as conference abstracts, or cost-consequence models that did not report an incremental CU ratio and instead only reported costs and consequences separately. These exclusions may have removed important economic models from consideration, along with the additional methodological choices and innovations they might have revealed. Differences in healthcare systems and cost structures limit the comparability of studies set in different countries, even when populations and treatments may be comparable. In addition, the differences in comparator selection and treatment sequencing among the studies prevented meaningful comparisons of model outputs across studies.

Conclusion

Since 2013, new treatments for RA have been accompanied by updated published economic models describing their comparative value within the treatment landscape. Developments in the state of the art of economic modeling include increased uptake of patient-level simulations to represent heterogeneous real-world populations and increased use of real-world treatment behavior based on treatment durations from registries. In addition, we have observed common application of HAQ-related hospitalization costs, long-term HAQ degradation for cDMARD patients, and HAQ-stratified mortality calculations; more accurate modeling of treatment outcomes using utility models that better fit real-world data and incorporate additional key covariates for improved predictive value; and consideration of indirect costs. These choices regarding model structure and assumptions should be considered when building or reviewing RA models of the future.

Future CU modeling efforts in RA should apply the aforementioned methodological choices to optimize their simulation of real-world outcomes and the likelihood of accurately predicting CU; a model that applies more outdated methodological choices risks diminishing the validity of its findings. As CU modeling becomes increasingly used by decision makers in the US as well as in countries with established HTA bodies, increased uptake of these approaches will also improve comparability across CU studies and thus lead to consistency across determinations of an intervention’s economic value.

Transparency

Declaration of funding

This study was funded by AbbVie.

Declaration of financial/other interests

Matthew Sussman, Charles Tao, and Joseph Menzin are employees of Boston Health Economics, LLC and were paid consultants in connection with the study. Pankaj Patel, Namita Tundia, and Jerry Clewell are employees and shareholders of the study sponsor.

JME peer reviewers on this manuscript have received an honorarium from JME for their review work, but have no other relevant financial relationships to disclose.

All authors had access to the data results, and participated in the development, review, and approval of this manuscript.

Author contributions

All authors were involved in the conception, design, analysis, and interpretation of the data. All authors were responsible for development, reviewing and revising the paper for intellectual content. All authors approved the final version to be published and agree to be accountable for all aspects of the work.

Acknowledgements

Medical writing assistance was provided by Nicholas Adair, MS of Boston Health Economics, LLC, and was funded by AbbVie.

References

  • Wasserstein R. George Box: a model statistician. Significance. 2010;7(3):134–135.
  • Hunter TM, Boytsov NN, Zhang X, et al. Prevalence of rheumatoid arthritis in the United States adult population in healthcare claims databases, 2004–2014. Rheumatol Int. 2017;37(9):1551–1557.
  • Dadoun S, Zeboulon-Ktorza N, Combescure C, et al. Mortality in rheumatoid arthritis over the last fifty years: systematic review and meta-analysis. Joint Bone Spine. 2013;80(1):29–33. Epub 2012 Mar 27. Review.
  • Abdel-Nasser AM, Rasker JJ, Valkenburg HA. Epidemiological and clinical aspects relating to the variability of rheumatoid arthritis. Semin Arthritis Rheum. 1997;27(2):123–140.
  • Diamantopoulos A, Finckh A, Huizinga T, et al. Tocilizumab in the treatment of rheumatoid arthritis: a cost-effectiveness analysis in the UK. Pharmacoeconomics. 2014;32(8):775–787.
  • Boyadzieva VV, Stoilov N, Stoilov RM, et al. Quality of life and cost study of rheumatoid arthritis therapy with biological medicines. Front Pharmacol. 2018;9:794.
  • Smolen JS, Aletaha D. Rheumatoid arthritis therapy reappraisal: strategies, opportunities and challenges. Nat Rev Rheumatol. 2015;11(5):276–289.
  • Joensuu JT, Huoponen S, Aaltonen KJ, et al. The cost-effectiveness of biologics for the treatment of rheumatoid arthritis: a systematic review. PLoS One. 2015;10(3):e0119683.
  • Alemao E, Al MJ, Boonen AA, et al. Conceptual model for the health technology assessment of current and novel interventions in rheumatoid arthritis. PLoS One. 2018;13(10):e0205013.
  • FDA Approved Drugs for Rheumatology | Centerwatch. [cited 2019 May 29]. Available from: https://www.centerwatch.com/drug-information/fda-approved-drugs/therapeutic-area/19/rheumatology.
  • National Institute for Health and Care Excellence 2017. Baricitinib for moderate to severe rheumatoid arthritis (NICE technology appraisal guidance TA 466). [cited 2019 May 29]. Available from: https://www.nice.org.uk/guidance/ta466/
  • National Institute for Health and Care Excellence 2017. Sarilumab for moderate to severe rheumatoid arthritis (NICE technology appraisal guidance TA 485). [cited 2019 May 29]. Available from: https://www.nice.org.uk/guidance/ta485/
  • National Institute for Health and Care Excellence 2017. Tofacitinib for moderate to severe rheumatoid arthritis (NICE technology appraisal guidance TA 480). [cited 2019 May 29]. Available from: https://www.nice.org.uk/guidance/ta480/
  • Carlson JJ, Ogale S, Dejonckheere F, et al. Economic evaluation of tocilizumab monotherapy compared to adalimumab monotherapy in the treatment of severe active rheumatoid arthritis. Value Health. 2015;18(2):173–179.
  • Jansen JP, Incerti D, Mutebi A, et al. Cost-effectiveness of sequenced treatment of rheumatoid arthritis with targeted immune modulators. J Med Econ. 2017;20(7):703–714.
  • Tzanetakos C, Tzioufas A, Goules A, et al. Cost-utility analysis of certolizumab pegol in combination with methotrexate in patients with moderate-to-severe active rheumatoid arthritis in Greece. Rheumatol Int. 2017;37(9):1441–1452.
  • Hidalgo-Vega Á, Villoro R, Blasco JA, et al. Cost-utility analysis of certolizumab pegol versus alternative tumour necrosis factor inhibitors available for the treatment of moderate-to-severe active rheumatoid arthritis in Spain. Cost Eff Resour Alloc. 2015;13(1):11.
  • Moher D, Liberati A, Tetzlaff J, The PRISMA Group, et al. Preferred Reporting Items for Systematic Reviews and Meta-Analyses: The PRISMA Statement. PLoS Med. 2009;6(7):e1000097.
  • Stroup DF, Berlin JA, Morton SC, et al. Meta-analysis of observational studies in epidemiology: a proposal for reporting. Meta-analysis of Observational Studies in Epidemiology (MOOSE) group. JAMA. 2000;283(15):2008–2012.
  • Husereau D, Drummond M, Petrou S, et al. Consolidated Health Economic Evaluation Reporting Standards (CHEERS) – explanation and elaboration: a report of the ISPOR Health Economic Evaluation Publication Guidelines Good Reporting Practices Task Force. Value Health. 2013;16(2):231–250.
  • Wu B, Song Y, Leng L, et al. Treatment of moderate rheumatoid arthritis with different strategies in a health resource-limited setting: a cost-effectiveness analysis in the era of biosimilars. Clin Exp Rheumatol. 2015;33(1):20–26.
  • Alemao E, Johal S, Al MJ, et al. Cost-effectiveness analysis of abatacept compared with adalimumab on background methotrexate in biologic-naive adult patients with rheumatoid arthritis and poor prognosis. Value Health. 2018;21(2):193–202.
  • Stephens S, Botteman MF, Cifaldi MA, et al. Modelling the cost-effectiveness of combination therapy for early, rapidly progressing rheumatoid arthritis by simulating the reversible and irreversible effects of the disease. BMJ Open. 2015;5(6):e006560.
  • Stevenson MD, Wailoo AJ, Tosh JC, et al. The cost-effectiveness of sequences of biological disease-modifying antirheumatic drug treatment in England for patients with rheumatoid arthritis who can tolerate methotrexate. J Rheumatol. 2017;44(7):973–980.
  • Bansback N, Phibbs CS, Sun H, et al. Triple therapy versus biologic therapy for active rheumatoid arthritis: a cost-effectiveness analysis. Ann Intern Med. 2017;167(1):8–16.
  • Claxton L, Taylor M, Gerber RA, et al. Modelling the cost-effectiveness of tofacitinib for the treatment of rheumatoid arthritis in the United States. Curr Med Res Opin. 2018;34(11):1991–2000.
  • Incerti D, Jansen JP. A description of the IVI-RA model. 2017 [updated October 2017]. Accessible from https://innovationvalueinitiative.github.io/IVI-RA/model-description/model-description.pdf
  • Jalal H, O’Dell JR, Bridges SL, Jr, et al. Cost-effectiveness of triple therapy versus etanercept plus methotrexate in early aggressive rheumatoid arthritis. Arthritis Care Res (Hoboken). 2016;68(12):1751–1757.
  • Muszbek N, Proudfoot C, Fournier M, et al. Economic evaluation of sarilumab in the treatment of adult patients with moderately-to-severely active rheumatoid arthritis who have an inadequate response to conventional synthetic disease-modifying antirheumatic drugs. Adv Ther. 2019;36(6):1337–1357.
  • Athanasakis K, Tarantilis F, Tsalapati K, et al. Cost-utility analysis of tocilizumab monotherapy in first line versus standard of care for the treatment of rheumatoid arthritis in Greece. Rheumatol Int. 2015;35(9):1489–1495. Epub 2015 Mar 21.
  • Lee MY, Park SK, Park SY, et al. Cost-effectiveness of tofacitinib in the treatment of moderate to severe rheumatoid arthritis in South Korea. Clin Ther. 2015;37(8):1662–1676.e2.
  • Park SK, Park SH, Lee MY, et al. Cost-effectiveness analysis of treatment sequence initiating with etanercept compared with leflunomide in rheumatoid arthritis: impact of reduced etanercept cost with patent expiration in South Korea. Clin Ther. 2016;38(11):2430–2446.e3.
  • Valle-Mercado C, Cubides MF, Parra-Torrado M, et al. Cost-effectiveness of biological therapy compared with methotrexate in the treatment for rheumatoid arthritis in Colombia. Rheumatol Int. 2013;33(12):2993–2997.
  • Gissel C, Götz G, Repp H. Cost-effectiveness of adalimumab for rheumatoid arthritis in Germany. Z Rheumatol. 2016;75(10):1006–1015.
  • Hashemi-Meshkini A, Nikfar S, Glaser E, et al. Cost-effectiveness analysis of tocilizumab in comparison with infliximab in Iranian rheumatoid arthritis patients with inadequate response to tDMARDs: a multistage Markov model. Value Health Reg Issues. 2016;9:42–48.
  • Quartuccio L, di Bidino R, Ruggeri M, et al. Cost-effectiveness analysis of two rituximab retreatment regimens for longstanding rheumatoid arthritis. Arthritis Care Res (Hoboken). 2015;67(7):947–955.
  • Tran-Duy A, Boonen A, Kievit W, et al. Modelling outcomes of complex treatment strategies following a clinical guideline for treatment decisions in patients with rheumatoid arthritis. Pharmacoeconomics. 2014;32(10):1015–1028.
  • Kvamme MK, Lie E, Uhlig T, et al. Cost-effectiveness of TNF inhibitors vs synthetic disease-modifying antirheumatic drugs in patients with rheumatoid arthritis: a Markov model study based on two longitudinal observational studies. Rheumatology (Oxford). 2015;54(7):1226–1235.
  • Kostić M, Jovanović S, Tomović M, et al. Cost-effectiveness analysis of tocilizumab in combination with methotrexate for rheumatoid arthritis: a Markov model based on data from Serbia, country in socioeconomic transition. Vojnosanit Pregl. 2014;71(2):144–148.
  • Stevenson M, Archer R, Tosh J, et al. Adalimumab, etanercept, infliximab, certolizumab pegol, golimumab, tocilizumab and abatacept for the treatment of rheumatoid arthritis not previously treated with disease-modifying antirheumatic drugs and after the failure of conventional disease-modifying antirheumatic drugs only: systematic review and economic evaluation. Health Technol Assess. 2016;20(35):1–610.
  • Bermejo I, Stevenson M, Archer R, et al. Certolizumab pegol for treating rheumatoid arthritis following inadequate response to a TNF-α inhibitor: an evidence review group perspective of a NICE single technology appraisal. Pharmacoeconomics. 2017;35(11):1141–1151.
  • Fournier M, Chen CI, Kuznik A, et al. Sarilumab monotherapy compared with adalimumab monotherapy for the treatment of moderately to severely active rheumatoid arthritis: an analysis of incremental cost per effectively treated patient. CEOR. 2019;11:117–128.
  • Rheumatoid Arthritis: Final Report – ICER. [updated 2017 April 7; cited 2019 May 29]. Accessible from: https://icer-review.org/material/ra-final-report/
  • Chen DY, Hsu PN, Tang CH, et al. Tofacitinib in the treatment of moderate-to-severe rheumatoid arthritis: a cost-effectiveness analysis compared with adalimumab in Taiwan. J Med Econ. 2019;13:1–11.
  • Shaw JW, Johnson JA, Coons SJ. US valuation of the EQ-5D health states: development and testing of the D1 valuation model. Med Care. 2005;43(3):203–220.
  • Hernández Alava M, Wailoo AJ, Ara R. Tails from the peak district: adjusted limited dependent variable mixture models of EQ-5D questionnaire health state utility values. Value Health. 2012;15(3):550–561.
  • Hernández Alava M, Wailoo AJ, Wolfe F, et al. The relationship between EQ-5D, HAQ and pain in patients with rheumatoid arthritis. Rheumatology. 2013;52(5):944–950.
  • Shafrin J, Hou N, Tebeka MG, et al. Economic burden of rheumatoid arthritis is higher for ACPA-positive patients. Paper presented at: 2016 American College of Rheumatology/Association of Rheumatology Health Professionals Annual Meeting; 2016 November 11–16, Abstract 2229; Washington, DC.
  • Wei W, Knapp K, Wang L, et al. Treatment persistence and clinical outcomes of tumor necrosis factor inhibitor cycling or switching to a new mechanism of action therapy: real-world observational study of rheumatoid arthritis patients in the United States with prior tumor necrosis factor inhibitor therapy. Adv Ther. 2017;34(8):1936–1952.
  • Wailoo AJ, Bansback N, Brennan A, et al. Biologic drugs for rheumatoid arthritis in the Medicare program: a cost-effectiveness analysis. Arthritis Rheum. 2008;58(4):939–946.
  • Sanders GD, Neumann PJ, Basu A, et al. Recommendations of the second panel on cost-effectiveness in health and medicine. JAMA. 2016;316(10):1093–1103.

Appendix 1. Search terms

Search terms based on Joensuu search (2019 additions in red)

(((adalimumab OR humira OR etanercept OR enbrel OR rituximab OR (rituxan) OR mabthera OR infliximab OR remicade OR anakinra OR kineret OR abatacept OR orencia OR tocilizumab OR roactemra OR actemra OR golimumab OR simponi OR certolizumab OR certolizumab pegol OR certolizumab pegol OR cimzia OR TNF OR tumor necrosis factor OR tumor necrosis factor OR tumor necrosis factor alpha OR tnf alpha OR tnfalpha OR anti-tnf OR anti tnf OR anti tumor necrosis factor OR anti tumor necrosis factor alpha OR antitumor necrosis factor OR tnf blocker OR tnf blocker OR tumor necrosis factor blocker OR tnf alpha blocker OR tumor necrosis factor alpha blocker OR biologics OR biological agent OR biologic agent OR interleukin 1 receptor antagonist protein OR interleukin 1 receptor antagonist OR monoclonal antibodies OR biological therapy OR tofacitinib OR xeljanz OR upadacitinib OR sarilumab OR kevzara OR filgotinib OR baricitinib OR olumiant OR biosimilar OR inflectra)) AND (rheumatic diseases OR arthritis, rheumatoid)) AND (((((cost-benefit)) OR (economic) OR (cost-effectiveness) OR (cost effectiveness) OR (cost effectiveness) OR (cost-utility) OR cost utility OR (cost utility) OR (quality-adjusted life years) OR (quality adjusted lifeyears) OR qaly OR (cost benefit analysis) OR (cost benefit analysis) OR (cost effectiveness analysis) OR (cost effectiveness analysis) OR (cost effectiveness analysis))) OR ((quality of life) OR (utility) OR (health related quality of life) OR (hrqol)))

Appendix 2. Primer on CU modeling and HTA

Cost-utility (CU) reviews by health technology assessment (HTA) organizations typically include a CU model, which is a mathematical simulation of the clinical and economic consequences of treating a disease. CU models are typically conducted to extrapolate clinical outcomes beyond the duration of clinical trials, to include relevant treatment comparators, and to evaluate economic costs associated with the corresponding clinical outcomes. Head-to-head clinical studies may be used when available, but in the absence of head-to-head trials, network meta-analytic techniques are often employed to estimate comparative clinical effectiveness.

Until recently, CU models have been primarily used by HTA organizations outside the US. HTA organizations affiliated with national health systems, such as the National Institute for Health and Care Excellence (NICE) in the United Kingdom (UK) and the Canadian Agency for Drugs and Technologies in Health (CADTH), routinely apply the results of in-house or externally-sourced CU models to inform their recommendations on health technology reimbursement decisions. NICE especially has considerable clout in defining the direction of CU methodology, and its methodological preferences can be seen in models published long after the first NICE evaluation that established them.

In recent years, the rising costs of healthcare have prompted greater interest in formal CU evaluations that can be used by US payers. Although the US does not have a formal nationalized HTA system in place, entities such as the Agency for Healthcare Research and Quality (AHRQ)’s Technology Assessment (TA) program, the Institute for Clinical and Economic Review (ICER), Blue Cross Blue Shield’s Evidence Street, and the Washington State Health Care Authority (WSHCA)’s HTA program are evolving to provide CU analyses for payers and pharmacy benefit managers (PBMs). Due to the influence CU models have on access decisions affecting entire populations, it is important to understand how these models are constructed and how differences in their design may impact results.