
Predictive validation of modeled health technology assessment claims: lessons from NICE

Pages 1007-1012 | Accepted 02 Oct 2015, Published online: 07 Nov 2015

Abstract

The use of cost-effectiveness modeling to prioritize healthcare spending has become a key foundation of UK government policy. Although the preferred method of evaluation, cost-utility analysis, is not without its critics, it represents a standard approach that can arguably be used to assess relative value for money across a range of disease types and interventions. A key limitation of economic modeling, however, is that its conclusions hinge on the input assumptions, many of which are derived from randomized controlled trials or meta-analyses that cannot be reliably linked to the real-world performance of treatments in a broader clinical context. This means that spending decisions are frequently based on artificial constructs that may project costs and benefits significantly at odds with those achievable in reality. There is a clear agenda to carry out some form of predictive validation of the modeled claims, in order to assess not only whether the spending decisions made can be justified post hoc, but also to ensure that budgetary expenditure continues to be allocated in the most rational way. To date, however, no timely, effective system to carry out this testing has been implemented, with the consequence that there is little objective evidence as to whether the prioritization decisions made are actually living up to expectations. This article reviews two unfulfilled initiatives undertaken in the UK over the past 20 years, each of which had the potential to address this objective, and considers why they failed to deliver the expected outcomes.

Introduction

Elsewhere in this issue of the Journal of Medical Economics, Langley [1] eloquently argues for a move towards formal predictive validation of modeled health claims to prioritize drug expenditure. Langley [1] rightly points out the acknowledged limitations of the modeling approach and that, in consequence, strategy decisions may be made on the basis of predictive claims that are difficult to test within a meaningful timeframe. This issue goes right back to the emergence of health economics as a practical tool for healthcare management in the late 1980s and early 1990s. Although at that time our attempts at implementing decisions based on best value for money were relatively crude, the issue was apparent even then. There was no feedback to decision-makers within health systems, and little if any thought appeared to have been given to how claims for clinical effectiveness and consequent resource utilization could be tracked and validated.

Managing healthcare budgets

In the mid-1990s, I had responsibility for the prescribing budget of a health district within the UK National Health Service (NHS) serving a population of around 250,000. Faced with the emergence of treatments such as ACE-inhibitors and statins, which had convincing evidence of clinical benefit, we attempted to consider their costs and benefits at a crude level. Based on an assessment of the drug cost associated with implementing widespread secondary prevention strategies using ACE-inhibitors and statins, offset against reductions in admissions for coronary events, we drew the general conclusion that the cost-benefit ratio was likely to be favorable.

However, the means by which hospital care was funded was insensitive to admission levels and other activity levels, being simply predicated on the population size served. A reduction, therefore, in the number of patients admitted to coronary care did not yield a reduction in budget supply to that hospital, but we accepted at an intellectual level that the treatment was nonetheless cost-effective.

The National Institute for Health and Care Excellence (NICE) reference case

More recently, the tools available to us to model claims as part of an economic appraisal have become outwardly more sophisticated, and our perceived ability to translate savings on paper into true reductions in spend has, on the face of it, been embraced by a growing audience. Whether modeled claims, in particular those developed under the NICE Reference Case, provide the appropriate framework for developing and validating claims is open to question [2]. The NICE reference case is, by virtue of its relatively inflexible specification, open to the accusation that it serves simply to generate a go/no-go decision based on an arbitrary willingness-to-pay threshold. Although its design is sophisticated in an academic health economic context, one must question the validity of its predictive content. Modeling is inevitably speculative rather than definitive and, in the absence of routine post-hoc validation, its conclusions are subject to a high level of uncertainty. Of course, sensitivity analyses allow one to explore the potential impact of varying the input parameters and thereby identify key areas of uncertainty, but in the absence of any reliable external validation we are left somewhat in the dark as to which, if any, of the variables are in fact unreliable. Despite these limitations, which any practicing health economist would acknowledge, our mindset in relation to modeling tends to be uncritical, with a general acceptance of modeled claims even if the real-world benefits are uncertain, unconfirmed, or unachievable.

Consequences of the modeling paradigm

The adoption of such a ‘tick-the-box’ approach to health economics is superficially rigorous and consequently attractive from a pragmatic standpoint. Furthermore, it has allowed health economics to blossom from a minority academic pursuit into a flourishing business sector. The creation of the increasingly complex models demanded by health technology appraisal (HTA) groups, supported by organizations such as the International Society for Pharmacoeconomics and Outcomes Research (ISPOR), now requires a major investment of time and money for a company wishing to introduce a new treatment to the NHS. Whether this investment has led to more meaningful conclusions is a moot point. To a considerable extent, we in the UK have become a profession that feeds the beast of HTA but has abandoned any attempt to explore the true microeconomics of healthcare. That is not to say that healthcare purchasers and policy-makers would not like to adopt a more robust and reproducible approach to economic analysis, but rather that a combination of tradition, pragmatism and a lack of sufficient incentives has discouraged anything other than very tentative advances down this path.

Stillborn health intervention initiatives

It is instructive to consider what may usefully be described as ‘stillborn initiatives’ in healthcare decision-making that could perhaps have led us to a more validated approach to health economics. The focus in this article is on the NHS, because that is the health system with which I am most familiar. I have no doubt that examples can be found in many other health systems, notably those that have taken a highly prescriptive approach to evaluating cost-outcomes claims.

The UK NHS arguably has the longest experience of integrating evidence-based economic models into its prescribing policies. Although NICE is now recognized globally as one of the few national bodies offering a comprehensive assessment of the cost-effectiveness of new therapies, it remains a work in progress that builds on initiatives started in the early 1990s. As information technology systems were developed over the course of that decade to monitor drug expenditure and hospital activity more accurately than had previously been possible, a political agenda formed to formalize priority setting for the state-funded NHS, at least in the context of prescribed drug therapies. The objective was to identify guideline mechanisms by which limited resources could be targeted more appropriately at those treatments offering the greatest value for money.

The first attempt to implement this strategy occurred some 5 years prior to the formation of NICE, when the introduction of β-interferon as a disease-modifying agent in the management of multiple sclerosis posed a significant threat to prescribing budgets. A clinical trial of interferon-β-1b demonstrated a reduction in the number of exacerbations in the sub-set of patients with relapsing-remitting multiple sclerosis [3]. Although issues of study design and data analysis cast some doubt on the validity of the conclusions of this early study, the drug was nonetheless licensed, and a strategy to deal with its prescription within the NHS became a necessity. Epidemiological studies suggested that there were potentially 40,000 patients in the UK who could be initiated on treatment at a cost of ∼£10,000 per patient per year. There was considerable concern that treatment of all potential patients would consume as much as 10% of the entire drug bill. At the time, no cost-benefit analysis had been carried out and NHS mechanisms to carry out an appraisal did not yet exist. The response was an Executive Letter issued to the entire NHS, which mandated that only nominated neurological centers would be eligible to prescribe the treatment and that patient response to treatment should be prospectively documented in detail [4]. It was anticipated that, as this patient registry evolved, it would be possible to compare actual clinical progress against the results seen in the clinical trial, thereby enabling a real-world cost-benefit model to be generated. It was intended that this would then form the basis for a broader position for this drug, and others like it, within the NHS budget.
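The scale of the budget concern quoted above can be checked with back-of-envelope arithmetic. A minimal sketch: the patient numbers and per-patient cost are the figures quoted in the text, but the total NHS drug-bill figure is an illustrative assumption chosen only to be consistent with the "as much as 10%" concern, not a sourced number.

```python
# Back-of-envelope budget impact for beta-interferon, using the figures
# quoted in the text. The GBP 4bn total drug bill is an ILLUSTRATIVE
# assumption consistent with the "as much as 10%" concern, not a sourced figure.
eligible_patients = 40_000            # potential UK patients
annual_cost_per_patient = 10_000      # ~GBP 10,000 per patient per year

total_annual_cost = eligible_patients * annual_cost_per_patient
print(f"Annual budget impact: GBP {total_annual_cost:,}")  # GBP 400,000,000

assumed_total_drug_bill = 4_000_000_000  # hypothetical total NHS drug bill
share = total_annual_cost / assumed_total_drug_bill
print(f"Share of assumed drug bill: {share:.0%}")          # 10%
```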

On the surface, this patient-registry proposal offered a potentially rational and achievable route not only to verifying the claims of the clinical trial but also to providing a viable assessment of the health economics profile of this drug, one that did not rely on speculative modeling. The proposed mechanism was seized on enthusiastically by those with responsibility for prescribing budgets, although, it has to be said, with considerably less enthusiasm among neurologists and patient groups. In the event, a number of circumstances conspired against the achievement of this laudable goal: although new clinical trials of several disease-modifying drugs in multiple sclerosis both confirmed clinical benefit and extended our understanding of appropriate treatment pathways, administrative difficulties meant that it proved impossible to collect the data needed to carry out the envisaged ongoing analysis [5].

With the passage of time and the inauguration of NICE in 1999, the initial idea for a patient registry was quietly shelved and the initiative passed to NICE to determine using its preferred (and narrowly academic) approach of cost-utility analysis. The decision by NICE to adopt cost-utility analysis as its primary health economic method was somewhat controversial, given that reliable and reproducible methods of generating utility estimates were neither established nor accepted at the time, and were a rare inclusion in either randomized controlled trials or routine clinical practice. Additionally, at this stage the proposed assessment framework was far from obvious to key decision-makers at hospital and commissioner levels, who had not been schooled in this type of analysis and were not well placed to evaluate the validity of any conclusions. Nonetheless, the MS treatments were duly considered by NICE early in its work program, with formal guidance being published in January 2002 [5].

NICE found that, despite the improved evidence base, β-interferon and glatiramer acetate both failed to reach the nominal willingness-to-pay (WTP) threshold of £30,000 per quality-adjusted life year (QALY). Pressure from clinicians and patient groups, however, meant that the treatments were nonetheless made available on the NHS, subject to the findings of a resuscitated prospective health economic analysis along the lines proposed 7 years earlier [6–8]. Under a scheme jointly funded by the NHS and the four pharmaceutical companies that marketed disease-modifying therapies, 5000 patients with multiple sclerosis were to be recruited and followed up over a 10-year period starting in 2002, with clinical, cost and quality-of-life data being fed into a prospective economic model on a continual basis. Based on the outputs from this model, it was envisaged that NHS funding for treatment would only be provided if an incremental cost-effectiveness ratio (ICER) of £36,000 per QALY was achieved and maintained. Although an ICER of less than £20,000–30,000 per QALY is normally considered the acceptable threshold range for adoption by the NHS, in this case special circumstances were felt to apply, principally with regard to uncaptured social care benefits, so the higher threshold was decided on [6]. If this target was not achieved, the price charged by the pharmaceutical companies would have to be reduced accordingly until the WTP threshold was reached. This ground-breaking concept, which was defined in detail by the Department of Health and equivalent bodies in Wales, Scotland and Northern Ireland, elegantly sidestepped the political hot potato of treatment rationing in a vulnerable group, whilst offering the prospect of genuine external validation of cost-effectiveness, tightly coupled to treatment prioritization and expenditure [6].
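The price-adjustment mechanism described above reduces, in essence, to the ICER formula. A minimal sketch of that arithmetic follows; the per-patient cost and QALY figures are hypothetical, chosen only to illustrate how the £36,000/QALY threshold implies a maximum acceptable drug cost.

```python
def icer(incremental_cost, incremental_qalys):
    """Incremental cost-effectiveness ratio, in GBP per QALY gained."""
    return incremental_cost / incremental_qalys

def max_acceptable_cost(incremental_qalys, wtp_threshold):
    """Largest incremental cost at which the ICER still meets the threshold.

    Rearranged from icer(cost, qalys) <= threshold; under the scheme,
    this is the level to which the effective price would have to fall.
    """
    return incremental_qalys * wtp_threshold

# HYPOTHETICAL example: 0.5 incremental QALYs per patient.
qalys_gained = 0.5
observed_cost = 20_000   # GBP, hypothetical incremental cost per patient
wtp = 36_000             # GBP/QALY, the threshold agreed for the MS scheme

print(icer(observed_cost, qalys_gained))       # 40000.0 -> threshold breached
print(max_acceptable_cost(qalys_gained, wtp))  # 18000.0 -> required cost level
```

The design choice in the scheme was thus to hold the ICER fixed at the threshold and let price become the free variable, rather than the more usual reverse.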

Unfortunately, as is so often the case with government-driven initiatives, the reality failed to match the aspiration. In 2009, the first peer-reviewed publication relating to the scheme appeared [7]. After 7 years in operation, the authors were able to report successful recruitment of the required sample size and demonstrated that follow-up data could be collected in an appropriate fashion from 70 UK centers. Although data on progression rates were published and found to be significantly better, on average, in those patients treated with disease-modifying therapies, large variation between patient populations made it difficult to determine the relative contributions of the drugs and the natural history of the disease. Somewhat surprisingly, no information on the cost-effectiveness of these treatments was presented in the update. It was reported that tensions arising from the differing interests of the Department of Health, pharmaceutical companies, researchers, and patients had led to difficulties in carrying out this level of analysis, and that the university group initially selected to undertake the analysis had withdrawn from the program.

Finally, in 2015, top-line results of the economic analysis were published, 20 years after the issue was first discussed, albeit in redacted form to preserve commercial sensitivities [8]. Based on aggregated data for multiple patient types receiving one of four drug regimens, the authors concluded that the mean cost utility fell below the pre-agreed WTP threshold of £36,000/QALY, consequently representing acceptable cost-effectiveness for the NHS. Although the results of this long-awaited monitoring study represent a fascinating attempt to provide some level of external validation of projected claims and of the NICE-prescribed modeling framework, the conclusions are of limited, and far from timely, relevance to NHS prescribing priorities. Having to wait some 15 years for a partial assessment of claims is of little if any benefit to decision-makers, particularly given that these treatments now represent the elder statesmen of disease-modifying agents, with multiple alternative biological therapies now vying for the same budget. Presumably we do not want to wait another 15 years for validation of the claims made for these newer therapies.

One may regard this as an object lesson in the difficulties inherent in validating model-based assessments where the claims made are either not in a form capable of validation within a practicable timeframe, or where the proposed validation protocol carries a high risk of collapse. Clearly, in this particular case there was apparently no convenient short-term surrogate outcome on which to base an assessment of treatment efficacy, which necessarily resulted in the extremely long timescale associated with this project. Even so, accepting this constraint would not have precluded the collection of data on treatment uptake and resource utilization, which has potential value on an ongoing, real-time basis. However, rather than writing off the approach as impractical on the basis of this single example, it is perhaps useful instead to consider other approaches which, although not primarily intended to provide predictive validation of economic models, also have the potential to achieve those goals.

Patient access schemes

From the date of its founding, NICE has adopted a largely binary approach to the demonstration of cost-effectiveness: where modeled cost-utility results fall above a nominal £30,000/QALY threshold, acceptance for use within the NHS will only be forthcoming in exceptional circumstances. Although in the early years these decisions were often over-ridden on political grounds, for the past decade (with the notable exception of high-cost cancer drugs) decisions made by NICE have determined which drugs NHS clinicians are allowed to use. As NICE defended its stated strategy more robustly, pharmaceutical companies increasingly sought ways to make the cost of their treatments more palatable to NICE. This prompted the development of the patient access scheme (PAS). Although not an entirely new concept, a PAS proposal allowed a company faced with rejection by NICE on cost-per-QALY threshold grounds to make provisions to alter the effective cost of its drug to the NHS.

Over the past decade, 56 such PAS agreements have been developed alongside NICE technology appraisals. The vast majority are relatively simple schemes that apply discounts, dosage caps or free provision of drug in certain circumstances, allowing the ICER to fall below the WTP threshold. Clearly, schemes like this do not give us any opportunity to evaluate the validity of the underlying modeled claims; they simply permit the manipulation of the acquisition cost of inputs to deliver a desired numerical result. Indeed, the modeled claims are, in practical terms, quite irrelevant to access and treatment decisions.

However, a small number of PASs have been more sophisticated than this and attempt to relate payment to outcomes. One such PAS related to the use of bortezomib in the treatment of multiple myeloma and was issued alongside NICE Technology Appraisal TA129 in October 2007 [9]. The primary cost-utility model had yielded a base-case ICER of £33,500 per QALY and, in consequence, the drug was not approved for the NHS by NICE. In response, the company developed a PAS, the Velcade response scheme, which stipulated that, if a biomarker response indicative of treatment benefit was not detectable within four cycles of treatment, expenditure to date on the drug would be refunded. By factoring this scheme into the economic model, the manufacturer was able to reduce the ICER below the critical threshold. A similar scheme accompanied NICE Technology Appraisal TA176, which considered the use of cetuximab in the treatment of metastatic colorectal cancer [10]. If after treatment there was a lack of observable tumor response, a similar rebate of prior drug expenditure would be forthcoming from the manufacturer. This strategy also allowed the ICER to fall into the acceptable range, and both drugs were consequently adopted for use in the NHS. However, once again, there was no attempt to actually validate the initial modeled claim. This was taken as given, with the testable claim couched in terms quite at variance with the reference case criteria.
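The arithmetic behind such a response-based rebate can be sketched simply: refunding drug costs for non-responders lowers the expected incremental cost, and hence the modeled ICER, in proportion to the response rate. All parameters below are hypothetical, chosen only so that the no-rebate case reproduces the £33,500/QALY base case quoted above.

```python
def icer(incremental_cost, incremental_qalys):
    """Incremental cost-effectiveness ratio, GBP per QALY gained."""
    return incremental_cost / incremental_qalys

def expected_drug_cost_with_rebate(drug_cost, response_rate):
    """Expected drug cost per patient when non-responders are fully refunded."""
    return drug_cost * response_rate

# HYPOTHETICAL parameters (chosen to reproduce the quoted base case):
drug_cost = 25_000      # GBP per treatment course
other_costs = 1_800     # GBP, administration and monitoring
qalys_gained = 0.8      # incremental QALYs per patient
response_rate = 0.7     # fraction of patients showing a biomarker response

base_case = icer(drug_cost + other_costs, qalys_gained)
with_pas = icer(expected_drug_cost_with_rebate(drug_cost, response_rate)
                + other_costs, qalys_gained)

print(f"ICER without PAS: GBP {base_case:,.0f}/QALY")  # GBP 33,500/QALY
print(f"ICER with PAS:    GBP {with_pas:,.0f}/QALY")   # below the threshold
```

Note that the rebate changes who bears the cost of non-response rather than testing the underlying QALY claim, which is precisely the gap the article identifies.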

Although the primary objective of both these schemes was simply to achieve acceptance by NICE, the nature of the programs meant that real-world assessment of treatment response and drug expenditure became an accessible set of outcomes. In the long term, these could potentially be compared against the benefit claims used in the primary cost-utility analysis, thereby offering a back-door route, not to the predictive validation of modeled claims, but to an ad hoc assessment of patient response.

In practice, however, these schemes proved complex to administer, and the number of patients for whom rebates were claimed was substantially lower than expected. The PAS protocol made no attempt to collect data that, even in longer-term follow-up, could be used to validate the initial modeled claims. In a nutshell: if a claim based on, for example, the NICE reference case methodology is, in practical or achievable terms, non-testable, then not only do we not know whether the modeled claim is right, we do not even know if it is wrong!

In 2009, a report was produced for the UK Cancer Network Pharmacist Forum in order to understand the reasons behind this issue [11]. Based on input from 31 NHS trusts, involving 756 patients entered into a PAS, the opinions of the pharmacists responsible for administering the schemes were universally negative for those schemes requiring documentation of outcomes. The issues identified related to difficulties in tracking patients, problems in ensuring claims met the very detailed requirements of the schemes, and the practical difficulty of handling credit notes, free stock or product replacement within the budgetary framework of the trust, all of which militated against the successful implementation of the PAS. Even simple retrospective discount schemes were felt to be overly complex and, in consequence, 73% of the trusts questioned felt they were not capable of taking on any more drugs with an associated PAS.

These problems highlight the disconnect between the decisions made by NICE, the practical relevance of modeled claims under the Reference Case, and the implementation of PAS schemes within a hospital or commissioning organization. This is exemplified by the case of pazopanib in advanced renal cell carcinoma, which was subject to NICE technology appraisal TA215 in February 2011 [12]. The PAS approved by NICE included a straight price discount in addition to a rebate in the event of a lack of clinical response. Although approved by NICE, the experience in the field was that trusts were unwilling to enroll patients in the scheme owing to the problems highlighted above. In consequence, in August 2013 the response-related element of the PAS was abandoned, with just the simple discount being retained. In the past 4 years, no other PAS has been approved involving anything more complex than a discount or provision of free drug to the hospital. We must, therefore, assume that health decision-makers have recognized that the opportunity for external validation of model claims via this route is no longer a viable option. So far as can be ascertained, there are no current activities within NICE to seek alternative routes to validate modeled claims, nor plans to initiate such a process.

Overview

We are left in 2015, 15 years after the formation of NICE and 20 years after the initial tentative steps taken with β-interferon, with no systematic approach to validating modeled health economic conclusions against the actual clinical and/or financial benefits achieved. One could argue that this is a legitimate pragmatic position in which to find ourselves. We have an approach to economic assessment that generates voluminous documentation while remaining incapable of producing claims that are testable and, as such, of practical benefit to health decision-makers. Although it might be considered a chimera, at least it provides a clear-cut route to market for those companies that wish to access the NHS and have the fortitude to tick the NICE boxes. The fact that the threshold go/no-go approach to cost utility is a relatively blunt instrument does not necessarily mean that decisions made by NICE are wrong, or that they are right; merely that they are based on a crude measure that is claimed to allow relatively simple comparison of treatments across multiple diseases. Comparisons, of course, that could be right or wrong. We simply don't know.

In this regard, it could be argued that the approach is analogous to the randomized controlled trial (RCT), which uses a formulaic approach to treatment comparison built on a very basic mathematical model with its origins in the 1920s [13]. Although none of us would pretend that an RCT reflects the actual benefit we expect to see in clinical practice, it has been refined and standardized to such an extent that we are able to trust its results and place them within a clear and communally understood framework. So robust has been this embrace of the cult of the RCT that it is only in very recent years that alternative models for assessing treatment benefit have been considered plausible by the health community. Even now, a naturalistic study, or one that uses a Bayesian approach to statistical analysis, is likely to be regarded with considerable suspicion by a general audience.

Rather than criticizing bodies like NICE that implement the rather simplistic modeling approach we have come to recognize as standard in health economics, perhaps we should ask whether it is the broader health economics community that should be driving a move to more meaningful analysis, as outlined by others in this issue of the journal [1,14]. One can argue convincingly that there are practical and cultural difficulties in implementing the type of validation studies advocated by Langley [1] (although the paper by Schommer et al. [14] presents a credible approach), but this does not mean that the attempt should not be made. Certainly the NHS data resources available are incomparably better than they were 15 years ago, and there is a far better prospect of delivering usable results in a timely fashion than would have been the case in the early days of NICE. Having lived with the reality of HTA-based cost-utility modeling for the last 15 years, those of us in the UK have perhaps become hypnotized into the belief that this is the only possible way forward. It is easy to forget that the majority of the world's health systems do not yet use formal health economic analysis as part of their pricing and reimbursement decision-making. Hopefully, it is not too late to redirect efforts down potentially more meaningful paths. It is only by reversing out of the current blind passage that we can hope to see health economics continue to develop as a discipline that is relevant and apposite to the demands that will be placed on it in the coming decades.

Transparency

Declaration of funding

This manuscript received no funding.

Declaration of financial/other relationships

None relevant to this manuscript declared. JME peer reviewers on this manuscript have no relevant financial or other relationships to disclose.
