CLINICAL REVIEW

Obstructive Lung Disease Models: What Is Valid?

Pages 382-393 | Published online: 02 Jul 2009

Abstract

Use of disease simulation models has led to scrutiny of model methods and demand for evidence that models credibly simulate health outcomes. We sought to describe recent obstructive lung disease simulation models and their validation. Medline and EMBASE were used to identify obstructive lung disease simulation models published from January 2000 to June 2006. Publications were reviewed to assess model attributes and four types of validation: first-order (verification/debugging), second-order (comparison with studies used in model development), third-order (comparison with studies not used in model development), and predictive validity. Six asthma and seven chronic obstructive pulmonary disease models were identified. Seven (54%) models included second-order validation, typically by comparing observed outcomes to simulations of source study cohorts. Seven (54%) models included third-order validation, in which modeled outcomes were usually compared qualitatively for agreement with studies independent of the model. Validation endpoints included disease prevalence, exacerbation, and all-cause mortality. Validation was typically described as acceptable, despite near-universal absence of criteria for judging adequacy of validation. Although over half of recent obstructive lung disease simulation models report validation, inconsistencies in validation methods and lack of detailed reporting make assessing adequacy of validation difficult. For simulation modeling to be accepted as a tool for evaluating clinical and public health programs, models must be validated to credibly simulate health outcomes of interest. Defining the required level of validation and providing guidance for quantitative assessment and reporting of validation are important future steps in promoting simulation models as practical decision tools.

INTRODUCTION

Asthma and chronic obstructive pulmonary disease, collectively known as obstructive lung disease, affect more than 30 million Americans (Citation[1],Citation[2],Citation[3],Citation[4]). Understanding the natural history of these diseases, and developing and implementing policies to reduce the morbidity and mortality from them, are important health goals (Citation[5]). Disease simulation models—quantitative representations of disease progression and factors that influence it—are often used to compare competing health interventions and explore their effectiveness and cost-effectiveness in situations where randomized or observational studies are logistically impossible, ethically questionable, or too resource-intensive to conduct. In an era when health care resources to address obstructive lung disease are shrinking and demands to demonstrate quantitative health impact of programs aimed at those diseases are increasing, there is escalating interest in the application of disease simulation models to forecast disease burden and explore effectiveness of alternative interventions. With increasing interest in disease simulation models as exploratory policy tools comes concomitant interest in the quality of these tools, especially in regard to establishing that they do a credible job of what they purport to do: quantitatively predict health outcomes under various scenarios. That is, there is a desire to demonstrate—insofar as is necessary—that the models are valid.

The value of disease simulation models is often diminished by practical skepticism that takes the form of such questions as, “How do I know the model is right?” and “How do I know I can trust the model results?” Addressing these concerns in a consistent, logical manner is essential to the effective use of these tools and their acceptance by decision makers. Various issues related to these validity concerns have been addressed in recent guidance on best practices in use of modeling for health policy decisions (Citation[6],Citation[7],Citation[8]). While these efforts to address model validation are helpful steps, additional guidance describing model validation standards would greatly assist clinical and public health policymakers in responding to important questions about model validity and, in so doing, enhance the acceptance of disease simulation models and the insights they offer.

To improve understanding of obstructive lung disease models in general and the methods used to validate them in particular, we sought to describe obstructive lung disease models reported in the recent peer-reviewed literature and characterize methods used to validate them. Our goal was to summarize the current “state of the art” for validation of obstructive lung disease simulation models and to recommend actions that may help to improve model validation methods, thus enhancing the usefulness of these tools.

MATERIALS AND METHODS

We define an obstructive lung disease simulation model as an analytic methodology (i.e., a sequence of logical mathematical computations) that links together evidence on obstructive lung disease from many sources to generate estimates of asthma or COPD disease events. In this sense, an obstructive lung disease simulation model is essentially a mechanistic representation of disease progression. An obstructive lung disease simulation model should be distinguished from the broader application that makes use of it, such as an economic evaluation. We have intentionally limited the scope of our assessment to examining the validity of disease simulation models themselves rather than the quality of economic evaluations and other applications that use them.
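
Most of the models reviewed below are state-transition models, so a minimal sketch may help make this concept concrete. Everything in the sketch below is a hypothetical illustration: the disease states and transition probabilities are invented for demonstration, not taken from any published model.

```python
# Minimal sketch of a state-transition (Markov) disease model.
# All states and transition probabilities here are hypothetical,
# chosen for illustration; real models estimate them from data.
import numpy as np

STATES = ["mild", "moderate", "severe", "dead"]  # hypothetical severity states

# Hypothetical annual transition probabilities; each row sums to 1.
P = np.array([
    [0.85, 0.10, 0.03, 0.02],  # from mild
    [0.05, 0.80, 0.10, 0.05],  # from moderate
    [0.00, 0.05, 0.80, 0.15],  # from severe
    [0.00, 0.00, 0.00, 1.00],  # dead is absorbing
])

def simulate_cohort(start, n_cycles):
    """Propagate a cohort's state distribution through annual cycles."""
    dist = np.asarray(start, dtype=float)
    history = [dist.copy()]
    for _ in range(n_cycles):
        dist = dist @ P  # one model cycle: redistribute the cohort
        history.append(dist.copy())
    return np.array(history)

# Example: 1,000 patients starting in the mild state, 10-year horizon.
trace = simulate_cohort([1000, 0, 0, 0], n_cycles=10)
print(dict(zip(STATES, trace[-1].round(1))))
```

Health outcomes such as exacerbations or deaths can then be read off the state-occupancy trace; dynamic population models extend this idea by allowing the cohort composition itself to change over time.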

To identify recent obstructive lung disease simulation models, we conducted one primary and three secondary searches using PubMed and EMBASE databases. Details of the searches and search terms are given in the Supplement. Each search was limited to papers published in English between January 2000 and June 2006. All abstracts retrieved by these searches were hand-screened for eligibility, with particular emphasis on the Methods section of the abstract. Eligible abstracts included those that (a) reported on asthma or COPD (or synonyms such as emphysema) as the primary health outcomes of interest; and (b) appeared to describe use or development of a mathematical disease simulation model, as indicated by descriptors such as Markov model, state-transition model, prediction model, simulation model, disease simulation, disease progression model, mathematical model, dynamic population model, decision analytic model, life table model, and natural history model. For all abstracts that met screening criteria, we retrieved and reviewed the corresponding full publication for potential inclusion in this review.

If it was unclear whether an abstract met screening criteria, the full paper was reviewed. Criteria for inclusion in this review included (a) publication between January 2000 and June 2006; (b) written in English; and (c) reported on use or development of a disease simulation model of asthma or COPD. We included as disease simulation models both state-transition models and dynamic population models. We excluded neural networks and simple decision tree models. If it could not be determined from the full paper whether the model was indeed a disease simulation model, the paper was excluded, as were publications describing models still under development [e.g., the COPD model of Buist et al. (Citation[9], Citation[10])].

Of 580 English-language abstracts reviewed, 13 reported use of an obstructive lung disease simulation model. Two of these were enhanced versions of previously reported models. Because the enhanced models incorporated new model structures that could affect model validation, they were included as separate models.

Publications describing the models were reviewed to abstract information about model characteristics, including model purpose, type (e.g., state-transition or dynamic population model), attributes (e.g., number of disease states, cycle length, and time horizon), health outcomes modeled, and funding source. In addition to reviewing papers identified by the search strategy, we reviewed supplemental material about the models (e.g., technical reports) only if such information was publicly available and cited by the main papers. Because we were interested in assessing model validation using essentially the same body of information readily available to a typical audience of policy-makers, we intentionally made no attempt to obtain unpublished material or information available solely by personal communication with the authors.

Current terminology for describing validation methods is inconsistent, making categorization of validation methods challenging. Using an adaptation of terminology first proposed by David Eddy (Citation[11]), we defined four types of model validation for the purposes of this study: first-order validity (verification and debugging), second-order validity (comparison of model estimates with published data from studies used to develop the model), third-order validity (comparison of model estimates with published data from studies not used to develop the model), and predictive validity. We recognize that a diverse set of terms has been used to describe these processes, including calibration. Because the distinction between calibration and validation is tenuous and seems to depend on whether the comparison of modeled and observed values is used to re-specify the model (in which case this feedback into model development is often deemed calibration rather than validation), we forgo the term calibration in favor of specific definitions of validation. Similarly, because there is inconsistent use of the terms internal validation and external validation to refer to various validation techniques, we eliminated those descriptors in favor of the explicit definitions here.

We defined first-order validity as the process by which model structure is examined for coding errors, logical inconsistencies, and accuracy of mathematical calculations. Sometimes referred to as debugging or model verification, this often includes logical checks on model programming conducted using null values, extreme values, and hypothetical conditions, in addition to the overall assessment of face validity proposed by Eddy as a test of first-order validity. First-order validation need not involve direct comparison of model predictions to other data sources. We defined second-order validation as comparison of model predictions with existing data from studies that were used to parameterize, construct, or develop the model. This includes, but is not limited to, comparison of model predictions to results of randomized trials used to develop subsets of model parameters.
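
As an illustration of what such checks might look like in practice, the sketch below applies simple assertions to the hypothetical model sketched earlier (reusing its P, STATES, and simulate_cohort); the specific checks and tolerances are our assumptions, not a published verification protocol.

```python
# First-order (verification/debugging) checks for the hypothetical model
# sketched above; the particular checks chosen here are illustrative.
import numpy as np

def verify_model():
    # Structural checks: entries are valid probabilities, rows sum to 1.
    assert np.all((P >= 0) & (P <= 1)), "probability outside [0, 1]"
    assert np.allclose(P.sum(axis=1), 1.0), "rows must sum to 1"

    # Null-value check: an empty cohort should stay empty.
    empty = simulate_cohort(np.zeros(len(STATES)), n_cycles=5)
    assert np.allclose(empty, 0.0), "empty cohort produced patients"

    # Extreme-value check: cohort size is conserved in every cycle
    # (no patients created or destroyed, even over a long horizon).
    start = [1e6] + [0.0] * (len(STATES) - 1)
    trace = simulate_cohort(start, n_cycles=50)
    assert np.allclose(trace.sum(axis=1), 1e6), "cohort size not conserved"

verify_model()
```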

We defined third-order validation as comparison of model predictions with existing data from studies that were not used to parameterize, construct, or develop the model. This could include, for example, comparison of model predictions with results of randomized trials that were not used to develop model inputs. Finally, we defined predictive validity as comparison of model predictions to previously unobserved data collected subsequent to and independent of model development, such as comparison of model results to clinical trials conducted after the model's inception. Predictive validation is so named because it involves comparison of the model's predicted outcomes with future patterns of health outcomes—that is, it is intended to gauge the ability of the model to predict future outcomes of interest. Obviously, predictive validation is only possible over time as data sources (such as clinical trials or reports of national trends in disease incidence) become available to which model results can be compared.

In characterizing model validation methods, we describe attributes of validation techniques, including the type of validation, the health outcome endpoints used as the basis for comparing modeled to observed data, and the number and type of studies from which data for comparisons were obtained, referred to as comparator studies. We describe the methods by which comparisons were made between modeled and observed data (e.g., simulation of a trial cohort using the model) and the criteria by which adequacy of validation was judged, noting the concordance metrics and statistical tests used, if any.

RESULTS

We identified 13 disease simulation models, six of asthma (Citation[12],Citation[13],Citation[14],Citation[15],Citation[16],Citation[17]) and seven of COPD (Citation[18],Citation[19],Citation[20],Citation[21],Citation[22],Citation[23],Citation[24]). Selected characteristics of the models are provided in Table 1. A majority were from Europe (62%) and were funded by pharmaceutical companies (70%). Eleven were state-transition models; two were dynamic population models. Nine models were developed for specific cost-utility analyses (CUA) and four for exploratory analyses or future CUA. A summary of the types of validation used with these models is given in Table 2.

Table 1 Selected characteristics of obstructive lung disease simulation models

Table 2 Summary of types of validation used in obstructive lung disease simulation models

Eleven models made no mention of first-order validation; one reported it implicitly and one explicitly. Seven (54%) models included tests of second-order validity, in which model results are evaluated for their agreement with data from studies used in the development of the model (Table 3). Most of the seven models reporting tests of second-order validity appeared to use simulations of source study cohorts in which modeled outcomes were compared with observed outcomes (referred to hereafter as cohort simulation), although only two of these explicitly described the validation method in this manner. A variety of health outcome endpoints was used for tests of second-order validity, including disease exacerbation, time spent in specific disease states, number of patients in specific disease states, COPD prevalence, and all-cause mortality. Most second-order validation results were expressed as qualitative comparisons of a point estimate generated by the model with a point estimate of the analogous endpoint from the comparator study. Only one reported a statistical test of modeled versus observed endpoints (Citation[12]); otherwise, a priori criteria for judging the acceptability of validation were not reported.

Table 3 Second-order validation methods used in recent obstructive lung disease simulation models

An illustrative second-order validation is that reported by Oostenbrink et al. (Citation[22]), in which clinical trial data were used to parameterize the model (e.g., to generate transition probabilities between disease states), after which the model was populated with a hypothetical cohort of patients whose baseline characteristics matched those of the trial patients. The model was run with this simulated cohort over a time horizon approximating that of the trial, and model-predicted outcomes were compared to outcomes observed in the trial. Oostenbrink et al. compared, for example, the model-predicted and observed mean number of COPD exacerbations over a 6-month time horizon, finding that the model predictions “were somewhat higher” than the mean number of exacerbations observed in the trial.
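
For contrast, a quantitative version of this kind of comparison could look like the sketch below, which tests an observed exacerbation count against a model-predicted rate using an exact Poisson test. The counts, follow-up time, rate, and the choice of test are all illustrative assumptions on our part, not the data or method of Oostenbrink et al.

```python
# Hypothetical second-order validation comparison: does the number of
# exacerbations observed in a trial differ from what the model predicts
# over the same follow-up? All numbers below are made up for illustration.
from scipy import stats

observed = 640          # hypothetical exacerbations observed in the trial
patient_years = 500.0   # hypothetical total follow-up in the trial cohort
model_rate = 1.40       # hypothetical model-predicted rate per patient-year

expected = model_rate * patient_years  # 700 expected events
# Two-sided exact Poisson test of the observed count against the
# model-predicted mean (doubling the smaller one-sided tail).
p_two_sided = 2 * min(stats.poisson.cdf(observed, expected),
                      stats.poisson.sf(observed - 1, expected))

print(f"observed rate = {observed / patient_years:.2f}/patient-year, "
      f"modeled rate = {model_rate:.2f}, p = {min(p_two_sided, 1.0):.3f}")
```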

Seven of thirteen (54%) models included tests of third-order validity, in which modeled outcomes are assessed for agreement with results of studies not used to develop the model (Table 4). These models overlapped with but were not identical to the group of seven models reporting tests of second-order validity. A variety of validation endpoints was used, including COPD prevalence, exacerbation rate, severity distribution, and mortality. Six of the seven models reporting third-order validation did so by qualitatively contrasting modeled outcomes with published data. One used a cohort simulation to quantitatively assess agreement between model outcomes and the comparator study. Validation was typically described as acceptable in qualitative terms despite the absence of criteria for judging adequacy of validation.

Table 4 Third-order validation methods used in recent obstructive lung disease simulation models

An illustrative third-order validation is that reported by Borg et al. (Citation[18]), in which the model-predicted annual COPD exacerbation rate was compared qualitatively to that reported in a randomized trial and to a retrospective study of COPD patients. Although Borg et al. do not report the exacerbation rates from the comparator studies, they conclude that the model-predicted “frequency of exacerbations… becomes 1.27 per patient per year, which is in line with published data.” No models included tests of predictive validity or plans for predictive validation.

DISCUSSION

Of the 13 obstructive lung disease simulation models we identified in the recent literature, ten had attempted model validation by assessing model outcomes for agreement with data from published studies (using what we refer to as second- or third-order validation). Comparing modeled outcomes with data from studies used in model development was the method of validation for seven of the models. In its simplest and most common form, this method involved comparing modeled outcomes to observed outcomes from the same data set used to develop the model (e.g., by matching or recreating the trial cohort). Another form of second-order validation involved comparing modeled outcomes to observed outcomes from a subset of data from studies used in model development that was intentionally removed from model parameterization and “set aside” for validation. This splitting of a single population sample to facilitate validation seems an unlikely method to identify underlying biases in the model and carries the disadvantage of reducing study power for model parameterization. Comparing modeled outcomes with data from studies used in model development (including the split-sample method) represents a lower standard of validation than comparison to data independent of the model.

For both types of validation methods (i.e., those using data from studies used in model development and those using data independent of model development), a wide variety of validation endpoints was used, with little justification given for endpoint selection. It appeared that, for most models, the choice of validation endpoint was dictated by availability of endpoints in the comparator study, but this explanation was seldom explicit. Similarly, various types of comparator studies provided data to assess model predictions, but justification for choice of comparator studies was generally weak or absent. For several models, the exclusion of data that otherwise could have been used for validation was unexplained. There was near-universal absence of criteria for assessing adequacy of validation.

Currently there exists no clear consensus as to what constitutes an appropriate level of model validation for disease simulation models. However, recent guidelines suggest an emerging consensus that models should, at a minimum, demonstrate what we have called second-order validity (that is, agreement with data from studies used in model development), failure of which strongly suggests “that the structure of the model is faulty” (Citation[6], Citation[11]), and that models should strive for third-order validity (agreement with data from studies not used in model development) (Citation[6],Citation[7],Citation[8]). There is disagreement about the need for models to demonstrate predictive validity: some experts contend that using models based on historical data to predict future outcomes is meaningless, since such models cannot be expected to reflect unknown future conditions, while others believe that predictive validity is valuable but not essential. Certainly there are undeniable practical impediments to undertaking tests of predictive validity, given that years of data would need to accumulate before model predictions could be compared to observed outcomes.

If we assume second-order validity as the minimum level of appropriate validation, about half of the obstructive lung disease simulation models we identified demonstrated appropriate validation. However, even those that reported second-order validation typically failed to establish clear validation goals and criteria for judging adequacy of validation. If we include all types of reported second- or third-order validation, regardless of the methodology or quality of the comparison involved, about three-quarters of the models could be said to include some sort of validation. At first glance, this proportion appears favorable, suggesting that most obstructive lung disease simulation models are indeed validated. However, this proportion should be interpreted with caution given that we used a broad definition of validation that included any attempt by model developers to compare modeled outcomes with other data, regardless of whether that activity was labeled validation. Although we believe it an encouraging finding that over half of the models report some sort of validation, this result does not imply that current validation methods are sufficiently developed, implemented, or documented.

As recognized in recent guidance on use of disease simulation models (Citation[7]), some notion of validity is needed to protect against incorrect conclusions and to protect users from being misled by disease simulation models. To promote acceptance of disease simulation models by decision-makers in the clinical and public health communities, there need to be intuitively reasonable and consistently reported validation goals. Our review of validation methods used with recent obstructive lung disease models suggests that the following activities may help achieve those goals. Specifically, experts in model development and end-users of disease simulation models in the clinical and public health communities could collaborate to

  1. Establish standard terminology for describing validation methods, including explicit definition of what constitutes calibration versus validation, and internal versus external and dependent versus independent data.

  2. Define levels of validation recommended in various decision contexts, which involves addressing how various uses of obstructive lung disease (or other disease) simulation models influence the stringency of validation needed to improve model credibility and acceptance.

  3. Explore in detail the rationale for and conflicting opinions on the need for predictive validity. Provide guidance on the circumstances in which the value of the information provided by predictive validation justifies the time and expense needed to undertake it.

  4. Develop practical guidelines for best practices in the conduct of validation. For example, practical guidelines for selection of validation endpoints and a priori definition of criteria for judging adequacy of validity (including concordance metrics and statistical testing methods) would encourage model developers to conduct and report validation in a systematic manner; a minimal sketch of one such a priori criterion appears after this list.

  5. Encourage standardized and detailed reporting of model development. This is similar to the recommendation by the Panel on Cost-Effectiveness in Health and Medicine (Citation[25]) that cost-effectiveness analyses be comprehensively described in a “technical report” providing analytic detail beyond what can typically be included in most journal reports. A similar technical report detailing model development, including validation, would increase availability of information needed to understand and evaluate disease simulation models.

  6. Standardize elements of validation reporting. Guidelines could specify that reports of validation include, at a minimum, descriptions of the type of validation, comparator studies and justification for their selection, methodologic details of how the comparison was performed (e.g., by describing cohort characteristics included in the model simulation) and quantitative side-by-side comparison of actual model-predicted and observed endpoints.

  7. Establish a “data resource library” of publicly-available datasets useful for validation of obstructive lung disease and other disease simulation models. These datasets would constitute a “gold standard” for comparison, and journals reporting analyses based on obstructive lung disease simulation models could be encouraged to mandate validation using them.
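
As one minimal sketch of the kind of a priori criterion item 4 envisions, the check below declares a fixed tolerance band before validation begins and then applies it to each endpoint. The ±15% band and the endpoint values are arbitrary choices for illustration, not recommended thresholds or published data.

```python
# Hypothetical a priori concordance criterion, declared before validation:
# a modeled endpoint "passes" if it falls within ±15% of the observed value.
def meets_concordance(modeled: float, observed: float,
                      tolerance: float = 0.15) -> bool:
    """True if the relative deviation of modeled from observed is within tolerance."""
    if observed == 0:
        return modeled == 0
    return abs(modeled - observed) / abs(observed) <= tolerance

# Hypothetical endpoints: (name, modeled value, observed value).
endpoints = [
    ("annual exacerbation rate", 1.27, 1.10),
    ("5-year mortality", 0.18, 0.17),
]
for name, modeled, observed in endpoints:
    verdict = "pass" if meets_concordance(modeled, observed) else "fail"
    print(f"{name}: modeled {modeled}, observed {observed} -> {verdict}")
```

Declaring the tolerance (or a statistical test and significance level) before the comparison is made is what distinguishes an a priori criterion from the post hoc qualitative judgments that dominated the models reviewed here.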

Finally, we acknowledge that a thorough assessment of model quality and credibility would include evaluation of a broader array of model characteristics, including model structure and quality of model inputs. An evaluation of these characteristics, while important, is outside the scope of this review.

In conclusion, we found that over half of recent obstructive lung disease simulation models report validation of some sort. However, the lack of detailed reporting of validation results and inconsistencies in validation methodology and interpretation make the assessment of validation difficult. For simulation modeling to be accepted as a tool for evaluating clinical and public health programs, models must be validated to credibly simulate health outcomes of interest. Defining the required level of validation and providing guidance for quantitative assessment and reporting of validation are important future steps in promoting simulation models as practical decision tools.

SUPPLEMENTAL MATERIAL

Search terms and the numbers of abstracts and eligible papers obtained from each search are given here.

Primary Search

  • Search #1: (asthma OR COPD) AND (cost-effectiveness OR Markov OR decision analysis) yielded 482 abstracts, of which 23 met screening criteria and 10 met inclusion criteria.

Secondary Searches

  • Search #2: (asthma OR COPD) AND (natural history) AND (Markov OR decision analysis) yielded 4 abstracts, of which 3 met screening criteria and 3 met inclusion criteria. All three abstracts had been previously identified by Search #1.

  • Search #3: (asthma OR COPD) AND (life table model) yielded 53 abstracts, of which 12 met screening criteria, 9 met inclusion criteria, and 6 had been previously identified by Search #1, yielding 3 additional papers.

  • Search #4: (asthma OR COPD) AND (model) AND (validation) yielded 41 abstracts, of which 1 met screening and inclusion criteria but had been previously identified by Search #1.

Total

Ten publications from Search #1 and three from Search #3, for a total of 13 publications, are included in our review.

No funding was received for conduct of this study and generation of this manuscript.

REFERENCES

  • Mannino D M, Homa D M, Akinbami L J, Ford E S, Redd S C. Chronic obstructive pulmonary disease surveillance—United States, 1971–2000. Morb Mortal Wkly Rep 2002; 51: 1–16
  • Mannino D M, Homa D M, Akinbami L J, Moorman J E, Gwynn C, Redd S C. Surveillance for asthma—United States, 1980–1999. Morb Mortal Wkly Rep 2002; 51: 1–13
  • U.S. Centers for Disease Control and Prevention. Asthma: General Information. US Centers for Disease Control and Prevention, Atlanta 4-6-2007, Report
  • U.S. Department of Health and Human Services. Facts about Chronic Obstructive Pulmonary Disease (COPD). Atlanta, US Centers for Disease Control and Prevention. 4-6-2007, Report
  • U.S. Department of Health and Human Services. Healthy People 2010: Understanding and Improving Health, 2nd Ed. US Government Printing Office Report, Washington, DC, 017-001-001-00-550-9. 11-1-2000
  • Philips Z, Ginnelly L, Sculpher M, Claxton K, Golder S, Riemsma R, et al. Review of guidelines for good practice in decision-analytic modelling in health technology assessment. Health Technol Assess 2004; 8(36): iii–iv, ix–xi, 1–158
  • Weinstein M C, Toy E L, Sandberg E A, Neumann P J, Evans J S, Kuntz K M, et al. Modeling for health care and other policy decisions: uses, roles, and validity. Value Health. 2001; 4: 348–361
  • Weinstein M C, O'Brien B, Hornberger J, Jackson J, Johannesson M, McCabe C, et al. Principles of good practice for decision analytic modeling in health-care evaluation: report of the ISPOR Task Force on Good Research Practices—Modeling Studies. Value Health 2003; 6: 9–17
  • Buist A S, Vollmer W M, Sullivan S D, Weiss K B, Lee T A, Menezes A M, et al. The Burden of Obstructive Lung Disease Initiative (BOLD): rationale and design. COPD 2005; 2: 277–283
  • Lee T A, Rutten-van Molken M P. Economic modeling in chronic obstructive pulmonary disease. Proceedings of the American Thoracic Society 2006; 3: 630–634
  • Eddy D M. Technology assessment: the role of mathematical modeling. In: Mosteller F, ed. Assessing Medical Technologies. National Academy Press, Washington, DC 1985; 144–160
  • Combescure C, Chanez P, Saint-Pierre P, Daures J P, Proudhon H, Godard P. Assessment of variations in control of asthma over time. Eur Respir J 2003; 22: 298–304
  • Fuhlbrigge A L, Bae S J, Weiss S T, Kuntz K M, Paltiel A D. Cost-effectiveness of inhaled steroids in asthma: impact of effect on bone mineral density. J Allergy Clin Immunol 2006; 117: 359–366
  • Paltiel A D, Fuhlbrigge A L, Kitch B T, Liljas B, Weiss S T, Neumann P J, et al. Cost-effectiveness of inhaled corticosteroids in adults with mild-to-moderate asthma: results from the asthma policy model. J Allergy Clin Immunol 2001; 108: 39–46
  • Price M J, Briggs A H. Development of an economic model to assess the cost effectiveness of asthma management strategies. Pharmacoeconomics 2002; 20: 183–194
  • Saint-Pierre P, Combescure C, Daures J P, Godard P. The analysis of asthma control under a Markov assumption with use of covariates. Stat Med 2003; 22: 3755–3770
  • Wild D M, Redlich C A, Paltiel A D. Surveillance for isocyanate asthma: a model based cost effectiveness analysis. Occup Environ Med 2005; 62: 743–749
  • Borg S, Ericsson A, Wedzicha J, Gulsvik A, Lundback B, Donaldson G C, et al. A computer simulation model of the natural history and economic impact of chronic obstructive pulmonary disease. Value Health 2004; 7: 153–167
  • Feenstra T L, van Genugten M L, Hoogenveen R T, Wouters E F, Rutten-van Molken M P. The impact of aging and smoking on the future burden of chronic obstructive pulmonary disease: a model analysis in the Netherlands. Am J Respir Crit Care Med 2001; 164: 590–596
  • Hoogendoorn M, Rutten-van Molken M P, Hoogenveen R T, van Genugten M L, Buist A S, Wouters E F, et al. A dynamic population model of disease progression in COPD. Eur Respir J 2005; 26: 223–233
  • Johansson P M, Tillgren P E, Guldbrandsson K A, Lindholm L A. A model for cost-effectiveness analyses of smoking cessation interventions applied to a Quit-and-Win contest for mothers of small children. Scand J Public Health 2005; 33: 343–352
  • Oostenbrink J B, Rutten-van Molken M P, Monz B U, FitzGerald J M. Probabilistic Markov model to assess the cost-effectiveness of bronchodilator therapy in COPD patients in different countries. Value Health 2005; 8: 32–46
  • Sin D D, Golmohammadi K, Jacobs P. Cost-effectiveness of inhaled corticosteroids for chronic obstructive pulmonary disease according to disease severity. Am J Med 2004; 116: 325–331
  • Spencer M, Briggs A H, Grossman R F, Rance L. Development of an economic model to assess the cost effectiveness of treatment interventions for chronic obstructive pulmonary disease. Pharmacoeconomics 2005; 23: 619–637
  • Gold M R, Siegel J E, Russell L B, Weinstein M C. Cost-Effectiveness in Health and Medicine. Oxford University Press, New York 1996
