Research Articles

Zero-Inflated and Hurdle Models of Count Data with Extra Zeros: Examples from an HIV-Risk Reduction Intervention Trial

, Ph.D., , Ph.D. & , M.D.
Pages 367-375 | Published online: 22 Aug 2011
 

Abstract

Background: In clinical trials of behavioral health interventions, outcome variables often take the form of counts, such as days using substances or episodes of unprotected sex. Classically, count data are modeled with a Poisson distribution; in practice, however, such data often display greater heterogeneity in the form of excess zeros (zero-inflation), greater spread in the values (overdispersion), or both. Greater sample heterogeneity may be especially common in community-based effectiveness trials, where broad eligibility criteria are implemented to achieve a generalizable sample.

Objectives: This article reviews the characteristics of the Poisson model and the related models that have been developed to handle overdispersion (the negative binomial (NB) model), zero-inflation (the zero-inflated Poisson (ZIP) and Poisson hurdle (PH) models), or both (the zero-inflated negative binomial (ZINB) and negative binomial hurdle (NBH) models).

Methods: All six models were used to model the effect of an HIV-risk reduction intervention on the count of unprotected sexual occasions (USOs), using data from a previously completed clinical trial among female patients (N = 515) participating in community-based substance abuse treatment (Tross et al. Effectiveness of HIV/AIDS sexual risk reduction groups for women in substance abuse treatment programs: Results of NIDA Clinical Trials Network Trial. J Acquir Immune Defic Syndr 2008; 48(5):581–589). Goodness of fit and the estimates of treatment effect derived from each model were compared.

Results: The ZINB model provided the best fit, yielding a medium-sized effect of intervention.

Conclusions and Scientific Significance: This article illustrates the consequences of applying models with different distributional assumptions to the same data. If the model used does not closely fit the shape of the data distribution, the estimate of the intervention effect may be biased, either over- or underestimating it.
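To make the distinction between the zero-inflated and hurdle families concrete, the following is a minimal sketch of the two Poisson-based probability mass functions, written from their standard textbook definitions rather than from the article's own code. The function names and the parameters `lam` (Poisson mean), `pi` (probability of a structural zero), and `p0` (probability of a zero count) are illustrative assumptions. In a ZIP model, zeros come from two sources: structural zeros and ordinary Poisson zeros. In a hurdle model, all zeros come from a single binary process, and positive counts follow a zero-truncated Poisson distribution.

```python
import math


def poisson_pmf(k, lam):
    """Standard Poisson probability P(Y = k) with mean lam."""
    return math.exp(-lam) * lam ** k / math.factorial(k)


def zip_pmf(k, lam, pi):
    """Zero-inflated Poisson (illustrative parameterization).

    With probability pi the count is a structural zero; otherwise it is
    drawn from Poisson(lam), which can itself produce a zero.
    """
    base = (1.0 - pi) * poisson_pmf(k, lam)
    return pi + base if k == 0 else base


def hurdle_pmf(k, lam, p0):
    """Poisson hurdle (illustrative parameterization).

    P(Y = 0) is exactly p0; positive counts follow a zero-truncated
    Poisson(lam), rescaled so the positive part sums to 1 - p0.
    """
    if k == 0:
        return p0
    truncated = poisson_pmf(k, lam) / (1.0 - math.exp(-lam))
    return (1.0 - p0) * truncated


if __name__ == "__main__":
    # Hypothetical parameter values, chosen only for illustration.
    lam, pi, p0 = 2.0, 0.3, 0.4
    for k in range(4):
        print(k, round(poisson_pmf(k, lam), 4),
              round(zip_pmf(k, lam, pi), 4),
              round(hurdle_pmf(k, lam, p0), 4))
```

Under these assumptions, the ZIP probability of a zero is always at least `pi`, whereas the hurdle probability of a zero is set directly by `p0` and may be larger or smaller than the plain Poisson value. The negative binomial variants (ZINB, NBH) follow the same construction with the Poisson component replaced by a negative binomial one.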

ACKNOWLEDGMENTS

This article is supported in part by the National Institute on Drug Abuse (NIDA) Clinical Trials Network grant U10 DA013035 (Dr. Nunes) and National Institute on Drug Abuse grant K24 DA022412 (Dr. Nunes).

Declaration of Interest

The authors report no conflicts of interest. The authors alone are responsible for the content and writing of this article.

