Research Articles

Zero-Inflated and Hurdle Models of Count Data with Extra Zeros: Examples from an HIV-Risk Reduction Intervention Trial

, Ph.D., , Ph.D. & , M.D.
Pages 367-375 | Published online: 22 Aug 2011
 

Abstract

Background: In clinical trials of behavioral health interventions, outcome variables often take the form of counts, such as days using substances or episodes of unprotected sex. Classically, count data are assumed to follow a Poisson distribution; in practice, however, such data often display greater heterogeneity, in the form of excess zeros (zero-inflation), greater spread in the values (overdispersion), or both. Greater sample heterogeneity may be especially common in community-based effectiveness trials, where broad eligibility criteria are implemented to achieve a generalizable sample.

Objectives: This article reviews the characteristics of the Poisson model and of the related models that have been developed to handle overdispersion (negative binomial (NB) model), zero-inflation (zero-inflated Poisson (ZIP) and Poisson hurdle (PH) models), or both (zero-inflated negative binomial (ZINB) and negative binomial hurdle (NBH) models).

Methods: All six models were used to model the effect of an HIV-risk reduction intervention on the count of unprotected sexual occasions (USOs), using data from a previously completed clinical trial among female patients (N = 515) participating in community-based substance abuse treatment (Tross et al. Effectiveness of HIV/AIDS sexual risk reduction groups for women in substance abuse treatment programs: Results of NIDA Clinical Trials Network Trial. J Acquir Immune Defic Syndr 2008; 48(5):581–589). Goodness of fit and the estimates of treatment effect derived from each model were compared.

Results: The ZINB model provided the best fit, yielding a medium-sized effect of intervention.

Conclusions and Scientific Significance: This article illustrates the consequences of applying models with different distributional assumptions to the same data. If the model used does not closely fit the shape of the data distribution, the estimate of the intervention effect may be biased, either over- or underestimating it.
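The distinction between the zero-inflated and hurdle approaches named above can be made concrete with a minimal sketch of the Poisson, ZIP, and PH probability mass functions in their standard textbook form. This is not the authors' code: the function names, the parameter names (`lam`, `pi`, `p0`), and any values passed to them are illustrative assumptions and are not taken from the trial data. The NB-based variants (ZINB, NBH) follow the same pattern with a negative binomial in place of the Poisson and are omitted here.

```python
import math


def poisson_pmf(k, lam):
    """P(Y = k) for a Poisson count with mean lam (computed on the log scale)."""
    return math.exp(-lam + k * math.log(lam) - math.lgamma(k + 1))


def zip_pmf(k, lam, pi):
    """Zero-inflated Poisson.

    With probability pi the observation is a structural zero; otherwise it
    is drawn from Poisson(lam), which can itself produce a zero.
    """
    base = (1.0 - pi) * poisson_pmf(k, lam)
    return pi + base if k == 0 else base


def hurdle_pmf(k, lam, p0):
    """Poisson hurdle.

    All zeros come from a single binary process with P(Y = 0) = p0;
    positive counts follow a zero-truncated Poisson(lam).
    """
    if k == 0:
        return p0
    truncation = 1.0 - math.exp(-lam)  # P(Poisson > 0)
    return (1.0 - p0) * poisson_pmf(k, lam) / truncation


if __name__ == "__main__":
    lam, pi, p0 = 3.0, 0.4, 0.5  # hypothetical values, for illustration only
    print("P(Y=0) Poisson:", poisson_pmf(0, lam))
    print("P(Y=0) ZIP:    ", zip_pmf(0, lam, pi))
    print("P(Y=0) hurdle: ", hurdle_pmf(0, lam, p0))
```

In the ZIP form, zeros arise from two sources (structural zeros and ordinary Poisson zeros), whereas the hurdle form attributes every zero to one process and models the positive counts separately. Which description suits a given outcome is a substantive as well as a statistical question.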

ACKNOWLEDGMENTS

This article is supported in part by the National Institute on Drug Abuse (NIDA) Clinical Trials Network grant U10 DA013035 (Dr. Nunes) and National Institute on Drug Abuse grant K24 DA022412 (Dr. Nunes).

Declaration of Interest

The authors report no conflicts of interest. The authors alone are responsible for the content and writing of this article.
