Abstract
This paper discusses regression models for which the univariate response variable is either exactly zero, with non-zero probability, or otherwise positive and continuous. Data of this form arise in many situations. Typically the response is the 'consumption' of some material or foodstuff; for example, the consumption of a particular food or beverage by individuals, which it is hoped can be explained in terms of gender, social class, smoking behaviour, age, etc. We illustrate the methods of the paper with data on the consumption of packaging materials by UK cosmetic companies.
The paper compares three approaches to building regression models for such data: (i) the use of traditional Gaussian (Normal) regression models, (ii) the combined use of binary logistic regression and Gamma estimation, and (iii) the use of the so-called Tweedie distribution. A brief description is given of the Tweedie distribution and of the theoretical background for its use in modelling such data. The maximum likelihood estimation of the parameters of the Tweedie distribution is outlined and is shown to be straightforward to implement in a Generalised Linear Modelling environment.
The packaging consumption data are modelled using each of the three approaches, and the results are compared.
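
As a rough illustration of why a Tweedie distribution can accommodate a response that is exactly zero with positive probability and otherwise positive and continuous, the sketch below assumes the standard compound Poisson-gamma representation, which applies when the Tweedie power parameter p lies strictly between 1 and 2. The function names (tweedie_cp_params, poisson_draw, tweedie_draw) and the values of mu, phi and p are hypothetical choices made for illustration only; they are not taken from the paper or from the packaging data.

```python
import math
import random


def tweedie_cp_params(mu, phi, p):
    """Map (mean mu, dispersion phi, power p) to compound Poisson-gamma
    parameters. Assumes 1 < p < 2; other powers are not covered here."""
    if not 1.0 < p < 2.0:
        raise ValueError("this sketch only covers 1 < p < 2")
    lam = mu ** (2.0 - p) / (phi * (2.0 - p))   # Poisson rate
    shape = (2.0 - p) / (p - 1.0)               # gamma shape
    scale = phi * (p - 1.0) * mu ** (p - 1.0)   # gamma scale
    return lam, shape, scale


def poisson_draw(lam, rng):
    """Draw a Poisson count by Knuth's multiplication method
    (adequate for the small rates used in this sketch)."""
    threshold = math.exp(-lam)
    count, product = 0, rng.random()
    while product > threshold:
        count += 1
        product *= rng.random()
    return count


def tweedie_draw(mu, phi, p, rng):
    """One draw: a Poisson number of gamma-distributed amounts, summed.
    A count of zero gives a response of exactly zero."""
    lam, shape, scale = tweedie_cp_params(mu, phi, p)
    n = poisson_draw(lam, rng)
    return sum(rng.gammavariate(shape, scale) for _ in range(n))


rng = random.Random(0)          # fixed seed so the run is repeatable
mu, phi, p = 2.0, 1.5, 1.5      # illustrative values only
sample = [tweedie_draw(mu, phi, p, rng) for _ in range(20000)]

lam, _, _ = tweedie_cp_params(mu, phi, p)
zero_fraction = sum(1 for y in sample if y == 0.0) / len(sample)
sample_mean = sum(sample) / len(sample)

print("observed share of zeros:", zero_fraction)
print("theoretical P(Y = 0) = exp(-lambda):", math.exp(-lam))
print("sample mean:", sample_mean, "target mean:", mu)
```

Under these assumptions the simulated share of exact zeros should lie close to exp(-lambda) and the sample mean close to mu, which is the behaviour that lets a single distribution describe both the zeros and the positive values rather than requiring the two-part model of approach (ii).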