Research Article

Locally efficient semiparametric estimator for zero-inflated Poisson model with error-prone covariates

Pages 1092-1107 | Received 23 Apr 2020, Accepted 19 Oct 2020, Published online: 18 Nov 2020
 

Abstract

Overdispersion is a common phenomenon when count or frequency responses are fitted with Poisson models, for example the number of car accidents on a highway over a one-year period. A similar phenomenon is observed in electric power systems, where cascading failures often follow a distribution with inflated zeros. When the response contains an excess of zeros, the zero-inflated Poisson (ZIP) model is the most favourable choice. However, in many disciplines some of the covariates cannot be accessed directly during data collection or are measured with error. To the best of our knowledge, little existing work in the literature tackles population heterogeneity in the count response while some of the covariates are measured with error. With the increasing popularity of such outcomes in modern studies, it is interesting and timely to study zero-inflated Poisson models in which some of the covariates are subject to measurement error while others are not. We propose a flexible partial linear single index model for the log Poisson mean to correct bias potentially due to the error in covariates or the population heterogeneity. We derive consistent and locally efficient semiparametric estimators and study their large sample properties. We further assess the finite sample performance through simulation studies. Finally, we apply the proposed method to a real data set and compare it with existing methods that handle measurement error in covariates.
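As a minimal illustration of why excess zeros produce overdispersion, the sketch below writes out the standard ZIP probability mass function and its first two moments. It is not the estimator proposed in the article: the function names and the parameter names `lam` (Poisson mean) and `pi` (probability of a structural zero) are chosen here for illustration, and both parameters are held fixed rather than modelled through covariates.

```python
import math


def zip_pmf(y, lam, pi):
    """P(Y = y) under a zero-inflated Poisson model.

    With probability pi the response is a structural zero; otherwise it is
    drawn from Poisson(lam). Names are illustrative, not from the article.
    """
    poisson_part = math.exp(-lam) * lam**y / math.factorial(y)
    return pi * (y == 0) + (1.0 - pi) * poisson_part


def zip_mean_var(lam, pi):
    """Mean and variance of the ZIP distribution."""
    mean = (1.0 - pi) * lam
    var = (1.0 - pi) * lam * (1.0 + pi * lam)
    return mean, var


if __name__ == "__main__":
    mean, var = zip_mean_var(lam=2.0, pi=0.3)
    # Whenever pi > 0 the variance exceeds the mean, which a plain
    # Poisson model (variance equal to mean) cannot reproduce.
    print(f"mean={mean:.3f}, variance={var:.3f}")
    print(f"P(Y=0) under ZIP: {zip_pmf(0, 2.0, 0.3):.3f}")
    print(f"P(Y=0) under Poisson: {math.exp(-2.0):.3f}")
```

The variance formula makes the link explicit: the factor 1 + pi * lam is greater than one for any positive zero-inflation probability.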

Disclosure statement

No potential conflict of interest was reported by the authors.

Notes

Notes: ‘Correct (Pois)’ refers to the results in Liu and Ma [7], which model Y correctly according to the data generating process, while ‘Misspecified (ZIP)’ models Y as zero-inflated Poisson, which is incorrect for the data generating process. ‘Local 1’ uses a posited η(δ)=δ³; ‘Local 2’ uses a posited η(δ)=δsin(δ). RC Normal is regression calibration where E(X|W) is calculated under a normal distribution. RC Uniform is regression calibration where E(X|W) is calculated under a uniform distribution. The truth is β=1.1. The dimension of Z is 3. For each method we report the mean, sample standard deviation (emp.sd), the average of the estimated standard deviation (est.sd) and the coverage of the estimated 95% confidence interval (95% CI).

Notes: ‘Local 1’ uses a posited η(δ)=δ³; ‘Local 2’ uses a posited η(δ)=δsin(δ). RC Normal is regression calibration where E(X|W) is calculated under a normal distribution. RC Uniform is regression calibration where E(X|W) is calculated under a uniform distribution. The truth is β=1.1. The dimension of Z is 10. For each method we report the mean, sample standard deviation (emp.sd), the average of the estimated standard deviation (est.sd) and the coverage of the estimated 95% confidence interval (95% CI).

Notes: Y is generated from a zero-inflated negative binomial distribution. ‘Local 1’ uses a posited η(δ)=δ³; ‘Local 2’ uses a posited η(δ)=δsin(δ). RC Normal is regression calibration where E(X|W) is calculated under a normal distribution. RC Uniform is regression calibration where E(X|W) is calculated under a uniform distribution. The truth is β=0.4. The dimension of Z is 10. For each method we report the mean, sample standard deviation (emp.sd), the average of the estimated standard deviation (est.sd) and the coverage of the estimated 95% confidence interval (95% CI).

Notes: Sample size is 100, dimension of Z is 20. The truth is β=1.1. ‘Local 1’ uses a posited η(δ)=δ³; ‘Local 2’ uses a posited η(δ)=δsin(δ). RC Normal is regression calibration where E(X|W) is calculated under a normal distribution. RC Uniform is regression calibration where E(X|W) is calculated under a uniform distribution. For each method we report the mean, sample standard deviation (emp.sd), the average of the estimated standard deviation (est.sd) and the coverage of the estimated 95% confidence interval (95% CI).
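The four summaries named in the simulation notes (mean, emp.sd, est.sd and 95% CI coverage) can be computed from per-replicate output roughly as follows. This is a sketch under stated assumptions, not the authors' code: the function name `summarize` is hypothetical, and the interval is assumed to be a Wald-type interval (estimate ± 1.96 × estimated standard deviation), which the notes do not specify.

```python
import statistics


def summarize(estimates, est_sds, truth, z=1.96):
    """Summarise simulation replicates of a scalar estimator.

    estimates: point estimate of beta from each replicate
    est_sds:   estimated standard deviation from each replicate
    truth:     true beta used to generate the data
    z:         normal quantile for the interval (assumption: Wald-type CI)
    """
    covered = [abs(b - truth) <= z * s for b, s in zip(estimates, est_sds)]
    return {
        "mean": statistics.fmean(estimates),
        "emp.sd": statistics.stdev(estimates),   # sample sd across replicates
        "est.sd": statistics.fmean(est_sds),     # average estimated sd
        "95% CI": sum(covered) / len(covered),   # empirical coverage
    }


if __name__ == "__main__":
    # Toy numbers for illustration only; they are not results from the article.
    print(summarize([1.0, 1.2], [0.1, 0.04], truth=1.1))
```

Close agreement between emp.sd and est.sd, together with coverage near 0.95, is what indicates that the variance estimator is behaving well in finite samples.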

Notes: LL and UL denote the lower limit and upper limit of the 95% confidence interval. ‘ZIP’ indicates fitting via a zero-inflated Poisson model. ‘Pois’ indicates fitting via a Poisson model. ‘Naive’ ignores the measurement error. ‘RC Normal’ is regression calibration where E(X|W) is calculated under a normal distribution for X. ‘RC Uniform’ is regression calibration where E(X|W) is calculated under a uniform distribution for X. ‘Local 1’ and ‘Local 2’ adopt working models η(δ)=sin²(δ) and η(δ)=δcos(δ) for E(X|δ,Z), respectively.
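For readers unfamiliar with the ‘RC Normal’ comparator, the calibration step can be sketched as below. The sketch assumes a classical additive error model W = X + U with X normal, U normal with mean zero and independent of X, and all moments known; the page does not state the error model, so these are assumptions, and the function name `rc_normal` is hypothetical.

```python
def rc_normal(w, mu_x, var_x, var_u):
    """Regression-calibration values E(X | W = w) under normality.

    Assumes W = X + U with X ~ N(mu_x, var_x) and U ~ N(0, var_u)
    independent of X (an assumption, not stated in the source).
    """
    reliability = var_x / (var_x + var_u)  # shrinkage factor in (0, 1]
    return [mu_x + reliability * (wi - mu_x) for wi in w]


if __name__ == "__main__":
    # Each observed W is shrunk toward the mean of X before being used
    # in place of the unobserved X in the outcome model.
    print(rc_normal([2.0, -1.0, 0.5], mu_x=0.0, var_x=1.0, var_u=1.0))
```

The calibrated values then replace the error-prone covariate in the fitted ZIP or Poisson model; ‘RC Uniform’ follows the same idea with E(X|W) computed under a uniform distribution for X instead.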

Additional information

Funding

This work was supported by Syracuse University [CUSE Grant].
