Characteristics-based heuristics to select a logical distribution between the Poisson-gamma and the Poisson-lognormal for crash data modelling

Pages 1791-1803 | Received 05 Aug 2018, Accepted 02 Jul 2019, Published online: 22 Jul 2019
 

Abstract

Several studies have shown that the Poisson-lognormal (PLN) is a better alternative to the Poisson-gamma (PG) when data are skewed, whereas the PG is the more reliable option otherwise. However, it is not clear when the analyst should shift from the PG to the PLN, or vice versa. In addition, the comparison has so far usually been made using goodness-of-fit statistics or statistical tests. Such metrics rarely give any intuition into why a specific distribution or model is preferred over another. This paper addresses these topics by (1) designing characteristics-based heuristics to select a distribution between the PG and PLN, and (2) prioritizing the most important summary statistics for selecting between these two options. The results show that the kurtosis and the percentage of zeros of the data are among the most important summary statistics needed to distinguish between these two options.
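
As a rough illustration of the two summary statistics named above, the following Python sketch simulates counts from a Poisson-gamma and a Poisson-lognormal mixture with the same mean and variance, then reports the sample kurtosis and the percentage of zeros for each. It is not the authors' heuristic or simulation protocol: the function names, the dispersion parameterization, the seed, and the chosen mean and dispersion values are all assumptions made for illustration.

```python
import numpy as np


def simulate_pg(mean, dispersion, size, rng):
    """Poisson-gamma counts: the Poisson rate is gamma distributed.

    Assumed parameterization: E[rate] = mean, Var[rate] = dispersion * mean**2.
    """
    rate = rng.gamma(shape=1.0 / dispersion, scale=mean * dispersion, size=size)
    return rng.poisson(rate)


def simulate_pln(mean, dispersion, size, rng):
    """Poisson-lognormal counts: the Poisson rate is lognormally distributed.

    sigma is chosen so the rate has the same mean and variance as in
    simulate_pg: Var[rate] = mean**2 * (exp(sigma**2) - 1).
    """
    sigma = np.sqrt(np.log1p(dispersion))
    mu = np.log(mean) - 0.5 * sigma**2  # gives E[rate] = mean
    rate = rng.lognormal(mean=mu, sigma=sigma, size=size)
    return rng.poisson(rate)


def summary_stats(counts):
    """Sample mean, variance, (non-excess) kurtosis and percentage of zeros."""
    counts = np.asarray(counts, dtype=float)
    centered = counts - counts.mean()
    variance = np.mean(centered**2)
    kurtosis = np.mean(centered**4) / variance**2
    return {
        "mean": counts.mean(),
        "variance": variance,
        "kurtosis": kurtosis,
        "pct_zeros": 100.0 * np.mean(counts == 0),
    }


# Hypothetical settings, chosen only for illustration.
rng = np.random.default_rng(12345)
pg_stats = summary_stats(simulate_pg(mean=1.0, dispersion=1.0, size=200_000, rng=rng))
pln_stats = summary_stats(simulate_pln(mean=1.0, dispersion=1.0, size=200_000, rng=rng))

for name, stats in (("PG", pg_stats), ("PLN", pln_stats)):
    print(name, {key: round(float(value), 3) for key, value in stats.items()})
```

Because the two samples are matched on mean and variance, any difference in kurtosis or in the share of zeros comes from the shape of the mixing distribution alone, which is the kind of contrast a characteristics-based comparison would look at.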

Acknowledgements

The authors would like to thank the Safe-D UTC center for their support throughout the completion of this research. We also would like to thank Dr. Soma Dhavala for sharing his valuable insights and comments with us. [Disclaimer: The contents of this paper reflect the views of the authors, who are responsible for the facts and the accuracy of the information presented herein. This document is disseminated in the interest of information exchange. The research was funded partially or entirely by a grant from the U.S. Department of Transportation's University Transportation Centers Program. However, the U.S. Government assumes no liability for the contents or use thereof.]

Disclosure statement

No potential conflict of interest was reported by the authors.

Notes

1 The goodness of logic terminology was first used in the work of Miaou and Lord (2003). The term implies that researchers and analysts should not solely select a model over another based on goodness of fit measures, but that they also need to look at the logic behind the selection of the ‘best model.’ More specifically, the model should appropriately characterize the crash generation process via the selected distribution, the functional form linking the number of crashes to the explanatory variables and how it relates to the boundary conditions.

2 We assumed that the mean of crash data varies from 0.1 to 20 in our simulation protocol. It is worth pointing out that there are instances where crash data have a larger mean. However, our analysis showed that in those situations the difference between the Poisson-gamma and the Poisson-lognormal becomes negligible, and both perform similarly when modelling the data.

Additional information

Funding

Support for this research was provided in part by a grant from the U.S. Department of Transportation, University Transportation Centers Program via the Safety through Disruption (Safe-D) University Transportation Center (UTC) [451453-19C36].
