ABSTRACT
This work characterizes the dispersion of some popular random probability measures, including the bootstrap, the Bayesian bootstrap, and the Pólya tree prior. Dispersion is measured by the variation of the Kullback–Leibler divergence between a random draw from the process and its baseline centring measure. By giving a quantitative expression for this dispersion around the baseline distribution, our work provides insight for comparing different parameterizations of the models and for setting prior parameters in applied Bayesian analysis. The results also highlight some limitations of the existing canonical choice of parameter settings in the Pólya tree process.
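As a rough illustration of the kind of dispersion summary described above, the following sketch estimates by Monte Carlo the mean and variance of the Kullback–Leibler divergence between random weights and a uniform baseline on n atoms, for the bootstrap and the Bayesian bootstrap. It is a minimal sketch under stated assumptions, not the authors' method: the direction KL(w || uniform), the values n = 50 and 2000 draws, and all function names are illustrative choices. The closed form used as a check, log n + 1 - H_n with H_n the n-th harmonic number, follows from standard Beta moments and is not quoted from the paper.

```python
import math
import random

def kl_divergence(p, q):
    """KL(p || q) for discrete distributions, using the convention 0 * log 0 = 0."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0.0)

def bootstrap_weights(n, rng):
    """Weights N_i / n, where (N_1, ..., N_n) ~ Multinomial(n; 1/n, ..., 1/n)."""
    counts = [0] * n
    for _ in range(n):
        counts[rng.randrange(n)] += 1
    return [c / n for c in counts]

def bayesian_bootstrap_weights(n, rng):
    """Dirichlet(1, ..., 1) weights, built by normalising i.i.d. Exp(1) draws."""
    g = [rng.expovariate(1.0) for _ in range(n)]
    total = sum(g)
    return [gi / total for gi in g]

def kl_summary(sampler, n, draws, rng):
    """Monte Carlo mean and sample variance of KL(w || uniform)."""
    uniform = [1.0 / n] * n
    vals = [kl_divergence(sampler(n, rng), uniform) for _ in range(draws)]
    mean = sum(vals) / draws
    var = sum((v - mean) ** 2 for v in vals) / (draws - 1)
    return mean, var

rng = random.Random(0)   # fixed seed so the sketch is reproducible
n, draws = 50, 2000      # illustrative sizes, not taken from the paper

boot_mean, boot_var = kl_summary(bootstrap_weights, n, draws, rng)
bb_mean, bb_var = kl_summary(bayesian_bootstrap_weights, n, draws, rng)

# Exact mean for the Bayesian bootstrap in this direction: log n + 1 - H_n
bb_exact = math.log(n) + 1.0 - sum(1.0 / k for k in range(1, n + 1))

print("bootstrap:          mean", boot_mean, "variance", boot_var)
print("Bayesian bootstrap: mean", bb_mean, "variance", bb_var, "exact mean", bb_exact)
```

The reverse direction, KL(uniform || w), is infinite for any bootstrap draw that assigns zero weight to an atom, which is why this sketch uses KL(w || uniform) for both models.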
Acknowledgments
We are grateful to Judith Rousseau for helpful comments. This work was done whilst Nieto-Barajas was visiting the Department of Statistics at the University of Oxford.
Disclosure statement
No potential conflict of interest was reported by the authors.
Notes
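1. The digamma function is defined as the logarithmic derivative of the gamma function, that is, $\psi_0(x) = \frac{d}{dx}\log\Gamma(x) = \Gamma'(x)/\Gamma(x)$. In similar fashion, the trigamma function is defined as the second derivative of the log-gamma function, $\psi_1(x) = \frac{d^2}{dx^2}\log\Gamma(x)$.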
2. The variance of each element is defined in terms of its first and second moments, and we rely on independence properties to compute them. Working out the algebra with patience and noting that ,
,
,
and
, the result is obtained.
3. Figure appears to show that remains constant, but this is an artefact due to the scale.
4. We use this notation to emphasize that represents a random probability mass function whose components take values in the set $\{0, 1/n, 2/n, \ldots, 1\}$. A factor of n is needed for the vector to be distributed according to a multinomial distribution.
5. Interestingly, the original work considers only this special case.
6. ,
,
,
,
,
.
7. Proposition 3 of Ferguson [4] states that any fixed density (measure) g absolutely continuous with respect to can be arbitrarily approximated pointwise with a draw f from a Dirichlet process.
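The definitions in Note 1 can be checked numerically. The sketch below approximates the digamma and trigamma functions by finite differences of the log-gamma function and compares them with the standard values at x = 1, namely minus the Euler–Mascheroni constant and π²/6. The function names and step sizes are illustrative assumptions. This is a sanity check of the definitions, not a recommended way to evaluate these functions in practice.

```python
import math

EULER_GAMMA = 0.5772156649015329  # Euler–Mascheroni constant

def psi0_numeric(x, h=1e-5):
    """Central first difference of log Gamma: approximates the digamma function (x > h)."""
    return (math.lgamma(x + h) - math.lgamma(x - h)) / (2.0 * h)

def psi1_numeric(x, h=1e-4):
    """Central second difference of log Gamma: approximates the trigamma function (x > h)."""
    return (math.lgamma(x + h) - 2.0 * math.lgamma(x) + math.lgamma(x - h)) / (h * h)

digamma_at_1 = psi0_numeric(1.0)    # should be close to -EULER_GAMMA
trigamma_at_1 = psi1_numeric(1.0)   # should be close to pi^2 / 6

print("digamma(1)  approx", digamma_at_1, "reference", -EULER_GAMMA)
print("trigamma(1) approx", trigamma_at_1, "reference", math.pi ** 2 / 6)
```

The same helper can be used to check the recurrence ψ(x + 1) = ψ(x) + 1/x, for example by comparing `psi0_numeric(3.0) - psi0_numeric(2.0)` with 0.5.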