ABSTRACT
An acute difficulty in wine-ratings-related research, and in calculating consensus among judges, is that each rating is one observation drawn from a latent probability distribution that is wine- and judge-specific. One observation is almost no data at all. Minimizing a weighted sum of cross entropies has been proposed as a method of estimating the shape of the latent distribution using one observation. That method was tested on 30 blind triplicate wine ratings. This article replicates that test, presents results of tests at hypothetical boundary conditions, and then presents test results for a set of 30 blind triplicate ratings published by Cicchetti [(2014, August). Blind tasting of South African wines: A tale of two methodologies (AAWE Working Paper No. 164, 15 pages)] and a set of 1599 blind triplicate ratings from the California State Fair. The test results show that the minimum cross entropy solution is substantially more accurate than the analysis of a single rating.
KEYWORDS:
Acknowledgements
The author thanks the Editor and reviewers for insightful and constructive comments.
Disclosure statement
No potential conflict of interest was reported by the author(s).
Notes
1 Shannon (Citation1948) defined information ($i$) about a bounded and discrete random variable ($X$) with probability $p(x)$ as $i(x) = -\log_2 p(x)$, and entropy ($H$) as the expectation of information about that variable, $H(X) = -\sum_x p(x)\log_2 p(x)$.
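As a brief illustration (not from the article), Shannon's definitions of information and entropy for a discrete random variable can be sketched in Python:

```python
import math

def information(p: float) -> float:
    """Shannon information (in bits) of an outcome with probability p."""
    return -math.log2(p)

def entropy(pmf: list[float]) -> float:
    """Entropy: the expectation of information over a discrete PMF.
    Terms with zero probability contribute nothing."""
    return sum(p * information(p) for p in pmf if p > 0)

# A uniform PMF over 4 categories has entropy log2(4) = 2 bits.
print(entropy([0.25, 0.25, 0.25, 0.25]))  # → 2.0
```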
2 The distribution $p$ in Equation (1) is sometimes described as the true or target distribution and $q$ as the estimate distribution, and minimization of cross entropy pulls the estimate toward the true distribution.
3 Kullback–Leibler divergence is $D_{KL}(p\,\|\,q) = \sum_x p(x)\log\frac{p(x)}{q(x)}$ and $D_{KL}(p\,\|\,q) = H(p,q) - H(p)$.
4 If , the minimization in Equation (2) solves to . If , is a one-hot vector and the minimization in Equation (2) solves to a distribution that skews between and . As the minimization in Equation (2) solves to a PMF where . For a categorical PMF with a probability parameter for every category, that minimization solves to precisely .
5 The author thanks Robert Hodgson for providing the data.
6 The form of probability mass function employed for the ‘true’ distribution and for is a bounded, discrete, Gaussian function. See for example Bodington (Citation2022c). The parameters in that PMF are the mean and standard deviation of an unbounded Gaussian, and those parameters are estimated by minimizing the cross entropy between the observed distribution and the bounded, discrete Gaussian.
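One way to realize a bounded, discrete Gaussian PMF of the kind described in this note is to evaluate an unbounded Gaussian density at each integer score and renormalize. The sketch below is a minimal illustration under that assumption; the parameter values and the 1–5 score range are hypothetical, and the fitting step (minimizing cross entropy over the mean and standard deviation) is not shown:

```python
import math

def bounded_discrete_gaussian(mu: float, sigma: float, scores=range(1, 6)):
    """PMF over a bounded, discrete score range: evaluate an unbounded
    Gaussian density at each score, then renormalize so masses sum to 1."""
    weights = [math.exp(-((s - mu) ** 2) / (2.0 * sigma ** 2)) for s in scores]
    total = sum(weights)
    return [w / total for w in weights]

pmf = bounded_discrete_gaussian(mu=3.0, sigma=1.0)
assert abs(sum(pmf) - 1.0) < 1e-12   # a valid PMF
assert pmf[2] == max(pmf)            # mode at score 3 when mu = 3
```

In a fitting procedure, `mu` and `sigma` would be chosen to minimize the cross entropy between the observed distribution and this PMF, e.g. by grid or numerical search.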
7 For example, for a rating of 3 within a score range of 1–5, the one-hot bounded and discrete probability distribution is (0, 0, 1, 0, 0).
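A small sketch of the one-hot construction in this note, with the additional (standard) observation that cross entropy against a one-hot 'true' PMF reduces to the negative log of the estimate at the observed score; the estimate `q` below is hypothetical:

```python
import math

def one_hot(rating: int, scores=range(1, 6)):
    """One-hot PMF for a single observed rating on a bounded score range."""
    return [1.0 if s == rating else 0.0 for s in scores]

print(one_hot(3))  # → [0.0, 0.0, 1.0, 0.0, 0.0]

# Against a one-hot 'true' PMF, cross entropy collapses to -log2 q
# at the observed score.
q = [0.1, 0.2, 0.4, 0.2, 0.1]  # hypothetical estimate distribution
ce = -sum(p * math.log2(qi) for p, qi in zip(one_hot(3), q) if p > 0)
assert abs(ce - (-math.log2(0.4))) < 1e-12
```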
8 Bodington (Citation2022a) calculated error using cross entropy and reported that it was less than 20%.