
A maximum entropy estimate of a wine rating’s distribution: results of tests using large samples of blind triplicates

Pages 68-74 | Received 22 Feb 2023, Accepted 22 Oct 2023, Published online: 11 Feb 2024

ABSTRACT

An acute difficulty in wine-ratings-related research, and in calculating consensus among judges, is that each rating is one observation drawn from a latent probability distribution that is wine- and judge-specific. One observation is almost no data at all. Minimizing a weighted sum of cross entropies has been proposed as a method of estimating the shape of the latent distribution using one observation. That method was tested on 30 blind triplicate wine ratings. This article replicates that test, presents results of tests at hypothetical boundary conditions, and then presents test results for a set of 30 blind triplicate ratings published by Cicchetti [(2014, August). Blind tasting of South African wines: A tale of two methodologies (AAWE Working Paper No. 164, 15 pages)] and a set of 1599 blind triplicate ratings from the California State Fair. The test results show that the minimum cross entropy solution is substantially more accurate than the analysis of a single rating.

Acknowledgements

The author thanks the Editor and reviewers for insightful and constructive comments.

Disclosure statement

No potential conflict of interest was reported by the author(s).

Notes

1 Shannon (1948) defined information ($i$) about a bounded and discrete random variable ($\min \le x \le \max$) with probability $p(x)$ as $i(p(x)) = \ln(1/p(x)) = -\ln(p(x))$, and entropy ($H$) as the expectation of information about that variable, $H(p) = -\sum_{x=\min}^{\max} p(x)\ln(p(x))$.
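
As a brief numerical illustration of these definitions, the following sketch (Python, with an assumed five-point PMF that is not taken from the article's data) computes the information of each score and the entropy of the distribution.

```python
import numpy as np

# Assumed example: a PMF over the five scores 1-5 (illustrative only).
p = np.array([0.05, 0.15, 0.40, 0.30, 0.10])

# Information of each outcome: i(p(x)) = ln(1/p(x)) = -ln(p(x)), in nats.
information = -np.log(p)

# Entropy: the expectation of information, H(p) = -sum_x p(x) ln p(x).
entropy = np.sum(p * information)

print(information)
print(entropy)  # about 1.39 nats for this example PMF
```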

2 The distribution $p$ in Equation (1) is sometimes described as the true or target distribution and $q$ as the estimate distribution, and minimization of cross entropy pulls the estimate toward the true.

3 Kullback–Leibler divergence is $D_{KL}(p,q) = \sum_{x=\min}^{\max} p(x)\ln\big(p(x)/q(x)\big)$, and $I(p,q) = H(p) + D_{KL}(p,q)$.
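
The identity $I(p,q) = H(p) + D_{KL}(p,q)$ can be checked numerically; the sketch below (Python, with assumed example PMFs rather than the article's data) computes the cross entropy, entropy and divergence directly and confirms the decomposition.

```python
import numpy as np

# Assumed example PMFs over five scores (illustrative only).
p = np.array([0.05, 0.15, 0.40, 0.30, 0.10])  # 'true' or target distribution
q = np.array([0.20, 0.20, 0.20, 0.20, 0.20])  # estimate distribution

cross_entropy = -np.sum(p * np.log(q))      # I(p, q)
entropy       = -np.sum(p * np.log(p))      # H(p)
kl_divergence =  np.sum(p * np.log(p / q))  # D_KL(p, q)

# Cross entropy decomposes as entropy plus Kullback-Leibler divergence.
assert np.isclose(cross_entropy, entropy + kl_divergence)
print(cross_entropy, entropy, kl_divergence)
```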

4 If $n=0$, the minimization in Equation (2) solves to $\hat{p}=u$. If $n=1$, $q|x_o$ is a one-hot vector and the minimization in Equation (2) solves to a distribution that skews between $u$ and $q|x_o$. As $n \to \infty$, the minimization in Equation (2) solves to a PMF where $\hat{p} \to q|x_o$. For a categorical PMF with a probability parameter for every category, that minimization solves to precisely $\hat{p} = q|x_o$.
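
The boundary behaviour described in this note can be reproduced with a small sketch. The objective below is an assumed form of a weighted sum of cross entropies, in which the uniform PMF $u$ carries unit weight and the one-hot observation $q|x_o$ carries weight $n$, and $\hat{p}$ is an unconstrained categorical PMF; it is not reproduced from the article's Equation (2), which constrains $\hat{p}$ to a bounded, discrete Gaussian.

```python
import numpy as np

u   = np.full(5, 0.2)                 # uniform PMF over the scores 1-5
q_o = np.array([0., 0., 1., 0., 0.])  # one-hot PMF for an observed rating of 3

def p_hat(n):
    """Minimizer of the assumed objective I(u, p) + n * I(q_o, p) over categorical p.

    Minimizing -(u + n*q_o) . ln(p) subject to sum(p) = 1 has the closed-form
    solution p = (u + n*q_o) / (1 + n), a weighted mixture of u and q_o.
    """
    return (u + n * q_o) / (1.0 + n)

print(p_hat(0))    # n = 0: the uniform PMF u
print(p_hat(1))    # n = 1: skews between u and q_o -> [0.1, 0.1, 0.6, 0.1, 0.1]
print(p_hat(1e6))  # n -> infinity: approaches the one-hot q_o
```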

5 The author thanks Robert Hodgson for providing the data.

6 The form of probability mass function employed for the ‘true’ distribution and for $\hat{p}$ is a bounded, discrete, Gaussian function. See, for example, Bodington (2022c). The parameters in that PMF are the mean and standard deviation of an unbounded Gaussian, and those parameters are estimated by minimizing the cross entropy between the observed distribution and the bounded, discrete Gaussian.
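
A minimal sketch of one way to construct such a PMF is given below (Python); the score range, mean and standard deviation are assumed for illustration, and the construction shown (evaluating the Gaussian density at each allowed score and renormalizing) is one common choice rather than necessarily the article's exact formulation.

```python
import numpy as np

def bounded_discrete_gaussian(mu, sigma, scores=np.arange(1, 6)):
    """Bounded, discrete Gaussian PMF over integer scores.

    Evaluates an unbounded Gaussian density, with mean mu and standard deviation
    sigma, at each allowed score and renormalizes so the probabilities sum to one.
    """
    density = np.exp(-0.5 * ((scores - mu) / sigma) ** 2)
    return density / density.sum()

# Illustrative parameters (not fitted to any of the article's data).
print(bounded_discrete_gaussian(mu=3.4, sigma=0.8))
```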

7 For example, for a rating of 3 within a score range of 1–5, the one-hot bounded and discrete probability distribution is (0, 0, 1, 0, 0).

8 Bodington (2022a) calculated error using cross entropy and reported errors of less than 20%.
