
A maximum entropy estimate of a wine rating’s distribution: results of tests using large samples of blind triplicates

Pages 68-74 | Received 22 Feb 2023, Accepted 22 Oct 2023, Published online: 11 Feb 2024

ABSTRACT

An acute difficulty in wine-ratings-related research, and in calculating consensus among judges, is that each rating is one observation drawn from a latent probability distribution that is wine- and judge-specific. One observation is almost no data at all. Minimizing a weighted sum of cross entropies has been proposed as a method of estimating the shape of the latent distribution using one observation. That method was tested on 30 blind triplicate wine ratings. This article replicates that test, presents results of tests at hypothetical boundary conditions, and then presents test results for a set of 30 blind triplicate ratings published by Cicchetti [(2014, August). Blind tasting of South African wines: A tale of two methodologies (AAWE Working Paper No. 164, 15 pages)] and a set of 1599 blind triplicate ratings from the California State Fair. The test results show that the minimum cross entropy solution is substantially more accurate than the analysis of a single rating.

Acknowledgements

The author thanks the Editor and reviewers for insightful and constructive comments.

Disclosure statement

No potential conflict of interest was reported by the author(s).

Notes

1 Shannon (1948) defined information ($i$) about a bounded and discrete random variable ($\min \le x \le \max$) with probability $p(x)$ as $i(p(x)) = \ln(1/p(x)) = -\ln(p(x))$, and entropy ($H$) as the expectation of information about that variable, $H(p) = -\sum_{x=\min}^{\max} p(x)\ln(p(x))$.
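
As a brief numerical illustration of these definitions, the following sketch (Python, with an assumed five-point PMF that is not taken from the article's data) computes the information of each score and the entropy of the distribution.

```python
import numpy as np

# Assumed example: a PMF over the five scores 1-5 (illustrative only).
p = np.array([0.05, 0.15, 0.40, 0.30, 0.10])

# Information of each outcome: i(p(x)) = ln(1/p(x)) = -ln(p(x)), in nats.
information = -np.log(p)

# Entropy: the expectation of information, H(p) = -sum_x p(x) ln p(x).
entropy = np.sum(p * information)

print(information)
print(entropy)  # about 1.39 nats for this example PMF
```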

2 The distribution $p$ in Equation (1) is sometimes described as the true or target distribution and $q$ as the estimate distribution, and minimization of cross entropy pulls the estimate toward the true.

3 Kullback–Leibler divergence is $D_{KL}(p,q) = \sum_{x=\min}^{\max} p(x)\ln\big(p(x)/q(x)\big)$, and $I(p,q) = H(p) + D_{KL}(p,q)$.
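
The identity $I(p,q) = H(p) + D_{KL}(p,q)$ can be checked numerically; the sketch below (Python, with assumed example PMFs rather than the article's data) computes the cross entropy, entropy and divergence directly and confirms the decomposition.

```python
import numpy as np

# Assumed example PMFs over five scores (illustrative only).
p = np.array([0.05, 0.15, 0.40, 0.30, 0.10])  # 'true' or target distribution
q = np.array([0.20, 0.20, 0.20, 0.20, 0.20])  # estimate distribution

cross_entropy = -np.sum(p * np.log(q))      # I(p, q)
entropy       = -np.sum(p * np.log(p))      # H(p)
kl_divergence =  np.sum(p * np.log(p / q))  # D_KL(p, q)

# Cross entropy decomposes as entropy plus Kullback-Leibler divergence.
assert np.isclose(cross_entropy, entropy + kl_divergence)
print(cross_entropy, entropy, kl_divergence)
```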

4 If $n=0$, the minimization in Equation (2) solves to $\hat{p}=u$. If $n=1$, $q|x_o$ is a one-hot vector and the minimization in Equation (2) solves to a distribution that skews between $u$ and $q|x_o$. As $n \to \infty$, the minimization in Equation (2) solves to a PMF where $\hat{p} \to q|x_o$. For a categorical PMF with a probability parameter for every category, that minimization solves to precisely $\hat{p} = q|x_o$.
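
The boundary behaviour described in this note can be reproduced with a small sketch. The objective below is an assumed form of a weighted sum of cross entropies, in which the uniform PMF $u$ carries unit weight and the one-hot observation $q|x_o$ carries weight $n$, and $\hat{p}$ is an unconstrained categorical PMF; it is not reproduced from the article's Equation (2), which constrains $\hat{p}$ to a bounded, discrete Gaussian.

```python
import numpy as np

u   = np.full(5, 0.2)                 # uniform PMF over the scores 1-5
q_o = np.array([0., 0., 1., 0., 0.])  # one-hot PMF for an observed rating of 3

def p_hat(n):
    """Minimizer of the assumed objective I(u, p) + n * I(q_o, p) over categorical p.

    Minimizing -(u + n*q_o) . ln(p) subject to sum(p) = 1 has the closed-form
    solution p = (u + n*q_o) / (1 + n), a weighted mixture of u and q_o.
    """
    return (u + n * q_o) / (1.0 + n)

print(p_hat(0))    # n = 0: the uniform PMF u
print(p_hat(1))    # n = 1: skews between u and q_o -> [0.1, 0.1, 0.6, 0.1, 0.1]
print(p_hat(1e6))  # n -> infinity: approaches the one-hot q_o
```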

5 The author thanks Robert Hodgson for providing the data.

6 The form of probability mass function employed for the ‘true’ distribution and for $\hat{p}$ is a bounded, discrete, Gaussian function. See, for example, Bodington (2022c). The parameters in that PMF are the mean and standard deviation of an unbounded Gaussian, and those parameters are estimated by minimizing the cross entropy between the observed distribution and the bounded, discrete Gaussian.
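
A minimal sketch of one way to construct such a PMF is given below (Python); the score range, mean and standard deviation are assumed for illustration, and the construction shown (evaluating the Gaussian density at each allowed score and renormalizing) is one common choice rather than necessarily the article's exact formulation.

```python
import numpy as np

def bounded_discrete_gaussian(mu, sigma, scores=np.arange(1, 6)):
    """Bounded, discrete Gaussian PMF over integer scores.

    Evaluates an unbounded Gaussian density, with mean mu and standard deviation
    sigma, at each allowed score and renormalizes so the probabilities sum to one.
    """
    density = np.exp(-0.5 * ((scores - mu) / sigma) ** 2)
    return density / density.sum()

# Illustrative parameters (not fitted to any of the article's data).
print(bounded_discrete_gaussian(mu=3.4, sigma=0.8))
```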

7 For example, for a rating of 3 within a score range of 1–5, the one-hot bounded and discrete probability distribution is (0, 0, 1, 0, 0).

8 Bodington (2022a) calculated error using cross entropy and reported errors of less than 20%.
