ABSTRACT
This letter deals with a test on forecast bias in predicting independent binary outcomes, where the outcomes are either 1 or 0, and the predictions are probabilities. The test concerns two parameter restrictions in a simple logit model. Size-corrected simulation experiments show that the test has high power, even in small samples.
I. Introduction and motivation
This letter deals with a test on forecast bias in predicting independent binary outcomes, where the outcomes are either 1 or 0, and the predictions are probabilities. There is no need to know how the predictions were created, that is, the predictions can be based on a logit model (Cramer 1991), a probit model, a linear probability model, or expert judgement.
In a standard regression model for continuous outcomes, one can consider the auxiliary regression which links the predictions with the realizations. If the realizations $y_i$ and the forecasts $\hat{y}_i$ were continuous variables, which can be cross-sectional data or time series data, and the forecast sample is $i = 1, 2, \ldots, N$, then one can examine bias using the auxiliary regression

$$y_i = \alpha + \beta \hat{y}_i + \varepsilon_i$$

The parameters are estimated using Ordinary Least Squares. The Wald type test of interest concerns the hypothesis that $\alpha = 0$ and $\beta = 1$, jointly. Under the null hypothesis, there is no forecast bias. This regression is called the Mincer Zarnowitz regression, see Mincer and Zarnowitz (1969).
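A minimal sketch of the Mincer Zarnowitz test for continuous outcomes may help fix ideas. It estimates the auxiliary regression by Ordinary Least Squares and computes the Wald statistic for the joint restriction that the intercept is 0 and the slope is 1. The function name and the demo numbers are hypothetical and are not taken from this letter. Under the null hypothesis and standard regression assumptions, the statistic is asymptotically chi-squared with two degrees of freedom.

```python
def mincer_zarnowitz_wald(y, yhat):
    """OLS fit of y = alpha + beta * yhat + error, plus the Wald statistic
    for the joint hypothesis alpha = 0 and beta = 1."""
    n = len(y)
    mean_x = sum(yhat) / n
    mean_y = sum(y) / n
    sxx = sum((x - mean_x) ** 2 for x in yhat)
    sxy = sum((x - mean_x) * (v - mean_y) for x, v in zip(yhat, y))
    beta = sxy / sxx
    alpha = mean_y - beta * mean_x

    # Residual variance with the usual degrees-of-freedom correction.
    resid = [v - alpha - beta * x for x, v in zip(yhat, y)]
    s2 = sum(e * e for e in resid) / (n - 2)

    # Wald = d' (X'X) d / s2, with d the distance from the null values (0, 1)
    # and X'X = [[n, sum x], [sum x, sum x^2]].
    d0, d1 = alpha, beta - 1.0
    sum_x = sum(yhat)
    sum_x2 = sum(x * x for x in yhat)
    quad = n * d0 * d0 + 2.0 * sum_x * d0 * d1 + sum_x2 * d1 * d1
    return alpha, beta, quad / s2


# Hypothetical demo data: forecasts and realizations that track each other closely.
yhat_demo = [1.0, 2.0, 3.0, 4.0, 5.0]
y_demo = [1.1, 1.9, 3.2, 3.8, 5.0]
alpha_hat, beta_hat, wald = mincer_zarnowitz_wald(y_demo, yhat_demo)
print(alpha_hat, beta_hat, wald)
```

Adding a constant to every realization leaves the slope unchanged but moves the intercept away from 0, which inflates the Wald statistic.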
In this letter, I propose a similar test but now for independent binary outcomes, that is, there are realizations that can be either 1 or 0, and the predictions are estimated probabilities that the outcome is 1. The question is whether these probabilities are unbiased. Note that if the predictions are also 1 or 0, one can resort to variants of tests on hit rates, see for example Franses and Paap (2001, page 65), but a test for the hit rate is not the focus here. The new test turns out to be as easy to apply as the test based on the Mincer Zarnowitz regression. Power simulations show that the test works quite well. The test for forecast bias is illustrated using the 2018 Goldman Sachs predictions of which football teams would make it to the second round of the 2018 World Cup in Russia.
II. The main idea
Consider $N$ forecasts $\hat{p}_i$ and $N$ observations $y_i$, where $y_i$ can take values 1 or 0, and where the forecasts are numbers in between 0 and 1. An example is the dataset in Table 1, which refers to the Goldman Sachs forecasts. The interest is to see if there is forecast bias.
Table 1. Realizations and forecasts concerning surviving the first round of the 2018 World Cup in Russia. Data source is Exhibit 2 of the 11 June 2018 Global Macro Research report of the Goldman Sachs Group, Inc
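The key identity to design the test is

$$\hat{p}_i = \frac{\exp\left(\log\frac{\hat{p}_i}{1-\hat{p}_i}\right)}{1+\exp\left(\log\frac{\hat{p}_i}{1-\hat{p}_i}\right)} = \frac{\frac{\hat{p}_i}{1-\hat{p}_i}}{1+\frac{\hat{p}_i}{1-\hat{p}_i}}$$

where simple algebra gives

$$\frac{\frac{\hat{p}_i}{1-\hat{p}_i}}{1+\frac{\hat{p}_i}{1-\hat{p}_i}} = \frac{\hat{p}_i}{(1-\hat{p}_i)+\hat{p}_i} = \hat{p}_i$$

The middle term can be recognized as the expression for the logit model (Franses and Paap 2001, page 54), that for a single variable $x_i$ is given by

$$P(y_i = 1 \mid x_i) = \frac{\exp(\alpha + \beta x_i)}{1+\exp(\alpha + \beta x_i)}$$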
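Hence, a Mincer Zarnowitz type test for the null hypothesis of no forecast bias can be based on the logit model

$$P(y_i = 1) = F\left(\alpha + \beta \log\frac{\hat{p}_i}{1-\hat{p}_i}\right) \qquad (1)$$

with $F$ the logistic function, and on the Wald test for $\alpha = 0$ and $\beta = 1$, jointly.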
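The test can be sketched in a few lines of plain Python. The sketch assumes that the regressor in the logit model is the log-odds of the forecast, $\log(\hat{p}_i/(1-\hat{p}_i))$, that all forecasts lie strictly between 0 and 1, and that the outcomes are not perfectly separated by the forecasts (otherwise the Maximum Likelihood estimates do not exist). The function names and the demo data are hypothetical. The parameters are estimated with Newton-Raphson steps, and the Wald statistic uses the information matrix at the estimates.

```python
import math


def _score_and_information(a, b, y, x):
    """Gradient and information matrix of the logit log-likelihood at (a, b)."""
    g0 = g1 = h00 = h01 = h11 = 0.0
    for yi, xi in zip(y, x):
        f = 1.0 / (1.0 + math.exp(-(a + b * xi)))  # logistic function F
        w = f * (1.0 - f)
        g0 += yi - f
        g1 += (yi - f) * xi
        h00 += w
        h01 += w * xi
        h11 += w * xi * xi
    return g0, g1, h00, h01, h11


def logit_bias_wald(y, p, max_iter=100, tol=1e-12):
    """Fit P(y = 1) = F(alpha + beta * log(p / (1 - p))) by Newton-Raphson and
    return (alpha, beta, Wald statistic for alpha = 0 and beta = 1)."""
    x = [math.log(q / (1.0 - q)) for q in p]  # log-odds of the forecasts
    a, b = 0.0, 1.0  # start at the values under the null hypothesis
    for _ in range(max_iter):
        g0, g1, h00, h01, h11 = _score_and_information(a, b, y, x)
        det = h00 * h11 - h01 * h01
        step_a = (h11 * g0 - h01 * g1) / det
        step_b = (h00 * g1 - h01 * g0) / det
        a, b = a + step_a, b + step_b
        if abs(step_a) + abs(step_b) < tol:
            break
    # Wald = d' I d, where I is the information matrix at the estimates
    # (the inverse of the estimated covariance matrix) and d = (a - 0, b - 1).
    _, _, h00, h01, h11 = _score_and_information(a, b, y, x)
    d0, d1 = a, b - 1.0
    wald = h00 * d0 * d0 + 2.0 * h01 * d0 * d1 + h11 * d1 * d1
    return a, b, wald


# Hypothetical demo data: the share of ones in each group matches its forecast.
p_demo = [0.25] * 4 + [0.5] * 2 + [0.75] * 4
y_demo = [1, 0, 0, 0, 1, 0, 1, 1, 1, 0]
print(logit_bias_wald(y_demo, p_demo))
```

In the demo data the forecasts are perfectly calibrated within each group, so the estimates coincide with the null values and the Wald statistic is essentially zero.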
III. Simulations
To see how the test works in practice, I consider various simulation experiments. As no comparable test is available (see Note 1), I focus only on the proposed test. For sample size $N$, I generate $2N$ observations, where the first half will be used to estimate the model parameters, and the second half will be used to create and evaluate the forecasts. The Data Generating Process (DGP) is
where

The binary data on $y_i$ are created as follows:
Next, the parameters in the logit model are estimated using Maximum Likelihood, see Franses and Paap (2001, section 4.2). The estimated parameters are used for the second set of $N$ observations to create the forecasts $\hat{p}_i$. Finally, the logit model in (1) is considered and the Wald test is computed. I use 1000 simulation runs.
First, I examine if the test has proper size. This turns out not to be the case, as even in case , the rejection rate is 16.2%. To obtain a size-corrected 5% critical value, I take the 95th percentile of the simulated Wald test values, which equals 10.71. With this new critical value, size-corrected power experiments can be run.
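The size correction can be sketched as follows, assuming that the Wald statistics simulated under the null and under the alternative are already collected in lists. The function names are hypothetical, and the quantile convention (the smallest ordered value with at least 95% of the simulated statistics at or below it) is an assumption, as the text does not specify one.

```python
import math


def empirical_critical_value(null_stats, level=0.05):
    """Empirical (1 - level) quantile of Wald statistics simulated under the null."""
    ordered = sorted(null_stats)
    # 1-based rank of the quantile; round() guards against floating-point noise.
    rank = math.ceil(round((1.0 - level) * len(ordered), 9))
    return ordered[rank - 1]


def size_corrected_power(alt_stats, critical_value):
    """Share of statistics simulated under the alternative that exceed the critical value."""
    return sum(1 for s in alt_stats if s > critical_value) / len(alt_stats)


# Hypothetical demo values standing in for simulated Wald statistics.
null_demo = [float(k) for k in range(1, 101)]
alt_demo = [90.0, 96.0, 100.0, 50.0]
cv = empirical_critical_value(null_demo)
print(cv, size_corrected_power(alt_demo, cv))
```

By construction, at most 5% of the null statistics exceed the empirical critical value, so the rejection rates under the alternative can be compared at a common size.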
To create data under the alternative hypothesis, I replace observations on $y_i$ in the second set of $N$ observations. Each time 5%, 10%, 15%, until 90% of the observations with
is replaced by
. The size-corrected power for
is displayed in Table 2.
Clearly, the size-corrected power is quite high, even for small samples.
IV. Illustration
To illustrate the new test, consider the data in Table 1. There are 32 countries, of which 16 attained the knockout stage of the 2018 World Cup football tournament. Attaining this stage is labelled as 1, having to leave the tournament after the first round is labelled as 0. The third column of Table 1 presents the probabilities assigned by Goldman Sachs of attaining the second round.
Table 2. Size-corrected power. The 5% critical value is set at 10.71
The Maximum Likelihood based parameter estimates (using EViews version 8.0) are 0.033 (0.440) and 1.596 (0.569) for $\alpha$ and $\beta$, respectively, with estimated standard errors in parentheses. The McFadden R-squared (Franses and Paap 2001, page 64) is 0.282, so the logit model fits the data quite well. Finally, the Wald test for the joint hypothesis that $\alpha = 0$ and $\beta = 1$ equals 1.100, which is substantially smaller than the size-corrected critical value of 10.71. This suggests that the Goldman Sachs forecasts were unbiased.
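As a rough plausibility check on the reported Wald value, the two restrictions ($\alpha = 0$ and $\beta = 1$) can be evaluated one at a time from the estimates and standard errors given above. The calculation below assumes a zero covariance between the two estimates, which is not reported in the text, so it only approximates the joint statistic:

```latex
W \approx \left(\frac{0.033 - 0}{0.440}\right)^{2} + \left(\frac{1.596 - 1}{0.569}\right)^{2}
  \approx 0.006 + 1.097 = 1.103
```

Under that assumption, the approximation is close to the reported value of 1.100, and almost all of it stems from the slope restriction.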
Acknowledgments
Thanks to Richard Paap for helpful comments and Max Welz for programming.
Disclosure statement
No potential conflict of interest was reported by the author.
Notes
1 There are tests on the so-called hit rate, that is, the fraction of correctly predicted 1 and 0 observations, but that concerns another feature of the forecasts.
References
- Cramer, J. S. 1991. The Logit Model: An Introduction for Economists. New York: Routledge.
- Franses, P. H., and R. Paap. 2001. Quantitative Models in Marketing Research. Cambridge UK: Cambridge University Press.
- Mincer, J., and V. Zarnowitz. 1969. “The Evaluation of Economic Forecasts.” In Economic Forecasts and Expectations, edited by J. Mincer. New York: National Bureau of Economic Research.