ABSTRACT
Using daily prices from 496 corn cash markets for July 2006–February 2011, this study investigates short-run forecast performance of 31 individual and 10 composite models for each market at horizons of 5, 10, and 30 days. Over the performance evaluation period September 2010–February 2011, two composite models are optimal across horizons for different markets based on the mean-squared error. For around half of the markets at the horizon of 5 days and most of them at 10 and 30 days, the mean-squared error of a market's optimal model is significantly different from those of at least 23 other models evaluated for it. Root-mean-squared error reductions through switching from non-optimal models to the optimal one are generally around 0.40%, 0.55%, and 0.87% at horizons of 5, 10, and 30 days.
Acknowledgments
The author acknowledges Kevin McNew and Geograin, Inc of Bozeman, Montana for generously providing the data used in the analysis in this paper.
Disclosure statement
No potential conflict of interest was reported by the author.
Notes
1 Abbreviations: RMSE – root-mean-squared error; ARMA – autoregressive moving average; VAR – vector autoregressive; BVAR – Bayesian vector autoregressive; VECM – vector error correction; BVECM – Bayesian vector error correction; CBOT – Chicago Board of Trade; AR – autoregressive; MSE – mean-squared error.
2 I would like to thank a reviewer for this suggestion.
3 The 16 states are Arkansas, Iowa, Illinois, Indiana, Kansas, Kentucky, Michigan, Minnesota, Missouri, North Dakota, Nebraska, Ohio, Oklahoma, Pennsylvania, South Dakota, and Wisconsin, covering 89% of national corn production [Citation50].
4 On days such as holidays when prices are missing in every market, we omit the observations and assume smooth data continuity [Citation28].
5 Log prices are used because the log transformation stabilizes the variance of the underlying raw series in this study and could be beneficial for forecasting [Citation45]. Readers are referred to Lütkepohl and Xu [Citation45] for a detailed investigation into conditions under which taking logs is beneficial for forecasting. Unless stated otherwise, we will refer to ‘log prices’ as ‘prices’ hereafter.
6 The AR model considered is $\Delta p^c_t = c + \sum_{i=1}^{k} \phi_i \Delta p^c_{t-i} + \varepsilon_t$, where $p^c_t$ is a cash price at time t, c is a constant, and $\varepsilon_t$ is an error term.
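7 Clements and Hendry [Citation19] pointed out that forecasting through differences rather than levels might protect against mean shifts in the dependent variable. The VAR- and BVAR-type models considered are $\Delta p^f_t = c_1 + \sum_{i=1}^{k} a_{1i} \Delta p^f_{t-i} + \sum_{i=1}^{k} b_{1i} \Delta p^c_{t-i} + \varepsilon_{1t}$ and $\Delta p^c_t = c_2 + \sum_{i=1}^{k} a_{2i} \Delta p^f_{t-i} + \sum_{i=1}^{k} b_{2i} \Delta p^c_{t-i} + \varepsilon_{2t}$, where $p^f_t$ and $p^c_t$ are prices of the futures and a cash market at time t, $c_1$ and $c_2$ are constants, and $\varepsilon_{1t}$ and $\varepsilon_{2t}$ are error terms. The VECM- and BVECM-type models considered are $\Delta p^f_t = c_1 + \alpha_1 (p^f_{t-1} - \delta - \beta p^c_{t-1}) + \sum_{i=1}^{k} a_{1i} \Delta p^f_{t-i} + \sum_{i=1}^{k} b_{1i} \Delta p^c_{t-i} + \varepsilon_{1t}$ and $\Delta p^c_t = c_2 + \alpha_2 (p^f_{t-1} - \delta - \beta p^c_{t-1}) + \sum_{i=1}^{k} a_{2i} \Delta p^f_{t-i} + \sum_{i=1}^{k} b_{2i} \Delta p^c_{t-i} + \varepsilon_{2t}$, where δ is a constant. For the BVAR- and BVECM-type models, ‘Minnesota-style priors’ [Citation23,Citation44] are used, and the overall tightness, rate of decay, and weight parameter are set at 0.1, 1.0, and 0.5, respectively.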
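To illustrate how the three hyperparameters in note 7 might enter, the following minimal sketch computes prior standard deviations under one common textbook parameterization of Minnesota-style priors. The functional form (tightness times cross-variable weight times a lag decay term times a scale ratio), the function name `minnesota_prior_std`, and the residual scales are assumptions for illustration only; the note itself states just the three parameter values.

```python
def minnesota_prior_std(eq, var, lag, scales,
                        tightness=0.1, decay=1.0, weight=0.5):
    """Prior std. dev. of the coefficient on lag `lag` of variable `var`
    in equation `eq` (assumed textbook form, not confirmed by the source)."""
    cross = 1.0 if eq == var else weight   # own lags are not down-weighted
    lag_decay = lag ** (-decay)            # longer lags are shrunk harder
    return tightness * cross * lag_decay * scales[eq] / scales[var]


# Hypothetical residual scales for the (futures, cash) equations.
scales = [1.0, 2.0]
own_first_lag = minnesota_prior_std(0, 0, 1, scales)     # futures lag 1 in futures eq.
cross_second_lag = minnesota_prior_std(0, 1, 2, scales)  # cash lag 2 in futures eq.
```

Under this assumed form, a smaller tightness pulls every coefficient harder toward its prior mean, while the weight parameter of 0.5 halves the prior spread on the other variable's lags relative to own lags.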
8 Robertson and Tallman [Citation59] pointed out that extensions to ‘Minnesota-style priors’ revolving around long-run properties could benefit macroeconomic variable forecasting. Kadiyala and Karlsson [Citation35,Citation36] extended ‘Minnesota-style priors’ by allowing for dependence among equations and found possible improvements.
9 The current study restricts individual models to parametric linear projections. Nonlinear forecasting through, for example, nonparametric and semi-parametric models is out of scope. It is, however, worth noting that composite models pooling linear and nonlinear forecasts could outperform those combining only linear ones (see e.g., [Citation7,Citation64]). Readers are referred to, for example, [Citation2,Citation57,Citation68] for discussions that focus on nonlinear forecasting.
10 Efforts in selecting the optimal window size include [Citation16,Citation32,Citation55]. Many other flexible modeling choices exist, including employing different model specification methods, time-varying parameterizations, and discounted least squares [Citation15].
11 While older observations might be less relevant for forecasting, it has been pointed out in the literature [Citation15] that simply dropping them is somewhat extreme in the sense that they may not be completely irrelevant. A less extreme method is discounted least squares. It is less common in the economic forecasting literature than the rolling window approach, but its potential for macroeconomic predictions has been revealed in recent studies (see e.g., [Citation8,Citation66]). The basic idea is to use all available data for model parameter estimates but weight the observations with a factor that gradually shrinks weights for older observations to zero.
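The weighting idea in note 11 can be sketched for a simple AR(1) regression with a constant. The geometric discount factor `delta`, the function name `discounted_ar1`, and the AR(1) specification are assumptions chosen for brevity; the note does not specify a particular weighting scheme.

```python
def discounted_ar1(series, delta=0.99):
    """Fit y_t = c + phi * y_{t-1} + e_t by weighted least squares,
    giving an observation that is `age` periods old the weight delta**age."""
    y = series[1:]
    x = series[:-1]
    n = len(y)
    w = [delta ** (n - 1 - t) for t in range(n)]  # newest observation gets weight 1
    sw = sum(w)
    xbar = sum(wi * xi for wi, xi in zip(w, x)) / sw
    ybar = sum(wi * yi for wi, yi in zip(w, y)) / sw
    sxx = sum(wi * (xi - xbar) ** 2 for wi, xi in zip(w, x))
    sxy = sum(wi * (xi - xbar) * (yi - ybar)
              for wi, xi, yi in zip(w, x, y))
    phi = sxy / sxx
    c = ybar - phi * xbar
    return c, phi


# Hypothetical noise-free series generated by y_t = 1 + 0.5 * y_{t-1}.
series = [0.0]
for _ in range(20):
    series.append(1.0 + 0.5 * series[-1])
c_hat, phi_hat = discounted_ar1(series, delta=0.9)
```

Setting `delta=1` recovers ordinary least squares on the full sample, while smaller values approach the behavior of a short rolling window without discarding any observation outright.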
12 Optimal lags are determined based on the chi-squared distributed test statistic $(T - c)(\log|\Sigma_r| - \log|\Sigma_u|)$ with the maximum lag fixed at 10 and the significance level at 5%, where T is the number of observations, c is a degrees of freedom correction factor [Citation62], and $|\Sigma_u|$ and $|\Sigma_r|$ are determinants of the error covariance matrices from the unrestricted and restricted model, respectively. While fixing the lag length based on the initial model fitting sample might seem naive, it could benefit forecasting. For example, Clark and McCracken [Citation16] showed that ignoring structural changes can lead to improved forecasting accuracy, depending on their types and magnitudes. This is, in effect, a bias-variance trade-off.
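A minimal sketch of the lag-length test in note 12 follows, assuming the usual likelihood-ratio form (T - c)(log|Σ_r| - log|Σ_u|) that the note's symbol definitions suggest. The covariance matrices, the sample size, and the choice of c as the number of parameters per unrestricted equation are hypothetical illustration values, not figures from the study.

```python
import math


def det2(m):
    """Determinant of a 2x2 matrix given as nested lists."""
    return m[0][0] * m[1][1] - m[0][1] * m[1][0]


def lr_lag_statistic(sigma_restricted, sigma_unrestricted, n_obs, correction):
    """Likelihood-ratio statistic comparing a shorter-lag (restricted) model
    with a longer-lag (unrestricted) one in a bivariate system."""
    return (n_obs - correction) * (
        math.log(det2(sigma_restricted)) - math.log(det2(sigma_unrestricted))
    )


# Hypothetical residual covariances: 2 lags (unrestricted) vs. 1 lag (restricted).
sigma_u = [[1.0, 0.2], [0.2, 1.0]]
sigma_r = [[1.1, 0.2], [0.2, 1.1]]

# Assumed correction: 2 variables x 2 lags + 1 constant = 5 parameters per equation.
stat = lr_lag_statistic(sigma_r, sigma_u, n_obs=100, correction=5)

# Dropping one lag in a bivariate system removes 2 x 2 = 4 coefficients.
n_restrictions = 4
CHI2_CRIT_5PCT_DF4 = 9.488  # 5% critical value of chi-squared with 4 d.o.f.
reject_shorter_lag = stat > CHI2_CRIT_5PCT_DF4
```

In a sequential procedure, the test would start at the maximum lag of 10 and step down until the restriction to the shorter lag is rejected.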
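13 Composite forecasts take a linear combination form: $\hat{y}^{c}_{t+h|t} = \sum_{i=1}^{m} \hat{\omega}_i \hat{y}^{i}_{t+h|t}$, where $\hat{y}^{c}_{t+h|t}$ and $\hat{y}^{i}_{t+h|t}$ are the h-step ahead forecasts of a composite and the ith individual model, $\hat{\omega}_i$ is the estimated weight of the ith individual model, and m is the number of individual models. The previous best forecast approach assigns $\hat{\omega}_i = 1$ to the optimal individual prediction and $\hat{\omega}_i = 0$ to others. The equal-weighted average approach assigns $\hat{\omega}_i = 1/m$ to all individual forecasts. The inverse mean-squared error approach sets $\hat{\omega}_i = \mathrm{MSE}_i^{-1} / \sum_{j=1}^{m} \mathrm{MSE}_j^{-1}$, where $\mathrm{MSE}_i$ is the mean-squared error associated with the ith individual model. Three variations of the least-squares estimates of combination weights approach are: (a) the unrestricted with constant case, i.e. $y_{t+h} = \omega_0 + \omega' \hat{y}_{t+h|t} + \varepsilon_{t+h}$, where $y_t$ is the actual value at time t, $\omega = (\omega_1, \ldots, \omega_m)'$, and $\hat{y}_{t+h|t} = (\hat{y}^{1}_{t+h|t}, \ldots, \hat{y}^{m}_{t+h|t})'$; (b) the unrestricted without constant case, i.e. $y_{t+h} = \omega' \hat{y}_{t+h|t} + \varepsilon_{t+h}$; (c) the constrained without constant case, i.e. $y_{t+h} = \omega' \hat{y}_{t+h|t} + \varepsilon_{t+h}$, s.t. $\sum_{i=1}^{m} \omega_i = 1$. The bias-adjusted mean approach is based on the projection of the equal-weighted average forecast $\bar{y}_{t+h|t}$, i.e. $y_{t+h} = \alpha + \beta \bar{y}_{t+h|t} + \varepsilon_{t+h}$. The shrinkage approach sets $\hat{\omega}_i = \psi \hat{\omega}^{LS}_i + (1 - \psi)/m$, where $\hat{\omega}^{LS}_i$ is taken from the constrained without constant case of the least-squares estimates of combination weights approach, $\psi = \max\{0, 1 - \kappa m/(T - 1 - m - 1)\}$, T is the size of the forecasted sample, and κ is the shrinkage parameter for which 0.25 and 1 are considered in this study. The odds matrix approach calculates ω based on $(O - mI)\omega = 0$, where I is an identity matrix, O is an $m \times m$ matrix whose ijth element is $o_{ij} = \pi_{ij}/\pi_{ji}$, $\pi_{ij} = a_{ij}/(a_{ij} + a_{ji})$, and $a_{ij}$ is the number of times the ith individual model provides a smaller absolute error as compared to the jth in the historical sample. Readers are referred to Timmermann [Citation69] and Capistrán and Timmermann [Citation12] for more technical details.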
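Three of the weighting schemes in note 13 (equal weights, inverse mean-squared error weights, and shrinkage toward equal weights) can be sketched as follows. The function names, the input forecasts, the MSE values, and the least-squares weights are hypothetical; the shrinkage intensity `psi` is passed in directly rather than derived from κ, so the sketch makes no claim about how the study computes it.

```python
def equal_weights(m):
    """Weight 1/m on each of the m individual forecasts."""
    return [1.0 / m] * m


def inverse_mse_weights(mses):
    """Weights proportional to 1/MSE_i, normalized to sum to one."""
    inverse = [1.0 / v for v in mses]
    total = sum(inverse)
    return [v / total for v in inverse]


def shrink_toward_equal(ls_weights, psi):
    """Convex combination of least-squares weights and equal weights."""
    m = len(ls_weights)
    return [psi * w + (1.0 - psi) / m for w in ls_weights]


def combine(weights, forecasts):
    """Linear combination of individual forecasts."""
    return sum(w * f for w, f in zip(weights, forecasts))


# Hypothetical inputs: three individual forecasts of one log price.
forecasts = [1.50, 1.54, 1.60]
mses = [1.0, 2.0, 4.0]

w_equal = equal_weights(len(forecasts))
w_inv = inverse_mse_weights(mses)
w_shrunk = shrink_toward_equal([0.7, 0.2, 0.1], psi=0.5)
composite = combine(w_inv, forecasts)
```

With `psi=1` the shrinkage scheme returns the least-squares weights unchanged, and with `psi=0` it collapses to the equal-weighted average.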
14 Five-, ten-, and thirty-day horizons correspond to one-, two-, and six-week ahead forecasts, respectively.
15 This approach would be precisely correct for one-step ahead forecasts if the loss differential $d_t$ follows a normal distribution. While assuming normality will generally not be correct, it seems plausible to guess that the t critical value is appropriate for comparison with the MDM test statistic [Citation30]. A further simulation study [Citation30] reveals that both the modified test statistic and the use of the t, rather than standard normal, critical value contribute to the performance improvement – the former somewhat more than the latter.
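A minimal sketch of the statistic discussed in note 15 follows, using the modified Diebold-Mariano (MDM) form commonly attributed to Harvey, Leybourne, and Newbold. Squared-error loss, the function name `mdm_statistic`, and the forecast error series are assumptions for illustration.

```python
import math


def mdm_statistic(errors_a, errors_b, h):
    """Modified Diebold-Mariano statistic under squared-error loss.
    Returns the statistic and the t degrees of freedom (n - 1)."""
    d = [a * a - b * b for a, b in zip(errors_a, errors_b)]  # loss differential
    n = len(d)
    dbar = sum(d) / n

    def autocov(k):
        return sum((d[t] - dbar) * (d[t - k] - dbar) for t in range(k, n)) / n

    # Variance of the mean differential, using autocovariances up to lag h - 1.
    var_dbar = (autocov(0) + 2.0 * sum(autocov(k) for k in range(1, h))) / n
    if var_dbar <= 0.0:
        raise ValueError("non-positive variance estimate for this sample")
    dm = dbar / math.sqrt(var_dbar)
    # Small-sample correction factor applied to the original statistic.
    correction = math.sqrt((n + 1 - 2 * h + h * (h - 1) / n) / n)
    return correction * dm, n - 1


# Hypothetical one-step ahead forecast errors from two competing models.
errors_a = [1.0, -2.0, 1.5, -0.5, 2.0, -1.0]
errors_b = [0.5, -1.0, 1.0, -0.2, 1.0, -0.8]
stat, dof = mdm_statistic(errors_a, errors_b, h=1)
```

For h = 1 the corrected statistic coincides with the ordinary one-sample t-statistic of the loss differential, which is why the t critical value with n - 1 degrees of freedom is exact under normality in that case.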
16 Choices of 17, 5, and 23 are arbitrary to some extent. They are around half of the number of models other than the optimal one considering individual, composite, and all forecasts, respectively. For the same model j, red and blue points sometimes are plotted in the same subfigure but sometimes are not. This decision is purely based on the consideration of space and visualization.
17 Numerical results of MSEs and RMSEs across markets, models, and horizons are not provided here to save space but are available upon request.
18 A forecast model that is optimal for fewer than 10 cash markets is not discussed in detail.
19 At the short horizon H5, the result that model #24 is optimal for 110 markets reveals that rolling windows and lag structure re-estimation are important.
20 Plots of forecast error series are available upon request.
21 Performance of this procedure was found to decline at the horizon of three quarters. The authors [Citation21] considered high variability at this horizon as a potential reason.
22 Separate from one-quarter ahead forecasts, Colino and Irwin [Citation20] also provided results for two- and three-quarter ahead forecasts. One-quarter ahead forecasts are selected to illustrate the average RMSE reduction because they represent a horizon that is closest to those considered in the current study.