Original Articles

The rationale behind the success of multi-model ensembles in seasonal forecasting — II. Calibration and combination

Pages 234-252 | Received 06 Apr 2004, Accepted 24 Sep 2004, Published online: 15 Dec 2016
 

Abstract

The DEMETER multi-model ensemble system is used to investigate the enhancement in seasonal predictability that can be achieved by calibrating single-model ensembles and combining them to issue multi-model predictions. The forecast quality of both deterministic and probabilistic predictions is assessed and compared to the skill of a simple multi-model ensemble where all the single models are equally weighted. Both calibration and combination are carried out using cross-validation. Single-model seasonal ensembles are calibrated using canonical correlation analysis for model adjustment and variance inflation for reliability enhancement. Results indicate that both model adjustment and inflation increase the skill of tropical predictions for single-model ensembles, provided that the training time series are long enough. Some improvements are also found for extratropical areas, although mostly due to an increase of reliability associated with the inflation. The beneficial impact of calibration is smaller for the simple multi-model than for the single-model ensembles due to the relatively high reliability of the former. The raw single-model predictions are also linearly combined using grid-point multiple linear regression to create an optimized multi-model system. Results indicate that the forecast quality of the simple multi-model ensemble is generally difficult to improve using multiple linear regression due to the lack of robustness of the regression coefficients. As in the case of the calibration, longer time series would be preferred to achieve a significant forecast quality improvement. Over the tropics, a multiple linear regression that uses the principal components of the model anomalies for the target area as predictors indicates a substantial gain in skill even with the available sample size. The implications of these results in an operational context are discussed.
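To make the comparison in the abstract concrete, the following is a minimal sketch, not the authors' implementation, of the two combination strategies at a single grid point: an equally weighted multi-model mean and a multiple linear regression whose coefficients are estimated under leave-one-out cross-validation. The function names, the array layout (years by models), the sample sizes, and the synthetic data are all assumptions made for illustration; no DEMETER data are used, and the exact cross-validation scheme of the paper is not stated in the abstract.

```python
import numpy as np


def simple_multimodel_mean(forecasts):
    """Equally weighted combination: average the models for each year.

    forecasts: array of shape (n_years, n_models) holding one ensemble-mean
    anomaly per model and year (hypothetical layout).
    """
    return forecasts.mean(axis=1)


def loo_mlr_combination(forecasts, obs):
    """Multiple linear regression combination with leave-one-out cross-validation.

    For each year, the regression coefficients (intercept plus one weight per
    model) are fitted on all other years and applied to the held-out year.
    """
    n_years = forecasts.shape[0]
    # Design matrix with a leading column of ones for the intercept.
    design = np.column_stack([np.ones(n_years), forecasts])
    combined = np.empty(n_years)
    for i in range(n_years):
        train = np.arange(n_years) != i  # boolean mask excluding year i
        coeffs, *_ = np.linalg.lstsq(design[train], obs[train], rcond=None)
        combined[i] = design[i] @ coeffs
    return combined


# Synthetic illustration only: a common signal plus independent model noise.
rng = np.random.default_rng(0)
n_years, n_models = 22, 7  # arbitrary illustrative sizes
signal = rng.standard_normal(n_years)
obs = signal + 0.5 * rng.standard_normal(n_years)
forecasts = signal[:, None] + rng.standard_normal((n_years, n_models))

simple = simple_multimodel_mean(forecasts)
combined = loo_mlr_combination(forecasts, obs)

# Anomaly correlation with the observations for each combination.
corr_simple = np.corrcoef(simple, obs)[0, 1]
corr_mlr = np.corrcoef(combined, obs)[0, 1]
print(f"simple mean: {corr_simple:.2f}, cross-validated MLR: {corr_mlr:.2f}")
```

With a short training series and several predictors, the fitted coefficients in a sketch like this change noticeably from one held-out year to the next, which is the lack of robustness the abstract refers to.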