Given the uncertainties in the initial conditions of weather and climate, forecasts should be, and increasingly are, issued probabilistically. Such forecasts account for the uncertainty due to imperfectly known initial conditions and potentially also for model uncertainty. A frequently observed problem with probabilistic forecasts is that they tend to be unreliable, i.e. the forecast probabilities are not consistent with the relative frequencies of the associated observed events.

This project aims to develop and implement post-processing approaches for (re-)calibrating the MiKlip decadal prediction ensemble while addressing the typical problems of decadal predictions, i.e. relatively small ensemble sizes and a limited number of hindcast-observation pairs. Starting with normally distributed variables, strategies will be tailored specifically to the problems encountered in decadal predictions (model drift, climate trend). At a later stage, non-normally distributed quantities will be considered, as the variables relevant to end users (e.g. precipitation, humidity or wind gusts) do not necessarily follow a Gaussian distribution. Moreover, calibration methods for probabilistic forecasts of dichotomous and countable events (e.g. droughts) will also be considered. Implementation in the central evaluation system (CES) allows all other MiKlip projects to (re-)calibrate their ensemble predictions and prepares the calibration for operational use.

- Transferring the climate-conserving recalibration (CCR) approach described by Weigel et al. (2009) to the MiKlip decadal prediction system. This approach was originally designed for normally distributed quantities on the seasonal timescale in a stationary setting, i.e. with no model drift and no climate trend. In contrast, the MiKlip decadal prediction system typically shows non-stationarity (climate trend), a model drift away from the initialised state towards the model climatology, small ensemble sizes, and quantities that are not normally distributed.

- Investigation of the effects of climate trend, initialisation-induced model drift, and small ensemble sizes on the CCR, and of potential ways to correct small-sample deviations, using synthetically generated ensemble predictions, also called a ‘toy model’ (Weigel et al., 2008).

- Investigation of possibilities to simultaneously re-calibrate and account for climate trend and model drifts.

- Transferring the previous results to quantities which are not well described by a normal distribution.

- Implementation of calibration of probabilistic forecasts regarding dichotomous and countable events.

- Implementation of the developed algorithms/software into the CES. Documentation of the software for operational use.
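As a rough illustration of the CCR idea named in the first item above, the following sketch rescales the ensemble mean and the member deviations so that the recalibrated ensemble reproduces the observed climatological mean and variance. The scaling factors `r` and `s` are our reading of the approach and should be checked against Weigel et al. (2009). The function name `ccr_recalibrate`, the in-sample parameter estimation, and the synthetic data are assumptions for illustration only, and the sketch deliberately ignores model drift and climate trend.

```python
import numpy as np

def ccr_recalibrate(ens, obs):
    """Stationary CCR-style rescaling (illustrative sketch, not the CES code).

    ens: array (n_forecasts, n_members); obs: array (n_forecasts,).
    """
    ens = np.asarray(ens, dtype=float)
    obs = np.asarray(obs, dtype=float)
    mu = ens.mean(axis=1)                 # ensemble mean per forecast
    eps = ens - mu[:, None]               # member deviations from the mean
    rho = np.corrcoef(mu, obs)[0, 1]      # correlation of ensemble mean and obs
    sigma_x = obs.std()                   # observed climatological spread
    sigma_mu = mu.std()                   # spread of the ensemble means
    sigma_eps = np.sqrt((eps ** 2).mean())  # average within-ensemble spread
    r = rho * sigma_x / sigma_mu          # assumed scaling of the signal
    s = np.sqrt(1.0 - rho ** 2) * sigma_x / sigma_eps  # assumed scaling of the spread
    return obs.mean() + r * (mu - mu.mean())[:, None] + s * eps

# Synthetic example: biased, overdispersive ensemble with some skill.
rng = np.random.default_rng(0)
signal = rng.normal(size=50)
obs = signal + 0.5 * rng.normal(size=50)
ens = 2.0 + 0.5 * signal[:, None] + 2.0 * rng.normal(size=(50, 10))
recal = ccr_recalibrate(ens, obs)
```

With these definitions the pooled variance of `recal` equals the variance of `obs` by construction, which is the "climate conserving" property in the stationary case.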

We provide a decadal forecast recalibration strategy (DeFoReSt) that simultaneously adjusts the unconditional bias, the conditional bias, and the ensemble spread while accounting for the typical setting of decadal predictions, i.e. model drift and a climate trend. The resulting parametric correction terms for bias, conditional bias and ensemble spread are functions of start time and lead time and can be used to localise the causes of non-calibrated forecasts. DeFoReSt has been implemented in the MiKlip central evaluation system (CES).
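In notation assumed here (the symbols are not taken from the text above), let $\mu(t,\tau)$ and $\sigma^2(t,\tau)$ denote the ensemble mean and ensemble variance for start time $t$ and lead time $\tau$. A recalibration of this kind can then be written as a Gaussian predictive distribution whose mean and spread are corrected by three terms:

$$
f^{\mathrm{cal}}(t,\tau) \sim \mathcal{N}\!\left(\alpha(t,\tau) + \beta(t,\tau)\,\mu(t,\tau),\; \gamma(t,\tau)^{2}\,\sigma^{2}(t,\tau)\right)
$$

In this reading, $\alpha$ addresses the unconditional bias, $\beta$ the conditional bias, and $\gamma$ the ensemble spread; a perfectly calibrated raw ensemble would correspond to $\alpha = 0$ and $\beta = \gamma = 1$.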

To investigate the effect of DeFoReSt, we developed a toy model that generates synthetic decadal forecast-observation pairs. Additionally, we applied DeFoReSt to decadal surface temperature forecasts from the MiKlip Prototype system. The results of these applications are presented in a publication.
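To make the idea of synthetic forecast-observation pairs concrete, here is a minimal sketch of a generator with a linear climate trend and a lead-time-dependent drift. It is not the toy model used in the project: the function name `make_toy_hindcasts`, the trend and drift shapes, and all numerical values are hypothetical choices for illustration.

```python
import numpy as np

def make_toy_hindcasts(n_starts=50, n_leads=10, n_members=10, seed=0):
    """Generate synthetic decadal hindcasts and matching observations."""
    rng = np.random.default_rng(seed)
    t = np.arange(n_starts)[:, None]           # start-year index
    tau = np.arange(1, n_leads + 1)[None, :]   # lead year
    valid = t + tau                            # valid-time index, shape (n_starts, n_leads)
    n_years = n_starts + n_leads

    trend = 0.02 * np.arange(n_years)          # assumed linear climate trend
    signal = rng.normal(0.0, 0.5, n_years)     # predictable part of the variability
    noise = rng.normal(0.0, 0.5, n_years)      # unpredictable part
    obs_ts = trend + signal + noise
    obs = obs_ts[valid]                        # same valid time, same observation

    drift = 0.3 * tau - 0.02 * tau ** 2        # assumed lead-time-dependent bias
    ens_mean = trend[valid] + 0.5 * signal[valid] + drift
    members = rng.normal(0.0, 0.2, (n_starts, n_leads, n_members))
    ens = ens_mean[:, :, None] + members       # underdispersive, drifting ensemble
    return obs, ens, drift.ravel()

obs, ens, drift = make_toy_hindcasts()
# Mean error of the ensemble mean per lead year, which should track the drift.
mean_error = (ens.mean(axis=2) - obs).mean(axis=0)
```

Because the drift and trend are known by construction, a recalibration method can be judged by how well it recovers them from the synthetic pairs.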

The original DeFoReSt approach assumes third-order polynomials in lead time to capture the conditional and unconditional biases, a second-order polynomial for the dispersion, and a first-order dependence on start time. We propose not to restrict the orders a priori but to use a systematic model selection strategy, based on non-homogeneous boosting (Messner et al., 2017), to identify the most relevant variables for recalibration.
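The polynomial orders stated above can be illustrated by building the candidate predictor columns explicitly. The sketch below assumes that start time and lead time combine as products `t**j * tau**k`, so that each lead-time coefficient may vary linearly with start time; this combination, the helper name `poly_terms`, and the scaled grid are assumptions, not a description of the CES implementation.

```python
import numpy as np

def poly_terms(t, tau, lead_order, start_order=1):
    """Return columns t**j * tau**k for j <= start_order and k <= lead_order."""
    t = np.asarray(t, dtype=float)
    tau = np.asarray(tau, dtype=float)
    cols = [t ** j * tau ** k
            for k in range(lead_order + 1)
            for j in range(start_order + 1)]
    return np.stack(cols, axis=-1)

# Hypothetical hindcast grid: 5 start years x 10 lead years, both scaled to [0, 1].
t, tau = np.meshgrid(np.linspace(0, 1, 5), np.linspace(0, 1, 10), indexing="ij")
X_bias = poly_terms(t.ravel(), tau.ravel(), lead_order=3)    # 8 terms per bias parameter
X_spread = poly_terms(t.ravel(), tau.ravel(), lead_order=2)  # 6 terms for the dispersion
```

A selection strategy such as boosting would start from a larger set of such columns (higher orders in both variables) and retain only those that improve the fit, instead of fixing `lead_order` and `start_order` in advance.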

Messner, J. W., G. J. Mayr, and A. Zeileis, 2017: Nonhomogeneous boosting for predictor selection in ensemble postprocessing. Mon. Weather Rev., 145(1):137–147.

Weigel, A. P., M. A. Liniger, and C. Appenzeller, 2008: Can multi-model combination really enhance the prediction skill of probabilistic ensemble forecasts? Quart. J. Royal Meteor. Soc., 134(630):241–260.

Weigel, A. P., M. A. Liniger, and C. Appenzeller, 2009: Seasonal ensemble forecasts: Are recalibrated single models better than multimodels? Mon. Weather Rev., 137(4):1460–1479.

**Freie Universität Berlin, Institute for Meteorology**

Prof. Dr. Uwe Ulbrich

Prof. Dr. Henning Rust

M.Sc. Alexander Pasternack

**Max-Planck-Institut für Meteorologie**

Dr. Wolfgang Müller

**MeteoSwiss**

Dr. Mark Liniger

Dr. Jonas Bhend

**Feldmann, H.**, Pinto, J.G., Laube, N., Uhlig, M., Moemken, J., Pasternack, A., Früh, B., Pohlmann, H., Kottmeier, C.

**Alexander Pasternack**, Jonas Bhend, Mark A. Liniger, Henning W. Rust, Wolfgang A. Müller, Uwe Ulbrich