ML LG Apr 5, 2022

Aggregating distribution forecasts from deep ensembles

Benedikt Schulz, Lutz Köhler, Sebastian Lerch

arXiv:2204.02291v29.97 citations h-index: 27 Has Code

Originality: Incremental advance

AI Analysis

This work addresses the need for accurate uncertainty quantification in probabilistic forecasting for machine learning and statistics practitioners, offering incremental improvements over existing methods.

The paper tackles the problem of aggregating distribution forecasts from deep ensembles to improve predictive performance. Across twelve benchmark data sets, the proposed quantile aggregation framework often outperforms a linear combination of the forecast densities.

The importance of accurately quantifying forecast uncertainty has motivated much recent research on probabilistic forecasting. In particular, a variety of deep learning approaches has been proposed, with forecast distributions obtained as output of neural networks. These neural network-based methods are often used in the form of an ensemble, e.g., based on multiple model runs from different random initializations or more sophisticated ensembling strategies such as dropout, resulting in a collection of forecast distributions that need to be aggregated into a final probabilistic prediction. With the aim of consolidating findings from the machine learning literature on ensemble methods and the statistical literature on forecast combination, we address the question of how to aggregate distribution forecasts based on such 'deep ensembles'. Using theoretical arguments and a comprehensive analysis on twelve benchmark data sets, we systematically compare probability- and quantile-based aggregation methods for three neural network-based approaches with different forecast distribution types as output. Our results show that combining forecast distributions from deep ensembles can substantially improve the predictive performance. We propose a general quantile aggregation framework for deep ensembles that allows for corrections of systematic deficiencies and performs well in a variety of settings, often proving superior to a linear combination of the forecast densities. Finally, we investigate the effects of the ensemble size and derive recommendations for aggregating distribution forecasts from deep ensembles in practice.
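
The distinction between probability-based and quantile-based aggregation can be illustrated with a small sketch. This is not the authors' framework or code: it only contrasts an equally weighted linear pool (averaging the member CDFs) with plain quantile averaging (averaging the member quantile functions) for a made-up ensemble of three normal forecast distributions. All names and numbers are hypothetical, and only the Python standard library is used.

```python
from statistics import NormalDist, fmean

# Hypothetical ensemble of three normal forecast distributions
# (means and standard deviations are invented for illustration).
members = [NormalDist(0.0, 1.0), NormalDist(1.0, 1.5), NormalDist(2.0, 0.5)]


def linear_pool_cdf(x, dists):
    """Probability-based aggregation: average the member CDFs at x."""
    return fmean(d.cdf(x) for d in dists)


def quantile_average(p, dists):
    """Quantile-based aggregation: average the member quantiles at level p."""
    return fmean(d.inv_cdf(p) for d in dists)


def linear_pool_quantile(p, dists, lo=-50.0, hi=50.0, tol=1e-9):
    """Invert the pooled CDF by bisection, since the mixture has no
    closed-form quantile function. Assumes the quantile lies in [lo, hi]."""
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if linear_pool_cdf(mid, dists) < p:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


def interval_width(quantile_fn, dists, level=0.9):
    """Width of the central prediction interval at the given level."""
    alpha = (1 - level) / 2
    return quantile_fn(1 - alpha, dists) - quantile_fn(alpha, dists)


lp_width = interval_width(linear_pool_quantile, members)
qa_width = interval_width(quantile_average, members)
print(f"90% interval width, linear pool:       {lp_width:.3f}")
print(f"90% interval width, quantile average:  {qa_width:.3f}")
```

For normal members, quantile averaging yields another normal distribution whose mean and standard deviation are the averages of the member means and standard deviations, whereas the linear pool is a mixture that also absorbs the spread between member means. In this toy ensemble the linear pool therefore produces the wider central interval.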
