ML, LG | Feb 26, 2020

A general framework for ensemble distribution distillation

arXiv:2002.11531v20.0026 citations
AI Analysis: 50

This work addresses the need for leaner models that retain the decomposition of predictive uncertainty, though it is incremental in that it builds on existing ensemble distillation methods.

The paper tackles the problem of standard ensemble distillation erasing the uncertainty decomposition into aleatoric and epistemic components, and presents a general framework that preserves this decomposition while maintaining predictive performance on par with standard distillation.

Ensembles of neural networks have been shown to give better performance than single networks, both in terms of predictions and uncertainty estimation. Additionally, ensembles allow the uncertainty to be decomposed into aleatoric (data) and epistemic (model) components, giving a more complete picture of the predictive uncertainty. Ensemble distillation is the process of compressing an ensemble into a single model, often resulting in a leaner model that still outperforms the individual ensemble members. Unfortunately, standard distillation erases the natural uncertainty decomposition of the ensemble. We present a general framework for distilling both regression and classification ensembles in a way that preserves the decomposition. We demonstrate the desired behaviour of our framework and show that its predictive performance is on par with standard distillation.
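
To make the decomposition mentioned in the abstract concrete, the sketch below uses one commonly used entropy-based split for classification ensembles: total uncertainty is the entropy of the averaged prediction, aleatoric uncertainty is the average entropy of the individual members, and epistemic uncertainty is the difference between the two. The abstract does not state which decomposition the paper uses, so this form is an assumption, and the names `entropy` and `decompose_uncertainty` are hypothetical helpers written for illustration only.

```python
import math


def entropy(p):
    """Shannon entropy (in nats) of a discrete distribution."""
    return -sum(q * math.log(q) for q in p if q > 0.0)


def decompose_uncertainty(member_probs):
    """Split ensemble uncertainty for ONE input into three parts.

    member_probs: list of class-probability lists, one per ensemble member.
    Returns (total, aleatoric, epistemic), assuming the entropy-based split
    described in the lead-in (an assumption, not taken from the paper).
    """
    n_members = len(member_probs)
    n_classes = len(member_probs[0])
    # Ensemble prediction: average the members' class probabilities.
    mean_probs = [
        sum(p[k] for p in member_probs) / n_members for k in range(n_classes)
    ]
    total = entropy(mean_probs)  # entropy of the averaged prediction
    aleatoric = sum(entropy(p) for p in member_probs) / n_members  # mean member entropy
    epistemic = total - aleatoric  # what remains is member disagreement
    return total, aleatoric, epistemic


# Two toy ensembles whose AVERAGED prediction is identical, [0.5, 0.5].
agree = [[0.5, 0.5], [0.5, 0.5]]      # every member is individually unsure
disagree = [[1.0, 0.0], [0.0, 1.0]]   # members are confident but contradict each other

print(decompose_uncertainty(agree))
print(decompose_uncertainty(disagree))
```

Both toy ensembles have the same averaged prediction, so a student trained only to match that average would see the same target in both cases. The split differs, though: in the first case all of the uncertainty is aleatoric, in the second all of it is epistemic. This is the information the abstract says standard distillation erases.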

Code Implementations: 1 repo
