Simple Regularisation for Uncertainty-Aware Knowledge Distillation
This work addresses the computational and memory constraints of deploying ensemble models in fields like medicine and autonomous systems, offering an incremental improvement over existing distillation techniques.
The paper tackles the impracticality of deploying neural network ensembles in the real world by proposing a simple regularisation method for knowledge distillation. The method aims to preserve the ensemble's diversity, accuracy and uncertainty estimation in a single model, and is demonstrated on various datasets and tasks.
Estimating the uncertainty of modern neural networks (NNs) is one of the most important steps towards deploying machine learning systems in meaningful real-world applications such as medicine, finance or autonomous systems. Currently, ensembles of different NNs constitute the state of the art in both accuracy and uncertainty estimation across different tasks. However, ensembles of NNs are impractical under real-world constraints, since their computation and memory consumption scale linearly with the size of the ensemble, which increases their latency and deployment cost. In this work, we examine a simple regularisation approach for distribution-free knowledge distillation of an ensemble of machine learning models into a single NN. The aim of the regularisation is to preserve the diversity, accuracy and uncertainty estimation characteristics of the original ensemble without any intricacies, such as fine-tuning. We demonstrate the generality of the approach on combinations of toy data, SVHN/CIFAR-10, simple to complex NN architectures and different tasks.
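To make the ensemble characteristics mentioned above more concrete, the following is a minimal sketch of one common way to quantify them for a classifier. It is not the regulariser examined in this work, which the abstract does not specify. It assumes the ensemble's softmax outputs are available as an array of shape (M, N, C) for M members, N inputs and C classes; the function names `entropy` and `ensemble_uncertainty` are hypothetical. The total predictive entropy is split into the average per-member entropy and the remainder, which reflects disagreement between members and can serve as a rough proxy for diversity.

```python
import numpy as np


def entropy(p, axis=-1, eps=1e-12):
    """Shannon entropy (in nats) of probability vectors along `axis`."""
    return -np.sum(p * np.log(p + eps), axis=axis)


def ensemble_uncertainty(member_probs):
    """Summarise an ensemble's predictions (illustrative helper, not from the paper).

    member_probs: array of shape (M, N, C) with per-member class probabilities.
    Returns the averaged prediction (N, C) and three per-input scores (N,):
    total entropy, mean per-member entropy, and their difference (disagreement).
    """
    member_probs = np.asarray(member_probs, dtype=float)
    mean_probs = member_probs.mean(axis=0)           # ensemble prediction
    total = entropy(mean_probs)                      # entropy of the averaged prediction
    per_member = entropy(member_probs).mean(axis=0)  # average entropy of individual members
    disagreement = total - per_member                # zero when all members agree
    return mean_probs, total, per_member, disagreement


# Toy example: two members, two inputs, two classes.
# The members agree on the first input and disagree on the second.
probs = np.array([
    [[0.9, 0.1], [0.9, 0.1]],
    [[0.9, 0.1], [0.1, 0.9]],
])
mean_probs, total, per_member, disagreement = ensemble_uncertainty(probs)
```

In this toy case the disagreement score is zero for the first input and positive for the second. A single distilled model that outputs only the averaged prediction would lose this distinction, which is the kind of information a diversity-preserving distillation tries to retain.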