LG · Feb 13, 2023

Fixing Overconfidence in Dynamic Neural Networks

Lassi Meronen, Martin Trapp, Andrea Pilzer, Le Yang, Arno Solin

arXiv:2302.06359v416.526 citations · h-index: 29 · Has Code

Originality: Incremental advance

AI Analysis

This work addresses overconfidence in dynamic neural networks, a problem that matters because these networks rely on confidence estimates to allocate computational resources efficiently. The advance is incremental, as it builds on existing uncertainty quantification techniques.

The paper tackles the problem of poor uncertainty estimates in dynamic neural networks, which hinders distinguishing between hard and easy samples for computational budget adaptation. It presents a post-hoc uncertainty quantification method that improves predictive performance, showing gains in accuracy, uncertainty capture, and calibration error on datasets like CIFAR-100, ImageNet, and Caltech-256.

Dynamic neural networks are a recent technique that promises a remedy for the increasing size of modern deep learning models by dynamically adapting their computational cost to the difficulty of the inputs. In this way, the model can adjust to a limited computational budget. However, the poor quality of uncertainty estimates in deep learning models makes it difficult to distinguish between hard and easy samples. To address this challenge, we present a computationally efficient approach for post-hoc uncertainty quantification in dynamic neural networks. We show that adequately quantifying and accounting for both aleatoric and epistemic uncertainty through a probabilistic treatment of the last layers improves the predictive performance and aids decision-making when determining the computational budget. In the experiments, we show improvements on CIFAR-100, ImageNet, and Caltech-256 in terms of accuracy, capturing uncertainty, and calibration error.
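To make the role of confidence concrete, here is a minimal sketch of a generic confidence-thresholded early-exit rule, the kind of decision a dynamic network makes when adapting its cost to input difficulty. It is not the authors' method: the function names (`softmax`, `early_exit_predict`), the plain softmax confidence score, and the default threshold of 0.9 are all assumptions chosen for illustration, and the paper's probabilistic treatment of the last layers is not implemented here.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def early_exit_predict(exit_logits, threshold=0.9):
    """Pick the first exit whose top softmax probability reaches `threshold`.

    `exit_logits` is a list with one logit vector per intermediate classifier,
    ordered from cheapest to most expensive (a hypothetical interface).
    Returns (predicted_class, exit_index). If no exit is confident enough,
    the prediction of the final exit is used.
    """
    if not exit_logits:
        raise ValueError("exit_logits must contain at least one exit")
    last = len(exit_logits) - 1
    for i, logits in enumerate(exit_logits):
        probs = softmax(logits)
        confidence = max(probs)
        # Stop as soon as the model looks confident, or when no exits remain.
        if confidence >= threshold or i == last:
            return probs.index(confidence), i


# An "easy" input exits at the first classifier.
easy = early_exit_predict([[5.0, 0.0, 0.0], [6.0, 0.0, 0.0]])
# A "hard" input is passed on to the second classifier.
hard = early_exit_predict([[0.2, 0.1, 0.0], [0.1, 4.0, 0.0]])
```

Under a rule like this, an overconfident early classifier would cause hard samples to exit too soon, which is why the quality of the uncertainty estimates directly affects how the computational budget is spent.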
