LG · Feb 13, 2023

Calibrating a Deep Neural Network with Its Predecessors

arXiv:2302.06245v2 · 10 citations · h-index: 49
AI Analysis

This work addresses the calibration issue for safety-critical AI applications, offering a novel method that improves upon existing techniques like early stopping.

The paper tackles the problem of confidence calibration in deep neural networks, which is crucial for safety-critical applications, by proposing a novel regularization method called predecessor combination search (PCS) that achieves state-of-the-art calibration performance on multiple datasets and architectures and improves model robustness under dataset distribution shift.

Confidence calibration - the process of calibrating the output probability distribution of neural networks - is essential for safety-critical applications of such networks. Recent works verify the link between mis-calibration and overfitting. However, early stopping, a well-known technique for mitigating overfitting, fails to calibrate networks. In this work, we study the limitations of early stopping and comprehensively analyze the overfitting problem of a network by considering each individual block. We then propose a novel regularization method, predecessor combination search (PCS), which improves calibration by searching for a combination of best-fitting block predecessors, where block predecessors are the corresponding network blocks with weight parameters from earlier training stages. PCS achieves state-of-the-art calibration performance on multiple datasets and architectures. In addition, PCS improves model robustness under dataset distribution shift.
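
To make "calibration" concrete, the following is a minimal sketch of expected calibration error (ECE), a commonly used measure of the gap between a model's stated confidence and its actual accuracy. The abstract does not say which metric the paper reports, so the choice of ECE is an assumption here, as are the function name `expected_calibration_error`, the `n_bins` parameter, and the equal-width binning with left-closed bins. This is not the PCS method itself, only an illustration of the quantity such methods aim to reduce.

```python
def expected_calibration_error(confidences, correct, n_bins=10):
    """Illustrative ECE: sample-weighted mean of |confidence - accuracy| per bin.

    confidences: top-class probabilities in [0, 1], one per prediction.
    correct:     booleans, True where the prediction matched the label.
    n_bins:      number of equal-width confidence bins (hypothetical default).
    """
    if len(confidences) != len(correct):
        raise ValueError("confidences and correct must have the same length")
    n = len(confidences)
    if n == 0:
        raise ValueError("at least one prediction is required")

    # Group predictions into equal-width bins [k/n_bins, (k+1)/n_bins).
    # A confidence of exactly 1.0 is placed in the last bin.
    bins = [[] for _ in range(n_bins)]
    for conf, hit in zip(confidences, correct):
        idx = min(int(conf * n_bins), n_bins - 1)
        bins[idx].append((conf, hit))

    ece = 0.0
    for members in bins:
        if not members:
            continue  # empty bins contribute nothing
        avg_conf = sum(c for c, _ in members) / len(members)
        accuracy = sum(1 for _, h in members if h) / len(members)
        ece += (len(members) / n) * abs(avg_conf - accuracy)
    return ece


if __name__ == "__main__":
    # Overconfident toy model: claims 90% confidence but is right 3 times out of 4.
    print(expected_calibration_error([0.9, 0.9, 0.9, 0.9],
                                     [True, True, True, False]))
```

A well-calibrated model scores close to zero: predictions made with 50% confidence are right about half the time, and so on. An overfit network typically reports confidences higher than its accuracy, which raises this value.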

Code Implementations: 1 repo
