Calibrating a Deep Neural Network with Its Predecessors
This work addresses the calibration issue for safety-critical AI applications, offering a novel method that improves upon existing techniques like early stopping.
The paper tackles confidence calibration in deep neural networks, which is crucial for safety-critical applications. It proposes a novel regularization method, predecessor combination search (PCS), that achieves state-of-the-art calibration performance on multiple datasets and architectures and improves model robustness under dataset distribution shift.
Confidence calibration - the process of calibrating the output probability distribution of a neural network - is essential for safety-critical applications of such networks. Recent works verify the link between miscalibration and overfitting. However, early stopping, a well-known technique for mitigating overfitting, fails to calibrate networks. In this work, we study the limitations of early stopping and comprehensively analyze the overfitting problem of a network at the level of each individual block. We then propose a novel regularization method, predecessor combination search (PCS), which improves calibration by searching for a combination of best-fitting block predecessors, where block predecessors are the corresponding network blocks with weight parameters from earlier training stages. PCS achieves state-of-the-art calibration performance on multiple datasets and architectures. In addition, PCS improves model robustness under dataset distribution shift.
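To make the idea of combining block predecessors concrete, the following is a minimal sketch and not the paper's implementation. It assumes a toy network in which each block is a single weight matrix, scores calibration with the standard binned expected calibration error (ECE), and replaces the search procedure with an exhaustive enumeration over all per-block checkpoint choices. The abstract does not specify the search strategy or the selection criterion, so both are assumptions here. All names (`assemble`, `search_combination`, the random "checkpoints") are hypothetical and used only for illustration.

```python
import itertools
import numpy as np


def softmax(z):
    # Numerically stable row-wise softmax.
    z = z - z.max(axis=1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)


def expected_calibration_error(probs, labels, n_bins=10):
    # Standard binned ECE: bin-weighted mean of |accuracy - confidence|.
    conf = probs.max(axis=1)
    correct = (probs.argmax(axis=1) == labels).astype(float)
    edges = np.linspace(0.0, 1.0, n_bins + 1)
    ece = 0.0
    for lo, hi in zip(edges[:-1], edges[1:]):
        mask = (conf > lo) & (conf <= hi)
        if mask.any():
            ece += mask.mean() * abs(correct[mask].mean() - conf[mask].mean())
    return float(ece)


def forward(blocks, x):
    # Toy network (assumption): tanh after every block except the last.
    h = x
    for w in blocks[:-1]:
        h = np.tanh(h @ w)
    return softmax(h @ blocks[-1])


def assemble(checkpoints, combo):
    # Take block i from the checkpoint saved at training stage combo[i].
    return [checkpoints[stage][i] for i, stage in enumerate(combo)]


def search_combination(checkpoints, x_val, y_val):
    # Exhaustive stand-in for the search: try every per-block stage choice
    # and keep the combination with the lowest validation ECE.
    n_stages, n_blocks = len(checkpoints), len(checkpoints[0])
    best_combo, best_ece = None, float("inf")
    for combo in itertools.product(range(n_stages), repeat=n_blocks):
        probs = forward(assemble(checkpoints, combo), x_val)
        ece = expected_calibration_error(probs, y_val)
        if ece < best_ece:
            best_combo, best_ece = combo, ece
    return best_combo, best_ece


# Synthetic stand-ins for saved training checkpoints (not real trained weights).
rng = np.random.default_rng(0)
x_val = rng.normal(size=(200, 8))
y_val = rng.integers(0, 3, size=200)
shapes = [(8, 16), (16, 16), (16, 3)]
checkpoints = [
    [rng.normal(scale=0.5 * (t + 1), size=s) for s in shapes] for t in range(4)
]

best_combo, best_ece = search_combination(checkpoints, x_val, y_val)
print("best per-block stages:", best_combo, "validation ECE:", round(best_ece, 4))
```

Because the enumeration includes the combination that takes every block from the final checkpoint, the selected combination can never have a higher validation ECE than the fully trained network on the same validation data. Exhaustive search grows exponentially with the number of blocks, so a practical method would need a cheaper search, and it would presumably also account for accuracy, which this sketch ignores.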