Selective Learning: Towards Robust Calibration with Dynamic Regularization
This addresses miscalibration in deep learning models, which is a domain-specific issue for improving reliability in predictions, though it appears incremental as it builds on existing regularization approaches.
The paper tackles the problem of miscalibration in deep learning, where models make overconfident predictions due to overfitting, by introducing Dynamic Regularization (DReg) to selectively learn from in-distribution samples and dynamically regularize outliers, resulting in a robust calibrated model with demonstrated superiority over previous methods.
Miscalibration in deep learning refers to there is a discrepancy between the predicted confidence and performance. This problem usually arises due to the overfitting problem, which is characterized by learning everything presented in the training set, resulting in overconfident predictions during testing. Existing methods typically address overfitting and mitigate the miscalibration by adding a maximum-entropy regularizer to the objective function. The objective can be understood as seeking a model that fits the ground-truth labels by increasing the confidence while also maximizing the entropy of predicted probabilities by decreasing the confidence. However, previous methods lack clear guidance on confidence adjustment, leading to conflicting objectives (increasing but also decreasing confidence). Therefore, we introduce a method called Dynamic Regularization (DReg), which aims to learn what should be learned during training thereby circumventing the confidence adjusting trade-off. At a high level, DReg aims to obtain a more reliable model capable of acknowledging what it knows and does not know. Specifically, DReg effectively fits the labels for in-distribution samples (samples that should be learned) while applying regularization dynamically to samples beyond model capabilities (e.g., outliers), thereby obtaining a robust calibrated model especially on the samples beyond model capabilities. Both theoretical and empirical analyses sufficiently demonstrate the superiority of DReg compared with previous methods.