LG · AI · May 22

Enhancing Deep Neural Network Reliability with Refinement and Calibration

Ramya Hebbalaguppe, Ajay Shastry, Soumya Suvra Ghosal, Chetan Arora

arXiv:2605.23249 · 48.4

Predicted impact: top 50% in LG · last 90 days
Originality: Incremental advance

AI Analysis

For practitioners deploying DNNs in high-stakes applications, this work improves reliability by balancing calibration and refinement, addressing a known limitation of existing calibration methods.

The paper addresses the problem of unreliable confidence estimates in deep neural networks by proposing a loss function for refinement and a unified training framework (RefCal) that jointly optimizes calibration, refinement, and accuracy. On CIFAR-100-LT with 10% imbalance, RefCal achieves (accuracy, refinement, ECE) of (58.81, 95.67, 0.08), substantially outperforming Correctness Ranking Loss (46.27, 93.7, 0.22).

Although deep neural networks (DNNs) achieve high predictive accuracy, their confidence estimates are often unreliable, potentially compromising user trust in their decisions. This has motivated research on calibrated models, where calibration measures how well a model's predicted confidence aligns with the empirical probability of correctness. However, calibration metrics can often be improved through post-processing techniques that merely mimic training-time uncertainty without genuinely improving the model's understanding. For this reason, statisticians recommend that models be not only calibrated but also refined. Intuitively, a model is considered more refined if it assigns significantly different confidence scores to correct and incorrect predictions, a property also referred to as sharpness. We observe that many existing calibration methods improve calibration at the cost of reduced refinement. To address this limitation, we propose: (1) a novel loss function that explicitly promotes refinement and can be optimized through supervised contrastive learning; and (2) a unified training framework, RefCal, that jointly optimizes calibration, refinement, and accuracy to improve DNN reliability. On the CIFAR-100-LT dataset with 10 percent class imbalance, RefCal achieves (accuracy, refinement, ECE) of (58.81, 95.67, 0.08), substantially outperforming the widely used Correctness Ranking Loss, which achieves (46.27, 93.7, 0.22).
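The two reliability measures named in the abstract can be illustrated with a minimal sketch. It computes the standard binned Expected Calibration Error (ECE) from per-sample confidences and correctness flags. For refinement it uses the AUROC of confidence as a separator of correct from incorrect predictions. That choice is an assumption made here for illustration, since the abstract does not say how refinement is scored. The function names `expected_calibration_error` and `refinement_auroc`, and the default of 10 equal-width bins, are likewise hypothetical and not taken from the paper.

```python
import numpy as np


def expected_calibration_error(confidences, correct, n_bins=10):
    """Binned ECE: sample-weighted mean of |accuracy - mean confidence| per bin."""
    confidences = np.asarray(confidences, dtype=float)
    correct = np.asarray(correct, dtype=float)
    edges = np.linspace(0.0, 1.0, n_bins + 1)
    # Interior edges split [0, 1] into n_bins intervals of the form (lo, hi].
    bin_ids = np.digitize(confidences, edges[1:-1], right=True)
    ece = 0.0
    for b in range(n_bins):
        mask = bin_ids == b
        if mask.any():
            gap = abs(correct[mask].mean() - confidences[mask].mean())
            ece += mask.mean() * gap  # mask.mean() is the fraction of samples in the bin
    return float(ece)


def refinement_auroc(confidences, correct):
    """Assumed refinement proxy: probability that a correct prediction receives
    higher confidence than an incorrect one (ties count as one half)."""
    confidences = np.asarray(confidences, dtype=float)
    correct = np.asarray(correct, dtype=bool)
    pos, neg = confidences[correct], confidences[~correct]
    if pos.size == 0 or neg.size == 0:
        return float("nan")  # undefined when only one outcome is present
    diff = pos[:, None] - neg[None, :]
    return float(((diff > 0).sum() + 0.5 * (diff == 0).sum()) / diff.size)


if __name__ == "__main__":
    conf = [0.95, 0.85, 0.75, 0.65]
    hit = [1, 1, 0, 1]
    print("ECE:", expected_calibration_error(conf, hit))
    print("Refinement (AUROC proxy):", refinement_auroc(conf, hit))
```

The sketch also shows why the two properties are distinct. A model that outputs confidence 0.75 on every sample and is right 75% of the time has an ECE of zero, yet its AUROC proxy is 0.5, because its confidence says nothing about which predictions are wrong.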
