Loss Estimators Improve Model Generalization
This work addresses the critical need for safe deployment of AI tools in clinical diagnosis by improving model generalization and uncertainty estimation. It appears incremental, however, since it builds on existing uncertainty estimation methods.
The paper tackles the problem of ensuring that AI models for clinical diagnosis generalize safely, that is, that they do not make predictions in data regimes unsupported by training evidence. It proposes training a loss estimator with a contrastive objective to directly estimate prediction uncertainties, which improves both model generalization and uncertainty calibration.
With growing interest in adopting AI methods for clinical diagnosis, a vital step towards safe deployment of such tools is to ensure that the models not only produce accurate predictions but also do not generalize to data regimes where the training data provide no meaningful evidence. Existing approaches for matching the distribution of model predictions to the true distribution rely on explicit uncertainty estimators, which are inherently hard to calibrate. In this paper, we propose to train a loss estimator alongside the predictive model, using a contrastive training objective, to directly estimate the prediction uncertainties. Interestingly, we find that, in addition to producing well-calibrated uncertainties, this approach improves the generalization behavior of the predictor. Using a dermatology use case, we show the impact of loss estimators on model generalization, in terms of both the model's fidelity on in-distribution data and its ability to detect out-of-distribution samples or new classes unseen during training.
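The abstract does not spell out the contrastive objective, so the following is only an illustrative sketch of how a loss estimator might be supervised. It assumes a pairwise margin-ranking form in which the estimator is penalized whenever it orders two samples differently from their true losses. The function names (`cross_entropy`, `contrastive_ranking_loss`, `flag_uncertain`), the margin value, and the threshold are all hypothetical and are not taken from the paper.

```python
import math


def cross_entropy(probs, label):
    """True per-sample loss of the predictor: cross-entropy for one sample."""
    return -math.log(max(probs[label], 1e-12))


def contrastive_ranking_loss(est_losses, true_losses, margin=0.1):
    """Assumed pairwise objective for the loss estimator (not the paper's exact form).

    For every pair of samples with different true losses, apply a hinge
    penalty unless the estimated losses are ordered the same way as the
    true losses, separated by at least `margin`.
    """
    total, pairs = 0.0, 0
    n = len(true_losses)
    for i in range(n):
        for j in range(i + 1, n):
            if true_losses[i] == true_losses[j]:
                continue  # no ordering information in a tie
            sign = 1.0 if true_losses[i] > true_losses[j] else -1.0
            total += max(0.0, margin - sign * (est_losses[i] - est_losses[j]))
            pairs += 1
    return total / pairs if pairs else 0.0


def flag_uncertain(est_losses, threshold):
    """Flag samples whose estimated loss exceeds a (hypothetical) threshold."""
    return [loss > threshold for loss in est_losses]


# Toy predictions for three samples whose true class is 0 in every case.
probs = [[0.9, 0.05, 0.05], [0.4, 0.35, 0.25], [0.2, 0.3, 0.5]]
labels = [0, 0, 0]
true_losses = [cross_entropy(p, y) for p, y in zip(probs, labels)]

well_ordered = [0.1, 0.8, 1.7]   # estimator agrees with the true ordering
mis_ordered = [1.7, 0.8, 0.1]    # estimator reverses the true ordering

loss_good = contrastive_ranking_loss(well_ordered, true_losses)
loss_bad = contrastive_ranking_loss(mis_ordered, true_losses)
flags = flag_uncertain(well_ordered, threshold=1.0)
```

In this toy setup the well-ordered estimates incur no penalty while the reversed estimates do, and only the sample with the highest estimated loss is flagged. In a deployed system the estimated loss would come from a learned network trained jointly with the predictor, and a high estimate could serve as a signal to abstain on out-of-distribution inputs; that wiring is left out here.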