LGMLFeb 8, 2022

Calibrated Learning to Defer with One-vs-All Classifiers

arXiv:2202.03673v279 citations
AI Analysis

This addresses the need for reliable AI systems that can defer to humans when appropriate, with incremental improvements in calibration for safety-critical applications.

The paper tackled the problem of calibration in learning-to-defer systems, finding that an existing multiclass framework produces uncalibrated probabilities, and proposed a one-vs-all classifier system that achieves calibrated probabilities without sacrificing accuracy, often outperforming the baseline in tasks like hate speech detection and skin lesion diagnosis.

The learning to defer (L2D) framework has the potential to make AI systems safer. For a given input, the system can defer the decision to a human if the human is more likely than the model to take the correct action. We study the calibration of L2D systems, investigating if the probabilities they output are sound. We find that Mozannar & Sontag's (2020) multiclass framework is not calibrated with respect to expert correctness. Moreover, it is not even guaranteed to produce valid probabilities due to its parameterization being degenerate for this purpose. We propose an L2D system based on one-vs-all classifiers that is able to produce calibrated probabilities of expert correctness. Furthermore, our loss function is also a consistent surrogate for multiclass L2D, like Mozannar & Sontag's (2020). Our experiments verify that not only is our system calibrated, but this benefit comes at no cost to accuracy. Our model's accuracy is always comparable (and often superior) to Mozannar & Sontag's (2020) model's in tasks ranging from hate speech detection to galaxy classification to diagnosis of skin lesions.

Code Implementations1 repo
Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.

Your Notes