Conformalized Credal Set Predictors
This work addresses uncertainty representation in ambiguous tasks such as natural language inference, where multiple annotations per example are common. The contribution is incremental, however, since it builds on existing conformal prediction techniques.
The paper tackles the challenge of learning credal set predictors for uncertainty representation in machine learning. It proposes a method based on conformal prediction for classification tasks in which the training data are labeled by probability distributions. The resulting conformal credal sets are guaranteed to be valid with high probability, without assumptions on the model or the distribution.
Credal sets are sets of probability distributions that are considered as candidates for an imprecisely known ground-truth distribution. In machine learning, they have recently attracted attention as an appealing formalism for uncertainty representation, in particular due to their ability to represent both the aleatoric and epistemic uncertainty in a prediction. However, the design of methods for learning credal set predictors remains a challenging problem. In this paper, we make use of conformal prediction for this purpose. More specifically, we propose a method for predicting credal sets in the classification task, given training data labeled by probability distributions. Since our method inherits the coverage guarantees of conformal prediction, our conformal credal sets are guaranteed to be valid with high probability (without any assumptions on model or distribution). We demonstrate the applicability of our method to natural language inference, a highly ambiguous natural language task where it is common to obtain multiple annotations per example.
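To make the idea of a conformal credal set concrete, the following is a minimal sketch of split conformal prediction applied to distribution-valued labels. It is an illustration under stated assumptions, not the authors' implementation: the abstract does not name a nonconformity score, so total variation distance is assumed here, and the names `tv_distance`, `conformal_quantile`, `calibrate`, and `in_credal_set` are hypothetical. As in standard split conformal prediction, the coverage argument relies on the calibration and test examples being exchangeable.

```python
import math


def tv_distance(p, q):
    """Total variation distance between two discrete distributions.

    Assumed nonconformity score; the source does not specify one.
    """
    return 0.5 * sum(abs(a - b) for a, b in zip(p, q))


def conformal_quantile(scores, alpha):
    """Finite-sample-corrected (1 - alpha) quantile of calibration scores."""
    n = len(scores)
    k = math.ceil((n + 1) * (1 - alpha))  # rank of the order statistic to use
    if k > n:
        # Too few calibration points for this alpha: the threshold is
        # infinite, so the credal set is the whole probability simplex.
        return float("inf")
    return sorted(scores)[k - 1]


def calibrate(predict, calib_inputs, calib_label_dists, alpha):
    """Compute the score threshold from held-out (input, label distribution) pairs."""
    scores = [
        tv_distance(predict(x), label_dist)
        for x, label_dist in zip(calib_inputs, calib_label_dists)
    ]
    return conformal_quantile(scores, alpha)


def in_credal_set(predict, x, candidate, threshold):
    """Membership test: is `candidate` within `threshold` of the prediction for x?"""
    return tv_distance(predict(x), candidate) <= threshold


if __name__ == "__main__":
    # Hypothetical predictor that ignores its input and returns a fixed
    # distribution over three classes.
    def predict(x):
        return [0.5, 0.3, 0.2]

    # Toy calibration labels, e.g. normalized annotator vote counts.
    calib_inputs = [0, 1, 2, 3]
    calib_label_dists = [
        [0.5, 0.3, 0.2],
        [0.6, 0.2, 0.2],
        [0.3, 0.3, 0.4],
        [0.1, 0.3, 0.6],
    ]
    threshold = calibrate(predict, calib_inputs, calib_label_dists, alpha=0.5)
    print("threshold:", threshold)
    print("candidate inside set:",
          in_credal_set(predict, 0, [0.4, 0.3, 0.3], threshold))
```

In this sketch the credal set for an input is never enumerated. It is represented implicitly as the set of all distributions on the simplex whose score relative to the model's prediction does not exceed the calibrated threshold, and `in_credal_set` checks membership for a given candidate.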