LG ML · Apr 20

Online Conformal Prediction with Adversarial Semi-bandit Feedback via Regret Minimization

Junyoung Yang, Kyungmin Kim, Sangdon Park

arXiv:2604.17984 · 35.92 citations · h-index: 3

Predicted impact: top 67% in LG · last 90 days · Originality: Incremental advance

AI Analysis

For safety-critical systems requiring online uncertainty quantification under partial feedback, this work provides the first principled method with coverage guarantees in an adversarial setting.

This paper tackles online uncertainty quantification with partial feedback, where the true label is observed only when it falls inside the prediction set. The proposed method, based on adversarial bandits, achieves long-run coverage guarantees and empirically controls the miscoverage rate while maintaining a reasonable prediction set size.

Uncertainty quantification is crucial in safety-critical systems, where decisions must be made under uncertainty. In particular, we consider the problem of online uncertainty quantification, where data points arrive sequentially. Online conformal prediction is a principled online uncertainty quantification method that dynamically constructs a prediction set at each time step. While existing methods for online conformal prediction provide long-run coverage guarantees without any distributional assumptions, they typically assume a full feedback setting in which the true label is always observed. In this paper, we propose a novel learning method for online conformal prediction with partial feedback from an adaptive adversary, a more challenging setup where the true label is revealed only when it lies inside the constructed prediction set. Specifically, we formulate online conformal prediction as an adversarial bandit problem by treating each candidate prediction set as an arm. Building on an existing algorithm for adversarial bandits, our method achieves a long-run coverage guarantee by explicitly establishing its connection to the regret of the learner. Finally, we empirically demonstrate the effectiveness of our method in both independent and identically distributed (i.i.d.) and non-i.i.d. settings, showing that it successfully controls the miscoverage rate while maintaining a reasonable size of the prediction set.
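The "each candidate prediction set as an arm" idea can be illustrated with a toy loop. This is a minimal sketch under stated assumptions, not the algorithm from the paper: it uses the textbook EXP3 update as a stand-in for the unnamed adversarial-bandit algorithm, assumes the arms are nested threshold sets over a nonconformity score in [0, 1], and uses a hypothetical loss (a fixed miss penalty versus a set-size proxy) that carries no coverage guarantee. The names `exp3_conformal`, `taus`, `eta`, and the loss itself are all illustrative. For simplicity it updates only the played arm and ignores the extra information that nested sets reveal about other arms.

```python
import math
import random


def exp3_conformal(scores, taus, alpha=0.1, eta=0.05, seed=0):
    """Toy EXP3 loop over candidate thresholds (arms).

    scores: nonconformity score of the true label in each round, in [0, 1].
            The learner only ever uses it through the covered/missed outcome.
    taus:   candidate thresholds in [0, 1]; arm k predicts
            {y : score(y) <= taus[k]}.
    Returns (empirical miscoverage rate, arm probabilities of the last round).
    """
    rng = random.Random(seed)
    num_arms = len(taus)
    cum_loss = [0.0] * num_arms  # importance-weighted loss estimates
    probs = [1.0 / num_arms] * num_arms
    misses = 0

    for s in scores:
        # Exponential weights on estimated losses, shifted by the minimum
        # so the largest weight is exactly 1 (numerical stability).
        base = min(cum_loss)
        weights = [math.exp(-eta * (loss - base)) for loss in cum_loss]
        total = sum(weights)
        probs = [w / total for w in weights]

        # Play one arm, i.e. one candidate prediction set.
        k = rng.choices(range(num_arms), weights=probs)[0]

        # Partial feedback: we only learn whether the label was covered.
        miss = s > taus[k]
        misses += int(miss)

        # Hypothetical loss in [0, 1] (an assumption, not the paper's loss):
        # a fixed penalty on a miss, otherwise a penalty growing with set size.
        loss = (1.0 - alpha) if miss else alpha * taus[k]

        # Unbiased importance-weighted estimate for the played arm only.
        cum_loss[k] += loss / probs[k]

    return misses / len(scores), probs
```

A call such as `exp3_conformal(scores, taus=[i / 10 for i in range(11)])` returns the fraction of rounds in which the true label fell outside the played set, together with the arm distribution from the final round. Tying that fraction to a target level is exactly what the regret-based analysis described in the abstract is for; the sketch makes no such claim.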
