Online Learning with Set-Valued Feedback
This work addresses theoretical gaps in online learning when feedback is ambiguous, as in multilabel ranking, and makes an incremental contribution to foundational ML theory.
The paper tackles online multiclass classification with set-valued feedback, showing that deterministic and randomized online learnability are not equivalent in the realizable setting, and introduces new combinatorial dimensions (Set Littlestone and Measure Shattering) that tightly characterize learnability and quantify minimax regret.
We study a variant of online multiclass classification where the learner predicts a single label but receives a \textit{set of labels} as feedback. In this model, the learner is penalized whenever its predicted label is not contained in the revealed set. We show that, unlike online multiclass learning with single-label feedback, deterministic and randomized online learnability are \textit{not equivalent} even in the realizable setting with set-valued feedback. Accordingly, we give two new combinatorial dimensions, named the Set Littlestone and Measure Shattering dimensions, that tightly characterize deterministic and randomized online learnability respectively in the realizable setting. In addition, we show that the Measure Shattering dimension characterizes online learnability in the agnostic setting and tightly quantifies the minimax regret. Finally, we use our results to establish bounds on the minimax regret for three practical learning settings: online multilabel ranking, online multilabel classification, and real-valued prediction with interval-valued response.
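To make the feedback model concrete, the following is a minimal Python sketch of the protocol described above: in each round the learner outputs one label, a set of labels is revealed, and the loss is 1 exactly when the prediction falls outside that set. The learner shown here is a generic plurality vote over a finite version space. It is an illustrative assumption and not the algorithm analyzed in the paper; the names `set_loss` and `run_version_space_learner`, the finite hypothesis class, and the toy stream are all hypothetical.

```python
from collections import Counter


def set_loss(prediction, revealed_set):
    """0-1 loss under set-valued feedback: 1 if the prediction is outside the revealed set."""
    return 0 if prediction in revealed_set else 1


def run_version_space_learner(hypotheses, stream):
    """Play the online protocol with a plurality-vote version-space learner.

    hypotheses: finite list of functions mapping an instance to a label (assumed).
    stream: iterable of (x, revealed_set) pairs, assumed realizable, i.e. some
            hypothesis always predicts a label inside the revealed set.
    Returns the total number of mistakes.
    """
    version_space = list(hypotheses)
    mistakes = 0
    for x, revealed_set in stream:
        # Predict the label chosen by the most surviving hypotheses.
        votes = Counter(h(x) for h in version_space)
        prediction = votes.most_common(1)[0][0]
        mistakes += set_loss(prediction, revealed_set)
        # Keep only hypotheses whose label lies inside the revealed set.
        version_space = [h for h in version_space if h(x) in revealed_set]
    return mistakes


# Toy realizable stream over labels {0, 1, 2} (hypothetical data).
hypotheses = [lambda x, c=c: (x + c) % 3 for c in range(3)]
target = hypotheses[2]
stream = [(x, {target(x), (target(x) + 1) % 3}) for x in range(5)]
mistakes = run_version_space_learner(hypotheses, stream)
print(mistakes)
```

In this sketch every mistake eliminates at least one hypothesis (all those that voted for the wrong label), while a hypothesis consistent with the stream is never eliminated, so the mistake count is at most the class size minus one. Note also that several hypotheses can remain consistent with the same revealed sets, which is one way set-valued feedback differs from single-label feedback.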