LGJun 24, 2025

Ambiguous Online Learning

arXiv:2506.19810v1h-index: 3

Originality Incremental advance

AI Analysis

This work addresses a theoretical problem in online learning with applications to multivalued systems, recommendation algorithms, and compression, but it appears incremental as it extends existing concepts like apple tasting.

The paper tackles the problem of ambiguous online learning, where learners can produce multiple predicted labels, and shows that optimal mistake bounds for any hypothesis class fall into one of three categories: constant, square root of the number of rounds, or linear in the number of rounds.

We propose a new variant of online learning that we call "ambiguous online learning". In this setting, the learner is allowed to produce multiple predicted labels. Such an "ambiguous prediction" is considered correct when at least one of the labels is correct, and none of the labels are "predictably wrong". The definition of "predictably wrong" comes from a hypothesis class in which hypotheses are also multi-valued. Thus, a prediction is "predictably wrong" if it's not allowed by the (unknown) true hypothesis. In particular, this setting is natural in the context of multivalued dynamical systems, recommendation algorithms and lossless compression. It is also strongly related to so-called "apple tasting". We show that in this setting, there is a trichotomy of mistake bounds: up to logarithmic factors, any hypothesis class has an optimal mistake bound of either Theta(1), Theta(sqrt(N)) or N.

View on arXiv PDF

Similar