Apple Tasting: Combinatorial Dimensions and Minimax Rates
This work revisits apple tasting, a classical partial-feedback problem in online learning, and characterizes its learnability and minimax rates in terms of combinatorial dimensions.
The paper studies online binary classification under apple tasting feedback, where the learner observes the true label only when it predicts '1'. It shows that the Littlestone dimension characterizes learnability in the agnostic setting, and that a new combinatorial parameter, the Effective width, yields a trichotomy in the realizable setting: the minimax expected number of mistakes is Θ(1), Θ(√T), or Θ(T).
In online binary classification under \emph{apple tasting} feedback, the learner only observes the true label if it predicts ``1''. We revisit this classical partial-feedback setting, first studied by \cite{helmbold2000apple}, and study online learnability from a combinatorial perspective. We show that the Littlestone dimension continues to provide a tight quantitative characterization of apple tasting in the agnostic setting, closing an open question posed by \cite{helmbold2000apple}. In addition, we give a new combinatorial parameter, called the Effective width, that tightly quantifies the minimax expected number of mistakes in the realizable setting. As a corollary, we use the Effective width to establish a \emph{trichotomy} of the minimax expected number of mistakes in the realizable setting. In particular, we show that under apple tasting feedback, this quantity can only be $\Theta(1)$, $\Theta(\sqrt{T})$, or $\Theta(T)$. This is in contrast to the full-information realizable setting, where only $\Theta(1)$ and $\Theta(T)$ are possible.
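The feedback model described above can be illustrated with a minimal Python sketch. It is not the paper's algorithm. The names `apple_tasting_feedback`, `ConstantLearner`, and `run_protocol` are hypothetical and introduced only for illustration. The sketch shows a single point: mistakes are counted in every round, but the learner sees the true label only in rounds where it predicts 1.

```python
from typing import Iterable, List, Optional, Tuple


def apple_tasting_feedback(prediction: int, true_label: int) -> Optional[int]:
    """Reveal the true label only if the learner predicted 1; otherwise reveal nothing."""
    return true_label if prediction == 1 else None


class ConstantLearner:
    """Hypothetical baseline learner that always predicts the same bit."""

    def __init__(self, bit: int) -> None:
        self.bit = bit
        self.observed: List[Optional[int]] = []  # feedback received in each round

    def predict(self, x: int) -> int:
        return self.bit

    def update(self, x: int, feedback: Optional[int]) -> None:
        # A real learner would update its hypothesis here; this one only logs feedback.
        self.observed.append(feedback)


def run_protocol(learner: ConstantLearner, stream: Iterable[Tuple[int, int]]) -> int:
    """Play the online game over (instance, label) pairs and return the number of mistakes."""
    mistakes = 0
    for x, y in stream:
        y_hat = learner.predict(x)
        mistakes += int(y_hat != y)  # counted even when the learner never sees y
        learner.update(x, apple_tasting_feedback(y_hat, y))
    return mistakes


stream = [(0, 1), (1, 0), (2, 1), (3, 0)]
always_one = ConstantLearner(1)
always_zero = ConstantLearner(0)
mistakes_one = run_protocol(always_one, stream)
mistakes_zero = run_protocol(always_zero, stream)
```

On this toy stream, the learner that always predicts 1 observes every label, while the learner that always predicts 0 observes none, even though it also makes mistakes. This asymmetry is what separates apple tasting from the full-information setting.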