LGMar 9, 2017

Online Learning with Abstention

arXiv:1703.03478v353 citations
AI Analysis

This work addresses the challenge of abstention in online learning, which is incremental as it builds on prior algorithms but introduces a novel method for handling time-varying feedback in stochastic contexts.

The paper tackles the problem of online learning with abstention, where algorithms can skip predictions, by adapting existing methods for adversarial settings and introducing a new algorithm, UCB-GT, for stochastic settings with time-varying feedback graphs, showing it outperforms baselines with more favorable regret guarantees.

We present an extensive study of the key problem of online learning where algorithms are allowed to abstain from making predictions. In the adversarial setting, we show how existing online algorithms and guarantees can be adapted to this problem. In the stochastic setting, we first point out a bias problem that limits the straightforward extension of algorithms such as UCB-N to time-varying feedback graphs, as needed in this context. Next, we give a new algorithm, UCB-GT, that exploits historical data and is adapted to time-varying feedback graphs. We show that this algorithm benefits from more favorable regret guarantees than a possible, but limited, extension of UCB-N. We further report the results of a series of experiments demonstrating that UCB-GT largely outperforms that extension of UCB-N, as well as more standard baselines.

Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.

Your Notes