LGMar 9, 2017

Online Learning with Abstention

Corinna Cortes, Giulia DeSalvo, Claudio Gentile, Mehryar Mohri, Scott Yang

arXiv:1703.03478v312.254 citations

Originality Incremental advance

AI Analysis

This work addresses the challenge of abstention in online learning, which is incremental as it builds on prior algorithms but introduces a novel method for handling time-varying feedback in stochastic contexts.

The paper tackles the problem of online learning with abstention, where algorithms can skip predictions, by adapting existing methods for adversarial settings and introducing a new algorithm, UCB-GT, for stochastic settings with time-varying feedback graphs, showing it outperforms baselines with more favorable regret guarantees.

We present an extensive study of the key problem of online learning where algorithms are allowed to abstain from making predictions. In the adversarial setting, we show how existing online algorithms and guarantees can be adapted to this problem. In the stochastic setting, we first point out a bias problem that limits the straightforward extension of algorithms such as UCB-N to time-varying feedback graphs, as needed in this context. Next, we give a new algorithm, UCB-GT, that exploits historical data and is adapted to time-varying feedback graphs. We show that this algorithm benefits from more favorable regret guarantees than a possible, but limited, extension of UCB-N. We further report the results of a series of experiments demonstrating that UCB-GT largely outperforms that extension of UCB-N, as well as more standard baselines.

View on arXiv PDF

Similar