ML LG STFeb 27, 2017

Algorithmic Chaining and the Role of Partial Feedback in Online Nonparametric Learning

Nicolò Cesa-Bianchi, Pierre Gaillard, Claudio Gentile, Sébastien Gerchinovitz

arXiv:1702.08211v214.820 citations

Originality Incremental advance

AI Analysis

This work addresses challenges in online learning for applications like auctions, though it appears incremental as it builds on existing feedback models with algorithmic refinements.

The paper tackles contextual online learning with nonparametric classes under varying feedback assumptions, achieving the first explicit algorithm with minimax regret rates for full information and Lipschitz losses, and improving regret bounds for partial feedback models like second-price auctions.

We investigate contextual online learning with nonparametric (Lipschitz) comparison classes under different assumptions on losses and feedback information. For full information feedback and Lipschitz losses, we design the first explicit algorithm achieving the minimax regret rate (up to log factors). In a partial feedback model motivated by second-price auctions, we obtain algorithms for Lipschitz and semi-Lipschitz losses with regret bounds improving on the known bounds for standard bandit feedback. Our analysis combines novel results for contextual second-price auctions with a novel algorithmic approach based on chaining. When the context space is Euclidean, our chaining approach is efficient and delivers an even better regret bound.

View on arXiv PDF

Similar