LGPRMLMar 15, 2022

Efficient and Optimal Fixed-Time Regret with Two Experts

arXiv:2203.07577v14 citationsh-index: 27
Originality Incremental advance
AI Analysis

This provides an efficient and optimal solution for online learning with two experts, addressing a specific bottleneck in fixed-time regret minimization.

The paper tackles the problem of prediction with expert advice for two experts with costs in [0,1], proposing an algorithm that achieves optimal regret of sqrt(T/(2π)) + O(1) with O(1) processing time per turn, improving over prior methods that had O(T^2) pre-processing time.

Prediction with expert advice is a foundational problem in online learning. In instances with $T$ rounds and $n$ experts, the classical Multiplicative Weights Update method suffers at most $\sqrt{(T/2)\ln n}$ regret when $T$ is known beforehand. Moreover, this is asymptotically optimal when both $T$ and $n$ grow to infinity. However, when the number of experts $n$ is small/fixed, algorithms with better regret guarantees exist. Cover showed in 1967 a dynamic programming algorithm for the two-experts problem restricted to $\{0,1\}$ costs that suffers at most $\sqrt{T/2π} + O(1)$ regret with $O(T^2)$ pre-processing time. In this work, we propose an optimal algorithm for prediction with two experts' advice that works even for costs in $[0,1]$ and with $O(1)$ processing time per turn. Our algorithm builds up on recent work on the experts problem based on techniques and tools from stochastic calculus.

Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.

Your Notes