LG | May 24, 2023

On the Minimax Regret for Online Learning with Feedback Graphs

arXiv:2305.15383v2 | 8 citations
Originality: Incremental advance
AI Analysis

This work narrows a theoretical gap in online learning with feedback graphs. Its refined regret bounds are incremental but important for understanding how the structure of the feedback graph affects the achievable regret.

The paper improves the regret bounds for online learning with strongly observable undirected feedback graphs. It proves an upper bound of O(√(αT(1+ln(K/α)))), which matches the known lower bounds in the bandits and experts cases, and an improved lower bound of Ω(√(αT(ln K)/(ln α))) for α > 1.
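To get a feel for how these rates compare, the following minimal sketch evaluates them numerically with all constant factors dropped. The function names are hypothetical and chosen only for illustration; α is the independence number, K the number of actions, and T the time horizon, as in the abstract below.

```python
import math


def previous_upper_rate(alpha: float, K: int, T: int) -> float:
    """Previously best known upper rate: sqrt(alpha * T * ln K), constants dropped."""
    return math.sqrt(alpha * T * math.log(K))


def improved_upper_rate(alpha: float, K: int, T: int) -> float:
    """Improved upper rate: sqrt(alpha * T * (1 + ln(K / alpha))), constants dropped."""
    return math.sqrt(alpha * T * (1.0 + math.log(K / alpha)))


def improved_lower_rate(alpha: float, K: int, T: int) -> float:
    """Improved lower rate: sqrt(alpha * T * ln K / ln alpha), stated only for alpha > 1."""
    if alpha <= 1:
        raise ValueError("the lower rate is stated only for alpha > 1")
    return math.sqrt(alpha * T * math.log(K) / math.log(alpha))


if __name__ == "__main__":
    K, T = 1024, 100_000
    for alpha in (2, 16, 128, 1024):
        print(
            alpha,
            round(previous_upper_rate(alpha, K, T)),
            round(improved_upper_rate(alpha, K, T)),
            round(improved_lower_rate(alpha, K, T)),
        )
```

At α = K both improved rates reduce to √(KT), the bandit rate. Because constants are dropped, the printed numbers only indicate how each expression scales and should not be read as actual regret values.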

In this work, we improve on the upper and lower bounds for the regret of online learning with strongly observable undirected feedback graphs. The best known upper bound for this problem is $\mathcal{O}\bigl(\sqrt{αT\ln K}\bigr)$, where $K$ is the number of actions, $α$ is the independence number of the graph, and $T$ is the time horizon. The $\sqrt{\ln K}$ factor is known to be necessary when $α = 1$ (the experts case). On the other hand, when $α = K$ (the bandits case), the minimax rate is known to be $Θ\bigl(\sqrt{KT}\bigr)$, and a lower bound $Ω\bigl(\sqrt{αT}\bigr)$ is known to hold for any $α$. Our improved upper bound $\mathcal{O}\bigl(\sqrt{αT(1+\ln(K/α))}\bigr)$ holds for any $α$ and matches the lower bounds for bandits and experts, while interpolating intermediate cases. To prove this result, we use FTRL with $q$-Tsallis entropy for a carefully chosen value of $q \in [1/2, 1)$ that varies with $α$. The analysis of this algorithm requires a new bound on the variance term in the regret. We also show how to extend our techniques to time-varying graphs, without requiring prior knowledge of their independence numbers. Our upper bound is complemented by an improved $Ω\bigl(\sqrt{αT(\ln K)/(\ln α)}\bigr)$ lower bound for all $α > 1$, whose analysis relies on a novel reduction to multitask learning. This shows that a logarithmic factor is necessary as soon as $α < K$.
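The FTRL step mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' exact algorithm: `q` and `eta` are passed in as plain constants rather than tuned as in the paper, the graph is assumed to be undirected with a self-loop at every node, and all function and parameter names are hypothetical. The regularizer used here is the Tsallis entropy $(1 - \sum_i p_i^q)/(1-q)$, and the loss estimates are the standard importance-weighted ones, where each observed loss is divided by the probability of observing it.

```python
import random
from typing import Callable, Dict, List, Sequence


def tsallis_ftrl_distribution(
    cum_losses: Sequence[float], eta: float, q: float, iters: int = 100
) -> List[float]:
    """Minimize eta * <L, p> + (1 - sum_i p_i**q) / (1 - q) over the simplex, 0 < q < 1.

    Stationarity gives p_i = (c * (eta * L_i + lam)) ** (-1 / (1 - q)) with
    c = (1 - q) / q; the multiplier lam is found by bisection.
    """
    K = len(cum_losses)
    c = (1.0 - q) / q
    base = min(cum_losses)
    # Shifting by the smallest loss is absorbed into the multiplier.
    shifted = [eta * (loss - base) for loss in cum_losses]

    def probs(mu: float) -> List[float]:
        return [(c * (s + mu)) ** (-1.0 / (1.0 - q)) for s in shifted]

    # At mu = 1/c the best action alone has mass 1 (sum >= 1);
    # at mu = K**(1-q)/c every action has mass <= 1/K (sum <= 1).
    lo, hi = 1.0 / c, K ** (1.0 - q) / c
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if sum(probs(mid)) > 1.0:
            lo = mid
        else:
            hi = mid
    p = probs(0.5 * (lo + hi))
    total = sum(p)
    return [x / total for x in p]  # renormalize away bisection error


def run_ftrl(
    neighbors: Dict[int, List[int]],
    loss_fn: Callable[[int], Sequence[float]],
    T: int,
    eta: float,
    q: float,
    rng: random.Random,
) -> float:
    """Play T rounds; neighbors[i] must contain i (self-loops) and be symmetric."""
    K = len(neighbors)
    cum_estimates = [0.0] * K
    total_loss = 0.0
    for t in range(T):
        p = tsallis_ftrl_distribution(cum_estimates, eta, q)
        action = rng.choices(range(K), weights=p)[0]
        losses = loss_fn(t)  # losses assumed to lie in [0, 1]
        total_loss += losses[action]
        for i in neighbors[action]:  # actions whose loss is revealed this round
            observe_prob = sum(p[j] for j in neighbors[i])
            cum_estimates[i] += losses[i] / observe_prob
    return total_loss
```

For example, on a three-node path graph with `neighbors = {0: [0, 1], 1: [0, 1, 2], 2: [1, 2]}`, calling `run_ftrl(neighbors, lambda t: [0.1, 0.9, 0.9], 500, 0.05, 0.5, random.Random(0))` returns the learner's cumulative loss over 500 rounds. The values `eta = 0.05` and `q = 0.5` are arbitrary illustrative choices, not the tuned values from the paper.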

