LGMay 6, 2023

An improved regret analysis for UCB-N and TS-N

arXiv:2305.04093v12.0

Originality Synthesis-oriented

AI Analysis

This work provides a tighter theoretical bound for algorithms in online learning, which is incremental as it refines prior analysis without introducing new methods.

The authors improved the pseudo-regret analysis for UCB-N and TS-N algorithms in stochastic online learning with undirected feedback graphs, replacing a log(T) factor with log2(α) + 3, where α is the independence number of the graph.

In the setting of stochastic online learning with undirected feedback graphs, Lykouris et al. (2020) previously analyzed the pseudo-regret of the upper confidence bound-based algorithm UCB-N and the Thompson Sampling-based algorithm TS-N. In this note, we show how to improve their pseudo-regret analysis. Our improvement involves refining a key lemma of the previous analysis, allowing a $\log(T)$ factor to be replaced by a factor $\log_2(α) + 3$ for $α$ the independence number of the feedback graph.

View on arXiv PDF

Similar