LG OC MLNov 14, 2025

A Best-of-Both-Worlds Proof for Tsallis-INF without Fenchel Conjugates

arXiv:2511.11211v17.11 citationsh-index: 2

Originality Synthesis-oriented

AI Analysis

This is an incremental improvement for researchers in online learning and bandit algorithms, focusing on proof simplification rather than new algorithmic results.

The paper tackles simplifying the proof for the Tsallis-INF multi-armed bandit algorithm by deriving a best-of-both-worlds guarantee without using Fenchel conjugates, resulting in a slimmer proof that avoids optimizing constants.

In this short note, we present a simple derivation of the best-of-both-world guarantee for the Tsallis-INF multi-armed bandit algorithm from J. Zimmert and Y. Seldin. Tsallis-INF: An optimal algorithm for stochastic and adversarial bandits. Journal of Machine Learning Research, 22(28):1-49, 2021. URL https://jmlr.csail.mit.edu/papers/volume22/19-753/19-753.pdf. In particular, the proof uses modern tools from online convex optimization and avoid the use of conjugate functions. Also, we do not optimize the constants in the bounds in favor of a slimmer proof.

View on arXiv PDF

Similar