LGAIOct 4, 2023

$(ε, u)$-Adaptive Regret Minimization in Heavy-Tailed Bandits

arXiv:2310.02975v2h-index: 9
Originality Incremental advance
AI Analysis

This addresses the problem of robust decision-making in uncertain environments like finance for researchers, but it is incremental as it builds on prior heavy-tailed bandit work.

The paper tackles regret minimization in heavy-tailed bandits with unknown moment parameters, showing that adaptation incurs a cost and proposing AdaR-UCB, an algorithm that achieves regret guarantees nearly matching the non-adaptive case.

Heavy-tailed distributions naturally arise in several settings, from finance to telecommunications. While regret minimization under subgaussian or bounded rewards has been widely studied, learning with heavy-tailed distributions only gained popularity over the last decade. In this paper, we consider the setting in which the reward distributions have finite absolute raw moments of maximum order $1+ε$, uniformly bounded by a constant $u<+\infty$, for some $ε\in (0,1]$. In this setting, we study the regret minimization problem when $ε$ and $u$ are unknown to the learner and it has to adapt. First, we show that adaptation comes at a cost and derive two negative results proving that the same regret guarantees of the non-adaptive case cannot be achieved with no further assumptions. Then, we devise and analyze a fully data-driven trimmed mean estimator and propose a novel adaptive regret minimization algorithm, AdaR-UCB, that leverages such an estimator. Finally, we show that AdaR-UCB is the first algorithm that, under a known distributional assumption, enjoys regret guarantees nearly matching those of the non-adaptive heavy-tailed case.

Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.

Your Notes