ML · Mar 27, 2017

A Scale Free Algorithm for Stochastic Bandits with Bounded Kurtosis

arXiv:1703.08937v1 · 21 citations
Originality: Incremental advance
AI Analysis

This work addresses the need for bandit algorithms that remain robust when the noise distribution is unknown. The advance is incremental, since it builds on earlier results for specialised cases.

The paper designs a scale-free algorithm for stochastic bandits. It generalises earlier specialised results to a non-parametric setup in which the learner knows only a bound on the kurtosis of the noise, so its guarantees do not require prior knowledge of any scale parameter.

Existing strategies for finite-armed stochastic bandits mostly depend on a parameter of scale that must be known in advance. Sometimes this is in the form of a bound on the payoffs, or the knowledge of a variance or subgaussian parameter. The notable exceptions are the analysis of Gaussian bandits with unknown mean and variance by Cowan and Katehakis [2015] and of uniform distributions with unknown support [Cowan and Katehakis, 2015]. The results derived in these specialised cases are generalised here to the non-parametric setup, where the learner knows only a bound on the kurtosis of the noise, which is a scale free measure of the extremity of outliers.
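To illustrate why kurtosis counts as a scale-free quantity, the sketch below estimates the kurtosis (fourth central moment divided by the squared variance) of simulated noise before and after an affine rescaling. It is not taken from the paper: the helper name `kurtosis`, the sample size, the seed, and the rescaling constants are all illustrative assumptions.

```python
import random


def kurtosis(samples):
    """Empirical kurtosis: fourth central moment over squared variance.

    Hypothetical helper for illustration; uses the plain (biased) moment
    estimators rather than any estimator from the paper.
    """
    n = len(samples)
    mean = sum(samples) / n
    centered = [x - mean for x in samples]
    m2 = sum(c ** 2 for c in centered) / n  # empirical variance
    m4 = sum(c ** 4 for c in centered) / n  # empirical fourth central moment
    return m4 / (m2 * m2)


rng = random.Random(0)  # fixed seed (assumption) so the run is repeatable

# Gaussian noise has kurtosis 3; uniform noise has kurtosis 9/5.
gaussian = [rng.gauss(0.0, 1.0) for _ in range(100_000)]
uniform = [rng.uniform(0.0, 1.0) for _ in range(100_000)]

# Shift and rescale the Gaussian sample: the variance changes by a factor
# of 10^6, but the kurtosis should be unchanged up to rounding error.
rescaled = [1000.0 * x + 5.0 for x in gaussian]

k_gaussian = kurtosis(gaussian)
k_rescaled = kurtosis(rescaled)
k_uniform = kurtosis(uniform)

print("gaussian:", k_gaussian)
print("rescaled gaussian:", k_rescaled)
print("uniform:", k_uniform)
```

Because both the numerator and the denominator scale with the fourth power of the multiplicative constant, a bound on the kurtosis constrains how extreme outliers can be relative to the spread without revealing the spread itself. The Gaussian and uniform families named above each have a fixed kurtosis regardless of their parameters, which is consistent with the abstract's description of them as special cases of the bounded-kurtosis setup.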

