ML · LG · Sep 9, 2021

Extreme Bandits using Robust Statistics

arXiv:2109.04433v1 · 8 citations
AI Analysis

This paper addresses a variant of the multi-armed bandit problem for applications where extreme rewards, rather than average rewards, matter. It represents an incremental advance.

The paper tackles the multi-armed bandit problem focusing on extreme values rather than expected values, proposing distribution-free algorithms using robust statistics that achieve vanishing extremal regret under weaker conditions and demonstrate superior performance in numerical experiments.

We consider a multi-armed bandit problem motivated by situations where only the extreme values, as opposed to expected values in the classical bandit setting, are of interest. We propose distribution-free algorithms using robust statistics and characterize their statistical properties. We show that the provided algorithms achieve vanishing extremal regret under weaker conditions than existing algorithms. Performance of the algorithms is demonstrated for the finite-sample setting using numerical experiments. The results show superior performance of the proposed algorithms compared to well-known algorithms.
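To make the objective concrete, here is a minimal Python sketch of how extremal regret could be estimated by simulation. It is not the paper's algorithm. It assumes the definition of extremal regret commonly used in the extreme-bandit literature (the best single arm's expected maximum over the horizon minus the expected maximum collected by the policy), uses Pareto-distributed arms as a stand-in for heavy-tailed rewards, and uses a hypothetical quantile-following heuristic (`quantile_policy_max`) purely to illustrate the "robust statistics" idea. All function names and parameters are illustrative.

```python
import random


def pull(alpha, rng):
    """Draw one reward from a Pareto arm; a smaller alpha means a heavier tail."""
    return rng.paretovariate(alpha)


def single_arm_max(alpha, horizon, rng):
    """Largest reward seen when one arm is pulled for the whole horizon."""
    return max(pull(alpha, rng) for _ in range(horizon))


def quantile_policy_max(alphas, horizon, rng, explore=20, q=0.9):
    """Illustrative heuristic (an assumption, not the paper's method):
    explore every arm `explore` times, then always pull the arm whose
    empirical q-quantile is currently largest. Returns the largest reward seen."""
    if horizon < explore * len(alphas):
        raise ValueError("horizon too short for the exploration phase")
    samples = [[pull(a, rng) for _ in range(explore)] for a in alphas]
    best_seen = max(max(s) for s in samples)
    for _ in range(horizon - explore * len(alphas)):
        # Order statistics are robust to single outliers, unlike the sample max.
        scores = [sorted(s)[int(q * (len(s) - 1))] for s in samples]
        k = scores.index(max(scores))
        x = pull(alphas[k], rng)
        samples[k].append(x)
        best_seen = max(best_seen, x)
    return best_seen


def estimate_extremal_regret(alphas, horizon, runs=200, seed=0):
    """Monte Carlo estimate of (best single-arm expected max) - (policy expected max)."""
    rng = random.Random(seed)
    oracle = max(
        sum(single_arm_max(a, horizon, rng) for _ in range(runs)) / runs
        for a in alphas
    )
    policy = sum(
        quantile_policy_max(alphas, horizon, rng) for _ in range(runs)
    ) / runs
    return oracle - policy
```

Calling `estimate_extremal_regret([1.5, 3.0], horizon=300)` gives a noisy estimate for a two-armed instance. Because the rewards are heavy-tailed, the estimate has high variance and can even be negative for a small number of runs.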

