MLLGMar 13, 2018

Pure Exploration in Infinitely-Armed Bandit Models with Fixed-Confidence

arXiv:1803.04665v137 citations
Originality Incremental advance
AI Analysis

This work extends bandit models to broader problems with fewer assumptions, though it is incremental in refining existing frameworks.

The paper tackles the problem of identifying a near-optimal arm in infinitely-armed bandit models with fixed confidence, without assumptions about the arm distribution, by deriving a sample complexity lower bound and proposing an algorithm that achieves this with high probability, matching the lower bound up to a log factor.

We consider the problem of near-optimal arm identification in the fixed confidence setting of the infinitely armed bandit problem when nothing is known about the arm reservoir distribution. We (1) introduce a PAC-like framework within which to derive and cast results; (2) derive a sample complexity lower bound for near-optimal arm identification; (3) propose an algorithm that identifies a nearly-optimal arm with high probability and derive an upper bound on its sample complexity which is within a log factor of our lower bound; and (4) discuss whether our log^2(1/delta) dependence is inescapable for "two-phase" (select arms first, identify the best later) algorithms in the infinite setting. This work permits the application of bandit models to a broader class of problems where fewer assumptions hold.

Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.

Your Notes