ML · LG · Nov 28, 2018

Mixture Martingales Revisited with Applications to Sequential Tests and Confidence Intervals

arXiv:1811.11419v2 · 151 citations
Originality: Incremental advance
AI Analysis

This work provides theoretical tools for sequential decision-making in bandit problems. The advance is incremental, but it yields specific improvements in confidence interval construction.

The paper tackles the problem of deriving uniform deviation inequalities for adaptive sampling in multi-armed bandits, using Kullback-Leibler divergence in exponential families, and applies these to analyze sequential tests and construct tight confidence intervals for arm means.

This paper presents new deviation inequalities that are valid uniformly in time under adaptive sampling in a multi-armed bandit model. The deviations are measured using the Kullback-Leibler divergence in a given one-dimensional exponential family, and may take into account several arms at a time. They are obtained by constructing for each arm a mixture martingale based on a hierarchical prior, and by multiplying those martingales. Our deviation inequalities allow us to analyze stopping rules based on generalized likelihood ratios for a large class of sequential identification problems, and to construct tight confidence intervals for some functions of the means of the arms.
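As a rough illustration of the mixture-martingale idea, the sketch below uses the classical Gaussian method of mixtures rather than the hierarchical prior and exponential-family Kullback-Leibler deviations of the paper. It assumes unit-variance Gaussian rewards and a Gaussian prior with precision `c` on the tilting parameter. The function names and the parameter `c` are hypothetical, chosen only for this example. Here `s` is the centered sum of an arm's rewards (sum of observations minus `t` times the true mean) and `t` is that arm's number of draws.

```python
import math


def log_mixture_martingale(s, t, c=1.0):
    """Log of M_t = sqrt(c / (t + c)) * exp(s^2 / (2 (t + c))).

    M_t is the average of exp(lam * s - t * lam^2 / 2) over lam ~ N(0, 1/c),
    a nonnegative martingale with M_0 = 1 for unit-variance Gaussian rewards.
    """
    return 0.5 * math.log(c / (t + c)) + s * s / (2.0 * (t + c))


def confidence_radius(t, delta, c=1.0):
    """Half-width r(t) for the empirical mean after t >= 1 draws.

    By Ville's inequality, M_t stays below 1/delta for all t simultaneously
    with probability at least 1 - delta, which is equivalent to
    |empirical mean - true mean| <= r(t) for every t >= 1.
    """
    log_term = math.log(1.0 / delta) + 0.5 * math.log((t + c) / c)
    return math.sqrt(2.0 * (t + c) * log_term) / t


def product_log_martingale(stats, c=1.0):
    """Log of the product of per-arm mixture martingales.

    stats is a list of (s, t) pairs, one per arm. Multiplying the per-arm
    martingales corresponds to summing their logarithms.
    """
    return sum(log_mixture_martingale(s, t, c) for s, t in stats)
```

Comparing `product_log_martingale` against `math.log(1 / delta)` gives a deviation statement covering several arms at once, which mirrors, in a much simpler setting, the abstract's step of multiplying per-arm martingales.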

