ML LG Oct 30, 2023

Dual-Directed Algorithm Design for Efficient Pure Exploration

arXiv:2310.19319v3, 4 citations, h-index: 2
AI Analysis

This work addresses efficient adaptive experimentation in pure-exploration problems. Its framework extends the top-two approach beyond best-arm identification, resolves an open question about top-two Thompson sampling, and applies to a broad class of pure-exploration problems.

The paper tackles the design of efficient algorithms for pure-exploration problems beyond best-arm identification, such as thresholding bandits and ε-best-arm identification. It introduces a unified algorithm design principle and Information-Directed Selection, which yield asymptotically optimal algorithms for these problems and a proof that top-two Thompson sampling combined with Information-Directed Selection is asymptotically optimal for Gaussian best-arm identification.

While experimental design often focuses on selecting the single best alternative from a finite set (e.g., in ranking and selection or best-arm identification), many pure-exploration problems pursue richer goals. Given a specific goal, adaptive experimentation aims to achieve it by strategically allocating sampling effort, with the underlying sample complexity characterized by a maximin optimization problem. By introducing dual variables, we derive necessary and sufficient conditions for an optimal allocation, yielding a unified algorithm design principle that extends the top-two approach beyond best-arm identification. This principle gives rise to Information-Directed Selection, a hyperparameter-free rule that dynamically evaluates and chooses among candidates based on their current informational value. We prove that, when combined with Information-Directed Selection, top-two Thompson sampling attains asymptotic optimality for Gaussian best-arm identification, resolving a notable open question in the pure-exploration literature. Furthermore, our framework produces asymptotically optimal algorithms for pure-exploration thresholding bandits and $\varepsilon$-best-arm identification (i.e., ranking and selection with probability-of-good-selection guarantees), and more generally establishes a recipe for adapting Thompson sampling across a broad class of pure-exploration problems. Extensive numerical experiments highlight the efficiency of our proposed algorithms compared to existing methods.
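To make the abstract's ingredients concrete, the following is a minimal sketch, not the authors' implementation, of top-two Thompson sampling with an information-directed choice between leader and challenger for Gaussian best-arm identification. It rests on several assumptions that are not stated on this page: a known common variance `sigma`, a flat prior (so the posterior of arm `a` is normal with mean equal to its sample mean and variance `sigma**2 / counts[a]`), the standard Gaussian pairwise evidence rate inside the maximin objective, and the equal-variance Gaussian form of the selection probability, `p[challenger] / (p[leader] + p[challenger])`. All function names are hypothetical.

```python
import numpy as np


def gaussian_pair_rate(theta, p, i, j, sigma=1.0):
    """Evidence rate for separating arms i and j under allocation p
    (assumed form for Gaussian rewards with known common variance)."""
    gap = theta[i] - theta[j]
    return gap ** 2 / (2.0 * sigma ** 2 * (1.0 / p[i] + 1.0 / p[j]))


def maximin_objective(theta, p, sigma=1.0):
    """Inner minimum of the maximin problem: the hardest alternative to
    rule out. An optimal allocation maximizes this value over p."""
    best = int(np.argmax(theta))
    return min(gaussian_pair_rate(theta, p, best, j, sigma)
               for j in range(len(theta)) if j != best)


def ids_leader_prob(p, leader, challenger):
    """Probability of sampling the leader rather than the challenger
    (assumed equal-variance Gaussian special case)."""
    return p[challenger] / (p[leader] + p[challenger])


def ttts_ids(true_means, budget, sigma=1.0, seed=0, max_resample=1000):
    """Simulate top-two Thompson sampling with the selection rule above."""
    rng = np.random.default_rng(seed)
    true_means = np.asarray(true_means, dtype=float)
    k = len(true_means)
    counts = np.ones(k)
    sums = rng.normal(true_means, sigma)  # one initial pull per arm
    for _step in range(budget - k):
        post_mean = sums / counts
        post_std = sigma / np.sqrt(counts)
        # Leader: best arm under one posterior sample.
        leader = int(np.argmax(rng.normal(post_mean, post_std)))
        # Challenger: resample until a different arm looks best.
        challenger = leader
        for _try in range(max_resample):
            challenger = int(np.argmax(rng.normal(post_mean, post_std)))
            if challenger != leader:
                break
        if challenger == leader:
            # Fallback (an assumption of this sketch): pick another arm uniformly.
            challenger = int(rng.choice([a for a in range(k) if a != leader]))
        p = counts / counts.sum()  # empirical allocation so far
        if rng.random() < ids_leader_prob(p, leader, challenger):
            arm = leader
        else:
            arm = challenger
        sums[arm] += rng.normal(true_means[arm], sigma)
        counts[arm] += 1
    return counts.astype(int), sums / counts
```

Calling `ttts_ids([1.0, 0.5, 0.0], budget=500)` returns the per-arm sample counts and sample means. Because the selection probability is computed from the current allocation, the sketch has no tuning parameter in place of the fixed leader probability used by classical top-two rules, which is the sense in which the abstract calls the rule hyperparameter-free.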
