MLLGJan 22, 2025

Fixed-Budget Change Point Identification in Piecewise Constant Bandits

arXiv:2501.12957v12 citationsh-index: 9AISTATS
Originality Incremental advance
AI Analysis

This addresses a specific challenge in bandit algorithms for change point detection, offering incremental improvements in theoretical guarantees and practical performance.

The paper tackles the problem of locating a change point in piecewise constant bandits with a fixed exploration budget, providing the first non-asymptotic analysis and developing algorithms with near-optimal error bounds for both small and large budget regimes, including a regime-adaptive algorithm validated by experiments.

We study the piecewise constant bandit problem where the expected reward is a piecewise constant function with one change point (discontinuity) across the action space $[0,1]$ and the learner's aim is to locate the change point. Under the assumption of a fixed exploration budget, we provide the first non-asymptotic analysis of policies designed to locate abrupt changes in the mean reward function under bandit feedback. We study the problem under a large and small budget regime, and for both settings establish lower bounds on the error probability and provide algorithms with near matching upper bounds. Interestingly, our results show a separation in the complexity of the two regimes. We then propose a regime adaptive algorithm which is near optimal for both small and large budgets simultaneously. We complement our theoretical analysis with experimental results in simulated environments to support our findings.

Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.

Your Notes