LGDSITMLDec 9, 2020

Streaming Algorithms for Stochastic Multi-armed Bandits

arXiv:2012.05142v15 citations
AI Analysis

This work provides theoretical bounds and algorithms for streaming multi-armed bandit problems under memory constraints, which is relevant for resource-constrained decision-making systems.

This paper addresses the Stochastic Multi-armed Bandit problem with bounded arm-memory, where arms arrive in a stream and only memory-resident arms can be pulled. For regret minimization, they establish an almost tight hardness of Ω(T^{2/3}) cumulative regret for an arm-memory size of (n-1). For best-arm identification, they propose an O(r) arm-memory r-round adaptive streaming algorithm that matches the lower bound on sample complexity, and a heuristic achieving optimal sample complexity by storing one extra arm.

We study the Stochastic Multi-armed Bandit problem under bounded arm-memory. In this setting, the arms arrive in a stream, and the number of arms that can be stored in the memory at any time, is bounded. The decision-maker can only pull arms that are present in the memory. We address the problem from the perspective of two standard objectives: 1) regret minimization, and 2) best-arm identification. For regret minimization, we settle an important open question by showing an almost tight hardness. We show Ω(T^{2/3}) cumulative regret in expectation for arm-memory size of (n-1), where n is the number of arms. For best-arm identification, we study two algorithms. First, we present an O(r) arm-memory r-round adaptive streaming algorithm to find an ε-best arm. In r-round adaptive streaming algorithm for best-arm identification, the arm pulls in each round are decided based on the observed outcomes in the earlier rounds. The best-arm is the output at the end of r rounds. The upper bound on the sample complexity of our algorithm matches with the lower bound for any r-round adaptive streaming algorithm. Secondly, we present a heuristic to find the ε-best arm with optimal sample complexity, by storing only one extra arm in the memory.

Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.

Your Notes