LG IT SY OCMar 2, 2022

Linear Stochastic Bandits over a Bit-Constrained Channel

Aritra Mitra, Hamed Hassani, George J. Pappas

arXiv:2203.01198v111.19 citationsh-index: 98

Originality Highly original

AI Analysis

This addresses the challenge of large-scale distributed learning with stringent communication constraints, representing a significant first step in statistical decision-making over finite-capacity channels.

The paper tackles the problem of sequential decision-making under communication constraints by introducing a linear stochastic bandit formulation over a bit-constrained channel, proving that a channel capacity of O(d) bits suffices to achieve order-optimal regret for d-dimensional models and 1 bit for unstructured multi-armed bandits.

One of the primary challenges in large-scale distributed learning stems from stringent communication constraints. While several recent works address this challenge for static optimization problems, sequential decision-making under uncertainty has remained much less explored in this regard. Motivated by this gap, we introduce a new linear stochastic bandit formulation over a bit-constrained channel. Specifically, in our setup, an agent interacting with an environment transmits encoded estimates of an unknown model parameter to a server over a communication channel of finite capacity. The goal of the server is to take actions based on these estimates to minimize cumulative regret. To this end, we develop a novel and general algorithmic framework that hinges on two main components: (i) an adaptive encoding mechanism that exploits statistical concentration bounds, and (ii) a decision-making principle based on confidence sets that account for encoding errors. As our main result, we prove that when the unknown model is $d$-dimensional, a channel capacity of $O(d)$ bits suffices to achieve order-optimal regret. To demonstrate the generality of our approach, we then show that the same result continues to hold for non-linear observation models satisfying standard regularity conditions. Finally, we establish that for the simpler unstructured multi-armed bandit problem, $1$ bit channel-capacity is sufficient for achieving optimal regret bounds. Overall, our work takes a significant first step towards paving the way for statistical decision-making over finite-capacity channels.

View on arXiv PDF

Similar