Best-Arm Identification with Noisy Actuation
The work addresses a communication bottleneck in distributed bandit systems, but the contribution appears incremental because it builds on existing MAB and channel-capacity frameworks.
The paper tackles the problem of identifying the best arm in a multi-armed bandit when commands are transmitted over a noisy channel, and it provides communication schemes whose analysis relates to the zero-error capacity of the channel.
In this paper, we consider a multi-armed bandit (MAB) instance and study how to identify the best arm when arm commands are conveyed from a central learner to a distributed agent over a discrete memoryless channel (DMC). Depending on the agent's capabilities, we provide communication schemes along with their analysis, which, interestingly, relates to the zero-error capacity of the underlying DMC.
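To make the zero-error notion concrete, the following minimal sketch computes the largest set of channel inputs that can never be confused with one another in a single use of a DMC. It is an illustration only, not one of the paper's communication schemes. The helper names (`confusable`, `max_zero_error_code`, `pentagon_channel`) and the example channel are assumptions introduced here. Two inputs are treated as confusable when some output has positive probability under both. The size of the resulting set gives only a single-use lower bound: the zero-error capacity itself is defined as a limit over block lengths and can be strictly larger, as it is for the pentagon channel used below.

```python
from itertools import combinations


def confusable(P, x1, x2):
    """Return True if inputs x1 and x2 can produce a common output.

    P is a row-stochastic transition matrix: P[x][y] = Pr(output y | input x).
    """
    return any(p1 > 0 and p2 > 0 for p1, p2 in zip(P[x1], P[x2]))


def max_zero_error_code(P):
    """Brute-force the largest set of pairwise non-confusable inputs.

    Exponential in the input alphabet size, so only suitable for tiny channels.
    """
    n = len(P)
    for size in range(n, 0, -1):
        for subset in combinations(range(n), size):
            if all(not confusable(P, a, b) for a, b in combinations(subset, 2)):
                return list(subset)
    return []


def pentagon_channel():
    """Hypothetical example: input i is received as i or (i + 1) mod 5."""
    P = [[0.0] * 5 for _ in range(5)]
    for i in range(5):
        P[i][i] = 0.5
        P[i][(i + 1) % 5] = 0.5
    return P


if __name__ == "__main__":
    code = max_zero_error_code(pentagon_channel())
    # With a single channel use, only len(code) arm indices can be sent
    # without any chance of the agent pulling the wrong arm.
    print(code, len(code))
```

Under these assumptions, a learner restricted to one channel use per command could address at most `len(code)` arms with no actuation errors. Addressing more arms reliably would require longer codewords or a scheme that tolerates occasional wrong pulls.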