LGDCSIMLJul 2, 2020

Multi-Agent Low-Dimensional Linear Bandits

arXiv:2007.01442v411 citations
AI Analysis

This work addresses the challenge of efficient collaboration in multi-agent bandit systems, which is incremental as it builds on existing linear bandit methods by incorporating decentralized communication.

The paper tackles the problem of multi-agent stochastic linear bandits with side information, where agents collaborate via communication to reduce regret by searching for an optimal low-dimensional subspace, resulting in significantly smaller per-agent regret compared to non-communicating agents.

We study a multi-agent stochastic linear bandit with side information, parameterized by an unknown vector $θ^* \in \mathbb{R}^d$. The side information consists of a finite collection of low-dimensional subspaces, one of which contains $θ^*$. In our setting, agents can collaborate to reduce regret by sending recommendations across a communication graph connecting them. We present a novel decentralized algorithm, where agents communicate subspace indices with each other and each agent plays a projected variant of LinUCB on the corresponding (low-dimensional) subspace. By distributing the search for the optimal subspace across users and learning of the unknown vector by each agent in the corresponding low-dimensional subspace, we show that the per-agent finite-time regret is much smaller than the case when agents do not communicate. We finally complement these results through simulations.

Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.

Your Notes