LG DC MA SI ML · May 30, 2023

Collaborative Multi-Agent Heterogeneous Multi-Armed Bandits

arXiv:2305.18784v2 · 7 citations
AI Analysis

This work addresses coordination challenges in multi-agent systems for applications like distributed sensing or recommendation, though it appears incremental as it extends existing collaborative bandit frameworks to heterogeneous settings.

The paper tackles the problem of collaborative multi-agent learning across heterogeneous bandits, developing decentralized algorithms that achieve near-optimal group regret bounds.

The study of collaborative multi-agent bandits has attracted significant attention recently. In light of this, we initiate the study of a new collaborative setting, consisting of $N$ agents such that each agent is learning one of $M$ stochastic multi-armed bandits to minimize their group cumulative regret. We develop decentralized algorithms which facilitate collaboration between the agents under two scenarios. We characterize the performance of these algorithms by deriving the per agent cumulative regret and group regret upper bounds. We also prove lower bounds for the group regret in this setting, which demonstrates the near-optimal behavior of the proposed algorithms.
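To make the setting concrete, the sketch below simulates $N$ agents, each assigned to one of $M$ stochastic bandits, and sums their per-agent cumulative regrets into a group regret. It is only a no-collaboration baseline and not the paper's algorithm: each agent runs standard UCB1 independently. The Bernoulli rewards, the arm means, the agent-to-bandit assignment, the function name `run_independent_ucb`, and the reading of group regret as the sum of per-agent pseudo-regrets are all assumptions made for illustration.

```python
import math
import random


def run_independent_ucb(means, assignment, horizon, seed=0):
    """Simulate agents that each run UCB1 on their assigned Bernoulli bandit.

    means[m][k]   : mean reward of arm k in bandit m (hypothetical values)
    assignment[i] : index of the bandit that agent i is learning
    Returns the per-agent cumulative pseudo-regrets and their sum,
    which is used here as the group regret (an assumed definition).
    """
    rng = random.Random(seed)
    per_agent_regret = []
    for bandit in assignment:
        arm_means = means[bandit]
        best = max(arm_means)
        counts = [0] * len(arm_means)   # number of pulls per arm
        sums = [0.0] * len(arm_means)   # total observed reward per arm
        regret = 0.0
        for t in range(1, horizon + 1):
            if t <= len(arm_means):
                arm = t - 1  # pull every arm once to initialise the estimates
            else:
                # UCB1 index: empirical mean plus exploration bonus
                arm = max(
                    range(len(arm_means)),
                    key=lambda k: sums[k] / counts[k]
                    + math.sqrt(2 * math.log(t) / counts[k]),
                )
            reward = 1.0 if rng.random() < arm_means[arm] else 0.0
            counts[arm] += 1
            sums[arm] += reward
            regret += best - arm_means[arm]  # pseudo-regret of this pull
        per_agent_regret.append(regret)
    return per_agent_regret, sum(per_agent_regret)


# Hypothetical instance: M = 2 bandits with 3 arms each, N = 4 agents.
means = [[0.9, 0.5, 0.4], [0.2, 0.8, 0.3]]
assignment = [0, 0, 1, 1]
per_agent, group = run_independent_ucb(means, assignment, horizon=2000)
print("per-agent regret:", per_agent)
print("group regret:", group)
```

Agents that share a bandit repeat each other's exploration in this baseline, which is the redundancy that collaboration between agents is meant to reduce.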

