LG · May 13

Collaborating in Multi-Armed Bandits with Strategic Agents

arXiv:2605.13145
AI Analysis

For multi-agent bandit systems with persistent strategic agents, this work shows that collaboration can be sustained through information sharing alone, addressing a key incentive problem.

The paper studies collaborative learning in multi-agent Bayesian bandits where strategic agents may free-ride. It proposes CAOS, a mechanism that sustains collaboration as a Nash equilibrium while achieving regret close to that of fully cooperative systems.

We study collaborative learning in multi-agent Bayesian bandit problems, where strategic agents collectively solve the same bandit instance. While multiple agents can accelerate learning by sharing information, strategic agents might prefer to free-ride and avoid exploration. We consider a setting with persistent agents that participate in multiple time periods. This is in contrast to most previous works on incentives in multi-agent MAB, which assume short-lived agents: each agent makes a single decision and optimizes its expected reward for that decision alone. As in the multi-agent MAB model with incentives, our model has no monetary transfers, and the only incentives come from information sharing. We propose CAOS, a mechanism that sustains collaboration as a Nash equilibrium while achieving strong regret guarantees. Our results demonstrate that collaborative exploration can be sustained purely through information sharing, achieving performance close to that of fully cooperative systems despite strategic behavior.
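To make the free-riding tension concrete, here is a minimal sketch of several agents sharing one Bayesian posterior over the same Bernoulli bandit. It is not CAOS, whose details the abstract does not give. Everything in it is an assumption made for illustration: the function name run_shared_bandit, the Beta(1, 1) priors, Thompson sampling as the exploring behavior, and greedy play on the shared posterior mean as the free-riding behavior.

```python
import random


def run_shared_bandit(means, n_agents, horizon, free_riders=0, seed=0):
    """Simulate agents that share one Beta posterior over Bernoulli arms.

    Hypothetical setup for illustration only (not the paper's mechanism):
    the first `free_riders` agents play greedily on the shared posterior
    mean, and the remaining agents explore via Thompson sampling.
    Returns the cumulative regret of each agent.
    """
    rng = random.Random(seed)
    k = len(means)
    alpha = [1] * k  # shared posterior: 1 + observed successes per arm
    beta = [1] * k   # shared posterior: 1 + observed failures per arm
    best = max(means)
    regret = [0.0] * n_agents

    for _ in range(horizon):
        for agent in range(n_agents):
            if agent < free_riders:
                # Free rider: exploit what others have learned, no deliberate exploration.
                arm = max(range(k), key=lambda a: alpha[a] / (alpha[a] + beta[a]))
            else:
                # Explorer: sample each arm's posterior and play the largest draw.
                arm = max(range(k), key=lambda a: rng.betavariate(alpha[a], beta[a]))
            reward = 1 if rng.random() < means[arm] else 0
            # Every observation is shared, so all agents see the same posterior.
            alpha[arm] += reward
            beta[arm] += 1 - reward
            regret[agent] += best - means[arm]
    return regret


if __name__ == "__main__":
    # Compare an all-explorer group with a group containing one free rider.
    print(run_shared_bandit([0.2, 0.5, 0.8], n_agents=3, horizon=500))
    print(run_shared_bandit([0.2, 0.5, 0.8], n_agents=3, horizon=500, free_riders=1))
```

Varying free_riders shows how exploration costs shift between agents when all information is pooled. The sketch has no incentive mechanism, so it only illustrates the problem the paper addresses, not its solution.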
