OCLGMay 21, 2019

Heterogeneous Stochastic Interactions for Multiple Agents in a Multi-armed Bandit Problem

arXiv:1905.08731v135 citations
Originality Incremental advance
AI Analysis

This work addresses coordination and learning in networked multi-agent systems, offering incremental improvements by incorporating sociability and stochastic interactions into bandit algorithms.

The paper tackles a multi-agent multi-armed bandit problem where agents observe neighbors' choices and rewards through stochastic, heterogeneous network interactions, and it designs an algorithm with proven performance bounds that predict agent rankings, verified analytically and computationally.

We define and analyze a multi-agent multi-armed bandit problem in which decision-making agents can observe the choices and rewards of their neighbors. Neighbors are defined by a network graph with heterogeneous and stochastic interconnections. These interactions are determined by the sociability of each agent, which corresponds to the probability that the agent observes its neighbors. We design an algorithm for each agent to maximize its own expected cumulative reward and prove performance bounds that depend on the sociability of the agents and the network structure. We use the bounds to predict the rank ordering of agents according to their performance and verify the accuracy analytically and computationally.

Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.

Your Notes