OC · AI · LG · MA · SY · Jun 11, 2020

Scalable Multi-Agent Reinforcement Learning for Networked Systems with Average Reward

arXiv:2006.06626v1 · 90 citations
Originality: Highly original
AI Analysis

This work addresses a known scalability bottleneck for researchers and practitioners applying MARL to large networked systems. It proposes a new method rather than an incremental refinement.

The paper tackles the scalability problem in multi-agent reinforcement learning (MARL) for networked systems by proposing a Scalable Actor-Critic (SAC) method. The method learns a near-optimal localized policy with complexity that scales with the state-action space size of local neighborhoods rather than of the entire network. It does so by exploiting an exponential decay property: the effect of agents on each other decays exponentially fast in their graph distance.

It has long been recognized that multi-agent reinforcement learning (MARL) faces significant scalability issues because the sizes of the state and action spaces are exponentially large in the number of agents. In this paper, we identify a rich class of networked MARL problems where the model exhibits a local dependence structure that allows it to be solved in a scalable manner. Specifically, we propose a Scalable Actor-Critic (SAC) method that can learn a near-optimal localized policy for optimizing the average reward, with complexity scaling with the state-action space size of local neighborhoods as opposed to the entire network. Our result centers around identifying and exploiting an exponential decay property that ensures the effect of agents on each other decays exponentially fast in their graph distance.
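
As a rough illustration of the scaling claim in the abstract, the following Python sketch compares the joint state-action space size of a whole network with that of one agent's local neighborhood. It is not the paper's algorithm. The ring topology, the neighborhood radius `k`, the two-state and two-action local spaces, and all function names are assumptions chosen only for this example.

```python
from collections import deque


def k_hop_neighborhood(adj, source, k):
    """Return the set of nodes within graph distance k of source (plain BFS)."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        if dist[node] == k:
            continue  # do not expand past radius k
        for nbr in adj[node]:
            if nbr not in dist:
                dist[nbr] = dist[node] + 1
                queue.append(nbr)
    return set(dist)


def ring_graph(n):
    """Hypothetical topology: n agents on a ring, each linked to two neighbors."""
    return {i: [(i - 1) % n, (i + 1) % n] for i in range(n)}


def joint_space_size(num_agents, states_per_agent, actions_per_agent):
    """Size of the joint state-action space for num_agents identical agents."""
    return (states_per_agent * actions_per_agent) ** num_agents


# Assumed toy setting: 20 agents, radius 2, binary local state and action.
n, k = 20, 2
adj = ring_graph(n)
local = k_hop_neighborhood(adj, 0, k)

global_size = joint_space_size(n, 2, 2)          # grows exponentially in n
local_size = joint_space_size(len(local), 2, 2)  # depends only on the neighborhood

print(f"agents in neighborhood: {len(local)}")
print(f"global joint space:     {global_size}")
print(f"local joint space:      {local_size}")
```

In this toy setting the local size stays fixed as `n` grows, while the global size is multiplied by four for every added agent. The exponential decay property described above is what would justify working with only the local quantities. The sketch itself does not demonstrate that property.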
