AI, LG, MA, SY, ML
Sep 22, 2021

Locality Matters: A Scalable Value Decomposition Approach for Cooperative Multi-Agent Reinforcement Learning

arXiv:2109.10632v1
11 citations
Originality: Incremental advance
AI Analysis

The paper addresses scalability problems faced by researchers and practitioners in multi-agent systems, though the advance is incremental because it builds on existing paradigms.

The paper tackles scalability issues in cooperative multi-agent reinforcement learning by exploiting locality structures, proposing the LOMAQ algorithm with local rewards, and shows it improves performance and convergence speed compared to other methods.

Cooperative multi-agent reinforcement learning (MARL) faces significant scalability issues due to state and action spaces that are exponentially large in the number of agents. As environments grow in size, effective credit assignment becomes increasingly harder and often results in infeasible learning times. Still, in many real-world settings, there exist simplified underlying dynamics that can be leveraged for more scalable solutions. In this work, we exploit such locality structures effectively whilst maintaining global cooperation. We propose a novel, value-based multi-agent algorithm called LOMAQ, which incorporates local rewards in the Centralized Training Decentralized Execution paradigm. Additionally, we provide a direct reward decomposition method for finding these local rewards when only a global signal is provided. We test our method empirically, showing it scales well compared to other methods, significantly improving performance and convergence speed.
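The scalability claim in the abstract can be made concrete with a little arithmetic. The sketch below is illustrative only and is not taken from the paper: it compares the size of the joint action space with the total size of per-agent local tables, under the assumption that each agent only needs to consider a fixed-size neighborhood. The function names and the `neighborhood_size` parameter are hypothetical.

```python
def joint_action_count(num_agents: int, actions_per_agent: int) -> int:
    """Size of the joint action space: one action choice per agent."""
    return actions_per_agent ** num_agents


def local_action_count(
    num_agents: int, actions_per_agent: int, neighborhood_size: int
) -> int:
    """Total entries when each agent models only its own neighborhood.

    Assumption for illustration (not from the paper): every agent has a
    neighborhood of the same size, itself included, capped at num_agents.
    """
    m = min(neighborhood_size, num_agents)
    return num_agents * actions_per_agent ** m


if __name__ == "__main__":
    # Joint space grows exponentially in the number of agents, while the
    # local count grows linearly once the neighborhood size is fixed.
    for n in (2, 5, 10, 20):
        print(n, joint_action_count(n, 5), local_action_count(n, 5, 3))
```

With 5 actions per agent and a neighborhood of 3, ten agents give 9,765,625 joint actions but only 1,250 local entries, which is the kind of gap a locality-aware method can hope to exploit.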

