MA AI Sep 10, 2019

Signal Instructed Coordination in Cooperative Multi-agent Reinforcement Learning

Liheng Chen, Hongyi Guo, Yali Du, Fei Fang, Haifeng Zhang, Yaoming Zhu, Ming Zhou, Weinan Zhang, Qing Wang, Yong Yu

arXiv:1909.04224v26.610 citations

Originality: Incremental advance

AI Analysis

This paper addresses the coordination problem in cooperative multi-agent systems, with applications such as robotics and gaming. The advance is incremental because it builds on existing MARL frameworks.

The paper tackles a limitation of decentralized execution in multi-agent reinforcement learning by introducing a coordination signal that enables better collaboration. It shows that the proposed Signal Instructed Coordination (SIC) module consistently improves performance over existing models in matrix games and a predator-prey game.

In many real-world problems, a team of agents needs to collaborate to maximize a common reward. Although existing works formulate this problem in a centralized learning with decentralized execution framework, which avoids the non-stationarity problem in training, their decentralized execution paradigm limits the agents' capability to coordinate. Inspired by the concept of correlated equilibrium, we propose to introduce a coordination signal to address this limitation, and theoretically show that, under mild conditions, decentralized agents with the coordination signal can coordinate their individual policies as if manipulated by a centralized controller. The idea of introducing a coordination signal is to encapsulate coordinated strategies into the signals, and to use the signals to instruct the collaboration in decentralized execution. To encourage agents to learn to exploit the coordination signal, we propose Signal Instructed Coordination (SIC), a novel coordination module that can be integrated with most existing MARL frameworks. SIC casts a common signal sampled from a pre-defined distribution to all agents, and introduces an information-theoretic regularization to facilitate consistency between the observed signal and the agents' policies. Our experiments show that SIC consistently improves performance over well-recognized MARL models in both matrix games and a predator-prey game with a high-dimensional strategy space.
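
The intuition behind a shared coordination signal can be illustrated with a toy example. The following is a minimal sketch, not the paper's SIC module: it assumes a hypothetical 2x2 matrix game in which the team is rewarded only when both agents choose the same action, and it hard-codes the mapping from signal to action instead of learning it with the information-theoretic regularization. All names (`PAYOFF`, `independent_policy`, `signal_policy`, `average_reward`) are illustrative assumptions.

```python
import random

# Hypothetical 2x2 coordination game (an assumption for illustration):
# the team earns reward 1 only when both agents pick the same action.
PAYOFF = [[1.0, 0.0],
          [0.0, 1.0]]


def independent_policy(rng):
    """Decentralized agent without a signal: samples its own action uniformly."""
    return rng.randrange(2)


def signal_policy(signal):
    """Decentralized agent with a signal: maps the shared signal to an action.

    The mapping is fixed by hand here; in the paper it would be learned.
    """
    return signal


def average_reward(use_signal, episodes=10_000, seed=0):
    """Estimate the team's average reward over many one-shot episodes."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(episodes):
        if use_signal:
            # One common signal, sampled from a pre-defined (uniform)
            # distribution, is observed by both agents.
            z = rng.randrange(2)
            a1, a2 = signal_policy(z), signal_policy(z)
        else:
            a1, a2 = independent_policy(rng), independent_policy(rng)
        total += PAYOFF[a1][a2]
    return total / episodes


if __name__ == "__main__":
    print("independent agents:", average_reward(use_signal=False))
    print("shared signal:     ", average_reward(use_signal=True))
```

In this sketch each agent's marginal action distribution is uniform in both settings. Only the shared signal correlates the two choices, so the signal-conditioned agents always match, while independent agents match about half the time.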
