AI MA, Nov 16, 2019

Learning Efficient Multi-agent Communication: An Information Bottleneck Approach

Rundong Wang, Xu He, Runsheng Yu, Wei Qiu, Bo An, Zinovi Rabinovich

arXiv:1911.06992v226.9137 citations

Originality: Incremental advance

AI Analysis

The paper addresses the challenge of optimizing communication in multi-agent systems under bandwidth constraints. The advance is incremental because it builds on existing information bottleneck principles.

The paper tackles the problem of limited-bandwidth communication in multi-agent reinforcement learning by developing an Informative Multi-Agent Communication (IMAC) method that learns efficient communication protocols and scheduling, resulting in faster convergence and more efficient communication compared to baseline methods in various tasks.

We consider the problem of limited-bandwidth communication for multi-agent reinforcement learning, where agents cooperate with the assistance of a communication protocol and a scheduler. The protocol and scheduler jointly determine which agent communicates what message and to whom. Under the limited bandwidth constraint, a communication protocol is required to generate informative messages. Meanwhile, an unnecessary communication connection should not be established because it occupies limited resources in vain. In this paper, we develop an Informative Multi-Agent Communication (IMAC) method to learn efficient communication protocols as well as scheduling. First, from the perspective of communication theory, we prove that the limited bandwidth constraint requires low-entropy messages throughout the transmission. Then, inspired by the information bottleneck principle, we learn a valuable and compact communication protocol and a weight-based scheduler. To demonstrate the efficiency of our method, we conduct extensive experiments in various cooperative and competitive multi-agent tasks with different numbers of agents and different bandwidths. We show that IMAC converges faster and leads to more efficient communication among agents under limited bandwidth than many baseline methods.
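The link between low-entropy messages and an information-bottleneck-style penalty can be illustrated with a small sketch. It is not taken from the paper: it assumes, purely for illustration, that each message is drawn from a diagonal Gaussian and that compactness is encouraged by penalizing the KL divergence to a fixed zero-mean Gaussian prior, as in variational information bottleneck methods. The function names and the `beta` and `prior_sigma` parameters are hypothetical.

```python
import math

def gaussian_entropy_bits(sigma):
    """Differential entropy, in bits, of a 1-D Gaussian with standard deviation sigma."""
    return 0.5 * math.log2(2 * math.pi * math.e * sigma ** 2)

def kl_to_prior(mu, sigma, prior_sigma=1.0):
    """KL( N(mu, sigma^2) || N(0, prior_sigma^2) ) in nats, summed over message dimensions.

    Assumption: the message distribution and the prior are diagonal Gaussians.
    """
    total = 0.0
    for m, s in zip(mu, sigma):
        total += (math.log(prior_sigma / s)
                  + (s ** 2 + m ** 2) / (2 * prior_sigma ** 2)
                  - 0.5)
    return total

def regularized_loss(task_loss, mu, sigma, beta=0.01, prior_sigma=1.0):
    """Task loss plus a weighted compactness penalty (hypothetical objective).

    beta trades task performance against message compactness.
    """
    return task_loss + beta * kl_to_prior(mu, sigma, prior_sigma)

if __name__ == "__main__":
    # Halving the standard deviation removes exactly one bit of differential entropy.
    print(gaussian_entropy_bits(2.0) - gaussian_entropy_bits(1.0))
    # A message distribution equal to the prior incurs no penalty.
    print(kl_to_prior([0.0, 0.0], [1.0, 1.0]))
    # A message whose mean is shifted away from the prior is penalized.
    print(regularized_loss(1.0, [1.0], [1.0], beta=0.1))
```

In this sketch a smaller `prior_sigma` corresponds to a lower-entropy prior, so it stands in for a tighter bandwidth budget. How IMAC actually parameterizes messages and the scheduler is described in the paper itself.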

