Scalable Communication for Multi-Agent Reinforcement Learning via Transformer-Based Email Mechanism
This work addresses scalability issues in multi-agent systems and is relevant to researchers and practitioners, though the contribution is incremental: it builds on existing communication methods and adds a novel forwarding mechanism.
The paper tackles the scalability problem of communication in multi-agent reinforcement learning for partially-observed tasks. It proposes a Transformer-based Email Mechanism (TEM), in which agents communicate locally and forward messages to improve cooperation without modeling all agents. TEM outperforms baselines on benchmarks and maintains its performance when the number of agents varies, without retraining.
Communication can substantially improve cooperation in multi-agent reinforcement learning (MARL), especially for partially-observed tasks. However, existing works either broadcast messages, leading to information redundancy, or learn targeted communication by modeling all the other agents as targets, which is not scalable when the number of agents varies. In this work, to tackle the scalability problem of MARL communication for partially-observed tasks, we propose a novel framework, the Transformer-based Email Mechanism (TEM). Agents adopt local communication, sending messages only to the agents they can observe, without modeling all the agents. Inspired by human cooperation through email forwarding, we design message chains that forward information so that agents can cooperate with those outside their observation range. We introduce a Transformer to encode and decode the message chain and selectively choose the next receiver. Empirically, TEM outperforms the baselines on multiple cooperative MARL benchmarks. When the number of agents varies, TEM maintains superior performance without further training.
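
To make the forwarding idea concrete, the following is a minimal sketch of a message chain built from local communication only. It is not the paper's implementation: the function names (`visible_neighbors`, `choose_receiver`, `forward_message`), the 2D positions, the `obs_range` and `max_hops` parameters, and the rule that an agent already in the chain is skipped are all assumptions made for illustration. In particular, `score_fn` is a hypothetical stand-in for the learned Transformer that encodes the chain and decodes the next receiver.

```python
import math


def visible_neighbors(positions, sender, obs_range):
    """Indices of agents within obs_range of the sender (sender excluded)."""
    sx, sy = positions[sender]
    return [
        i for i, (x, y) in enumerate(positions)
        if i != sender and math.hypot(x - sx, y - sy) <= obs_range
    ]


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def choose_receiver(chain, candidates, score_fn):
    """Greedily pick the next receiver, or None to end the chain.

    score_fn(chain, candidate) is a hypothetical placeholder for the
    learned Transformer decoder; agents already in the chain are skipped
    (an assumption of this sketch).
    """
    fresh = [c for c in candidates if c not in chain]
    if not fresh:
        return None
    probs = softmax([score_fn(chain, c) for c in fresh])
    return fresh[probs.index(max(probs))]


def forward_message(positions, start, obs_range, score_fn, max_hops=8):
    """Forward a message hop by hop; each hop only considers agents
    that the current holder can observe. At most max_hops hops."""
    chain = [start]
    while len(chain) <= max_hops:
        candidates = visible_neighbors(positions, chain[-1], obs_range)
        nxt = choose_receiver(chain, candidates, score_fn)
        if nxt is None:
            break
        chain.append(nxt)
    return chain


# Four agents on a line; each one observes only its immediate neighbors.
positions = [(0, 0), (1, 0), (2, 0), (3, 0)]
# Toy scoring rule (assumption): prefer the candidate with the larger index.
chain = forward_message(positions, start=0, obs_range=1.0,
                        score_fn=lambda chain, c: float(c))
print(chain)  # [0, 1, 2, 3]
```

In this toy run, agent 0 cannot observe agent 3, yet the message reaches it through two intermediate forwards. Because each hop scores only the current holder's visible neighbors, the selection step does not depend on the total number of agents, which is the property the abstract relies on for scalability.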