MACTAS: Self-Attention-Based Module for Inter-Agent Communication in Multi-Agent Reinforcement Learning
This work addresses inter-agent communication in MARL for tasks that require collective execution. The contribution is incremental in that it builds on existing action-value function decomposition methods.
The paper tackles the problem of complex, non-differentiable communication protocols in multi-agent reinforcement learning. It introduces a fully differentiable self-attention-based module that integrates with existing action-value function decomposition methods and achieves state-of-the-art performance on a number of SMAC and SMACv2 maps.
Communication is essential for the collective execution of complex tasks by human agents, motivating interest in communication mechanisms for multi-agent reinforcement learning (MARL). However, existing communication protocols in MARL are often complex and non-differentiable. In this work, we introduce a self-attention-based communication module that exchanges information between the agents in MARL. Our proposed approach is fully differentiable, allowing agents to learn to generate messages in a reward-driven manner. The module can be seamlessly integrated with any action-value function decomposition method and can be viewed as an extension of such decompositions. Notably, it includes a fixed number of trainable parameters, independent of the number of agents. Experimental results on the SMAC and SMACv2 benchmarks demonstrate the effectiveness of our approach, which achieves state-of-the-art performance on a number of maps.
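As a rough illustration of why the number of trainable parameters can be independent of the number of agents, the following minimal sketch applies single-head scaled dot-product self-attention across per-agent hidden vectors. It is not the paper's implementation: the function names (`init_params`, `communicate`, `count_params`), the single attention head, the hidden size of 8, and the random inputs are all assumptions made for illustration, and the sketch omits training and the downstream decomposition method entirely.

```python
import math
import random


def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def softmax(row):
    """Numerically stable softmax over a list of floats."""
    m = max(row)
    exps = [math.exp(v - m) for v in row]
    total = sum(exps)
    return [e / total for e in exps]


def init_params(d_model, seed=0):
    """Create query/key/value projections (hypothetical layout).

    Each matrix is d_model x d_model, so the parameter count depends
    only on the hidden size, never on the number of agents.
    """
    rng = random.Random(seed)
    scale = 1.0 / math.sqrt(d_model)

    def mat():
        return [[rng.gauss(0.0, scale) for _ in range(d_model)] for _ in range(d_model)]

    return {"W_q": mat(), "W_k": mat(), "W_v": mat()}


def communicate(hidden, params):
    """Exchange information between agents with self-attention.

    hidden: n_agents x d_model list of per-agent hidden vectors.
    Returns (messages, weights): one d_model message per agent and the
    n_agents x n_agents attention weights.
    """
    d_model = len(hidden[0])
    q = matmul(hidden, params["W_q"])
    k = matmul(hidden, params["W_k"])
    v = matmul(hidden, params["W_v"])
    k_t = [list(col) for col in zip(*k)]  # transpose of k
    scores = matmul(q, k_t)
    weights = [softmax([s / math.sqrt(d_model) for s in row]) for row in scores]
    messages = matmul(weights, v)
    return messages, weights


def count_params(params):
    """Total number of scalar parameters in the module."""
    return sum(len(w) * len(w[0]) for w in params.values())


# The same parameters serve teams of different sizes.
params = init_params(d_model=8)
for n_agents in (3, 10):
    rng = random.Random(n_agents)
    hidden = [[rng.uniform(-1.0, 1.0) for _ in range(8)] for _ in range(n_agents)]
    messages, weights = communicate(hidden, params)
    print(n_agents, len(messages), len(messages[0]), count_params(params))
```

In this sketch the parameter count stays at 3 x 8 x 8 = 192 whether the team has 3 or 10 agents, because the projections act on each agent's vector separately and only the attention weights grow with team size. Every operation is differentiable, so in an autodiff framework the projections could in principle be trained end to end from the reward signal, which is the property the abstract emphasizes.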