LINDA: Multi-Agent Local Information Decomposition for Awareness of Teammates
LINDA addresses inefficient collaboration in multi-agent systems under partial observability, though it is an incremental improvement over existing MARL methods.
The paper tackles the problem of partial observability in cooperative multi-agent reinforcement learning by proposing LINDA, a framework that enables agents to decompose local information to build awareness of teammates, resulting in significant performance improvements on challenging tasks.
In cooperative multi-agent reinforcement learning (MARL), where agents only have access to partial observations, efficiently leveraging local information is critical. Over long observation horizons, agents can build \textit{awareness} of their teammates to alleviate the problem of partial observability. However, previous MARL methods usually neglect this use of local information. To address this problem, we propose a novel framework, multi-agent \textit{Local INformation Decomposition for Awareness of teammates} (LINDA), with which agents learn to decompose local information and build awareness of each teammate. We model the awareness as random variables and perform representation learning to ensure the informativeness of the awareness representations by maximizing the mutual information between the awareness and the actual trajectory of the corresponding agent. LINDA is agnostic to specific algorithms and can be flexibly integrated into different MARL methods. Extensive experiments show that the proposed framework learns informative awareness from local partial observations for better collaboration and significantly improves learning performance, especially on challenging tasks.
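The mutual information objective described above can be illustrated with a small sketch. This is not the authors' implementation: it assumes, for illustration only, that each awareness variable is a diagonal Gaussian produced from an agent's local trajectory, and that mutual information is maximized through a standard variational lower bound, i.e. by minimizing the KL divergence between that local distribution and a variational posterior that also sees the teammate's actual trajectory. The function names `sample_awareness` and `gaussian_kl`, the dimensions, and the placeholder encoder outputs are all hypothetical.

```python
import math
import random


def sample_awareness(mu, log_var, rng):
    """Draw one reparameterized sample from a diagonal Gaussian N(mu, exp(log_var))."""
    return [m + math.exp(0.5 * lv) * rng.gauss(0.0, 1.0)
            for m, lv in zip(mu, log_var)]


def gaussian_kl(mu_p, log_var_p, mu_q, log_var_q):
    """KL(p || q) between two diagonal Gaussians, summed over dimensions."""
    total = 0.0
    for mp, lp, mq, lq in zip(mu_p, log_var_p, mu_q, log_var_q):
        total += 0.5 * (lq - lp
                        + (math.exp(lp) + (mp - mq) ** 2) / math.exp(lq)
                        - 1.0)
    return total


rng = random.Random(0)
n_teammates, dim = 3, 4  # hypothetical sizes

awareness = []
kl_terms = []
for _ in range(n_teammates):
    # Placeholder for an encoder acting on the agent's own local trajectory (assumed).
    mu_p = [rng.gauss(0.0, 1.0) for _ in range(dim)]
    log_var_p = [0.0] * dim
    # Placeholder for a variational posterior that also sees the teammate's
    # actual trajectory (assumed); here it is simply a shifted copy.
    mu_q = [m + 0.5 for m in mu_p]
    log_var_q = [0.0] * dim

    awareness.append(sample_awareness(mu_p, log_var_p, rng))
    kl_terms.append(gaussian_kl(mu_p, log_var_p, mu_q, log_var_q))

# Minimizing this average KL tightens a lower bound on the mutual information
# between each awareness variable and the corresponding teammate's trajectory.
mi_loss = sum(kl_terms) / n_teammates
```

In a full training setup, a term like `mi_loss` would typically be added with a weighting coefficient to the temporal-difference loss of whichever MARL algorithm the awareness module is attached to, which is one way the algorithm-agnostic integration mentioned in the abstract could be realized.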