SCALE-COMM: Shared, Contrastively-Aligned Latent Embeddings for MARL Communication
SCALE-COMM addresses unstable and ungrounded communication in multi-agent coordination under partial observability, offering a more robust and scalable approach.
SCALE-COMM introduces a self-supervised framework for learning stable, policy-relevant communication representations in multi-agent reinforcement learning, decoupling communication from policy optimization. It consistently outperforms existing methods in representation quality and task performance across benchmarks and a realistic warehouse task, improving stability, sample efficiency, and throughput.
Emergent communication enables Autonomous Mobile Robots (AMRs) operating under partial observability to coordinate effectively in decentralized multi-agent reinforcement learning (MARL) settings. However, existing approaches often struggle with unstable communication protocols, ungrounded message semantics, and interference between communication learning and policy optimization, leading to degraded coordination over time. We propose SCALE-COMM (Shared, Contrastively-Aligned Latent Embeddings for COMMunication), a self-supervised framework for learning compact, stable, and policy-relevant communication representations. SCALE-COMM decouples communication learning from policy optimization by training low-dimensional latent messages that capture task-relevant planning and traffic information, while enforcing consistency across agents and time. Across standard MARL benchmarks and a realistic warehouse coordination task, SCALE-COMM consistently outperforms existing communication frameworks in both representation quality and task performance. The learned communication space yields improved stability, sample efficiency, and throughput under policy fine-tuning, demonstrating the effectiveness of representation-driven communication for scalable multi-agent coordination.
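As an illustration of what a contrastive consistency objective over latent messages can look like, the following minimal sketch computes a standard InfoNCE loss in plain Python. It is not the SCALE-COMM training objective, which this abstract does not specify. The function names, the temperature value, and the pairing of messages (for example, two agents at the same timestep, or one agent at adjacent timesteps) are assumptions made for illustration only.

```python
import math


def l2_normalize(v):
    """Scale a vector to unit length (returned unchanged if it is all zeros)."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v] if norm > 0 else list(v)


def dot(a, b):
    """Inner product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def info_nce(anchors, positives, temperature=0.1):
    """Mean InfoNCE loss over a batch of message embeddings.

    Hypothetical pairing: positives[i] is the message that should agree
    with anchors[i]; every other row of `positives` acts as a negative.
    """
    a = [l2_normalize(v) for v in anchors]
    p = [l2_normalize(v) for v in positives]
    total = 0.0
    for i, a_i in enumerate(a):
        # Cosine similarities between anchor i and every candidate message.
        logits = [dot(a_i, p_j) / temperature for p_j in p]
        # Numerically stable log-sum-exp for the softmax denominator.
        m = max(logits)
        log_denom = m + math.log(sum(math.exp(l - m) for l in logits))
        # Cross-entropy with the matching index as the target class.
        total += log_denom - logits[i]
    return total / len(a)


# Toy 2-D messages: aligned pairs should score a lower loss than mismatched ones.
agent_a = [[1.0, 0.0], [0.0, 1.0]]
agent_b_aligned = [[0.9, 0.1], [0.1, 0.9]]
agent_b_shuffled = [[0.1, 0.9], [0.9, 0.1]]

loss_aligned = info_nce(agent_a, agent_b_aligned)
loss_shuffled = info_nce(agent_a, agent_b_shuffled)
```

In this toy setting the loss is small when each message is closest to its intended counterpart and large when the pairing is mismatched, which is the sense in which a contrastive term can encourage consistency across agents and time.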