MA · AI · LG · May 20

Decoupling Communication from Policy: Robust MARL under Bandwidth Constraints

arXiv:2605.21085 · 29.8
Predicted impact top 76% in MA · last 90 days
Originality: Incremental advance
AI Analysis

For MARL practitioners in bandwidth-limited domains (e.g., drone swarms), this work provides a principled way to handle communication constraints without sacrificing policy capacity.

The paper addresses bandwidth constraints in multi-agent reinforcement learning (MARL) by introducing a normalized bandwidth budget and a decoupled communication architecture (SLIM). The approach achieves state-of-the-art performance with only marginal degradation as bandwidth is reduced.

Communication enables coordination in multi-agent reinforcement learning (MARL), but many real-world applications, e.g., search-and-rescue with drone swarms, operate under severe bandwidth constraints. Many communication architectures still expose a coupled bottleneck in which a shared latent representation is used for both policy execution and inter-agent communication. Consequently, reducing message size directly limits the policy's latent space, often leading to significant performance degradation. We address this with two contributions. First, we introduce $β$, a normalised per-agent bandwidth budget that unifies sparsity, rounds, and message dimension into a single comparable constraint. Second, we provide SLIM, a minimal architecture that decouples the communication pathway from the policy's latent representation, allowing us to isolate the effect of bandwidth from the effect of policy capacity while benefiting from in-step communication. We evaluate our method on several partially-observable MARL benchmarks, where communication is essential. Our approach achieves state-of-the-art performance and exhibits scalability and robustness under limited communication, with only marginal degradation as bandwidth is reduced.
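The abstract does not spell out how $β$ is computed, so the following is only a minimal sketch of one plausible normalisation, not the paper's definition. It assumes the budget is the expected number of floats an agent sends per environment step (send probability × rounds × message dimension) divided by a reference budget. The names `CommConfig`, `bandwidth_budget`, `msg_dim_for_budget`, and `reference_floats` are hypothetical and chosen for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CommConfig:
    """Hypothetical description of one agent's communication settings."""
    send_prob: float  # fraction of steps on which the agent transmits (sparsity)
    rounds: int       # communication rounds per environment step
    msg_dim: int      # number of floats in a single message


def bandwidth_budget(cfg: CommConfig, reference_floats: int) -> float:
    """Expected floats sent per agent per step, relative to a reference budget.

    Assumption: the three factors combine multiplicatively. The paper may
    define its normalised budget differently.
    """
    if not 0.0 <= cfg.send_prob <= 1.0:
        raise ValueError("send_prob must lie in [0, 1]")
    if cfg.rounds < 0 or cfg.msg_dim < 0 or reference_floats <= 0:
        raise ValueError("rounds and msg_dim must be >= 0, reference_floats > 0")
    return cfg.send_prob * cfg.rounds * cfg.msg_dim / reference_floats


def msg_dim_for_budget(beta: float, send_prob: float, rounds: int,
                       reference_floats: int) -> int:
    """Largest message dimension that keeps the budget at or below beta."""
    if send_prob <= 0.0 or rounds <= 0:
        raise ValueError("send_prob and rounds must be positive")
    return int(beta * reference_floats / (send_prob * rounds))


# Two different configurations that consume the same normalised budget:
dense_small = CommConfig(send_prob=1.0, rounds=1, msg_dim=16)
sparse_multi = CommConfig(send_prob=0.5, rounds=2, msg_dim=16)
beta_dense = bandwidth_budget(dense_small, reference_floats=64)
beta_sparse = bandwidth_budget(sparse_multi, reference_floats=64)
```

Under this assumed definition, an agent that sends every step in one round and an agent that sends half the time over two rounds land on the same budget, which is the kind of comparison a single normalised constraint is meant to allow. In a decoupled design, the policy's latent width would be configured separately from `msg_dim`, so shrinking the budget changes only the message size.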

