Interaction-Breaking Adversarial Learning Framework for Robust Multi-Agent Reinforcement Learning
For multi-agent reinforcement learning practitioners, this work addresses the vulnerability of learned coordination to perturbations of the interaction structure and offers a training method that is more robust to them.
The paper proposes an interaction-breaking adversarial learning framework to improve robustness in multi-agent reinforcement learning under attacks that disrupt inter-agent coordination, achieving stronger performance than existing baselines across diverse attack settings and agent-missing scenarios.
Cooperation is central to multi-agent reinforcement learning (MARL), yet learned coordination can be fragile when external perturbations disrupt inter-agent interactions. Prior robust MARL methods have primarily considered value-oriented attacks, leaving a gap in robustness when interaction structures themselves are corrupted. In this paper, we propose an interaction-breaking adversarial learning (IBAL) framework that takes an information-theoretic view to construct attacks that impede coordination by perturbing agents' observations and actions, and trains agents to perform reliably under such disruptions. Empirically, our approach improves robustness over existing robust MARL baselines across diverse attack settings and yields stronger performance even under agent-missing scenarios.
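To make the information-theoretic notion of "breaking interaction" concrete, the toy sketch below estimates the mutual information between two agents' discrete actions before and after an action perturbation. It is not the IBAL attack or training procedure, which the abstract does not specify. The mirroring policy, the `perturb_actions` helper, and the uniform-replacement attack are all hypothetical choices made only for illustration.

```python
import math
import random
from collections import Counter


def mutual_information(pairs):
    """Plug-in estimate of I(A1; A2) in nats from a list of (a1, a2) samples."""
    n = len(pairs)
    joint = Counter(pairs)
    count_a1 = Counter(a for a, _ in pairs)
    count_a2 = Counter(b for _, b in pairs)
    mi = 0.0
    for (a, b), c in joint.items():
        p_ab = c / n
        # p(a,b) / (p(a) p(b)) written in terms of raw counts.
        mi += p_ab * math.log(c * n / (count_a1[a] * count_a2[b]))
    return mi


def perturb_actions(pairs, flip_prob, n_actions, rng):
    """Hypothetical attack: with probability flip_prob, replace agent 2's
    action with a uniformly random one (an assumption, not the paper's attack)."""
    out = []
    for a, b in pairs:
        if rng.random() < flip_prob:
            b = rng.randrange(n_actions)
        out.append((a, b))
    return out


rng = random.Random(0)
n_actions = 4

# Toy "coordinated" joint policy: agent 2 always mirrors agent 1.
clean = [(a, a) for a in (rng.randrange(n_actions) for _ in range(5000))]
attacked = perturb_actions(clean, flip_prob=0.8, n_actions=n_actions, rng=rng)

mi_clean = mutual_information(clean)
mi_attacked = mutual_information(attacked)
print(f"MI clean: {mi_clean:.3f} nats, MI attacked: {mi_attacked:.3f} nats")
```

In this toy setting the perturbation sharply lowers the estimated mutual information between the two agents' actions, which is one simple way to quantify disrupted coordination. A learned adversary would choose perturbations to minimize such a quantity rather than sampling them uniformly.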