LG AI MA · Dec 5, 2024

HyperMARL: Adaptive Hypernetworks for Multi-Agent RL

Kale-ab Abebe Tessera, Arrasy Rahman, Amos Storkey, Stefano V. Albrecht

arXiv:2412.04233v4 · 13.49 citations · h-index: 11 · Has Code

Originality: Highly original

AI Analysis

This work addresses the problem of maintaining behavioral diversity under parameter sharing in MARL, offering researchers and practitioners a principled alternative to more complex existing methods.

The paper tackles adaptive cooperation in multi-agent reinforcement learning (MARL) by addressing the cross-agent gradient interference that parameter sharing causes. It proposes HyperMARL, which uses agent-conditioned hypernetworks to generate agent-specific parameters, and reports performance competitive with six baselines across 22 scenarios with up to 30 agents while preserving behavioral diversity.

Adaptive cooperation in multi-agent reinforcement learning (MARL) requires policies to express homogeneous, specialised, or mixed behaviours, yet achieving this adaptivity remains a critical challenge. While parameter sharing (PS) is standard for efficient learning, it notoriously suppresses the behavioural diversity required for specialisation. This failure is largely due to cross-agent gradient interference, a problem we find is surprisingly exacerbated by the common practice of coupling agent IDs with observations. Existing remedies typically add complexity through altered objectives, manual preset diversity levels, or sequential updates, raising a fundamental question: can shared policies adapt without these intricacies? We propose a solution built on a key insight: an agent-conditioned hypernetwork can generate agent-specific parameters and decouple observation- and agent-conditioned gradients, directly countering the interference from coupling agent IDs with observations. Our resulting method, HyperMARL, avoids the complexities of prior work and empirically reduces policy gradient variance. Across diverse MARL benchmarks (22 scenarios, up to 30 agents), HyperMARL achieves performance competitive with six key baselines while preserving behavioural diversity comparable to non-parameter sharing methods, establishing it as a versatile and principled approach for adaptive MARL. The code is publicly available at https://github.com/KaleabTessera/HyperMARL.
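
The core idea in the abstract, an agent-conditioned hypernetwork that generates agent-specific policy parameters, can be illustrated with a minimal sketch. This is not the authors' implementation (see the linked repository for that). It assumes a linear hypernetwork, one embedding per agent, and a linear softmax policy, none of which is specified in the text above. All names (`make_hypernetwork`, `generate_policy_params`, `action_probs`) are hypothetical, and no training loop is shown. The point is only the data flow: the agent ID enters through the hypernetwork, the observation enters through the generated policy, and the two are never concatenated.

```python
import math
import random


def make_hypernetwork(n_agents, embed_dim, obs_dim, n_actions, seed=0):
    """Randomly initialise a linear hypernetwork (assumed layout, for illustration)."""
    rng = random.Random(seed)
    n_policy_params = obs_dim * n_actions + n_actions  # policy weights + biases
    return {
        # One embedding vector per agent (assumption: learned in a real system).
        "embeddings": [[rng.gauss(0, 1) for _ in range(embed_dim)]
                       for _ in range(n_agents)],
        # Shared linear map: agent embedding -> flat vector of policy parameters.
        "weights": [[rng.gauss(0, 0.1) for _ in range(embed_dim)]
                    for _ in range(n_policy_params)],
        "obs_dim": obs_dim,
        "n_actions": n_actions,
    }


def generate_policy_params(hyper, agent_id):
    """Map an agent's embedding to that agent's policy weights W and biases b."""
    e = hyper["embeddings"][agent_id]
    flat = [sum(w * x for w, x in zip(row, e)) for row in hyper["weights"]]
    obs_dim, n_actions = hyper["obs_dim"], hyper["n_actions"]
    split = obs_dim * n_actions
    W = [flat[a * obs_dim:(a + 1) * obs_dim] for a in range(n_actions)]
    b = flat[split:]
    return W, b


def action_probs(hyper, agent_id, obs):
    """Softmax policy whose parameters depend on the agent, not on an ID in obs."""
    W, b = generate_policy_params(hyper, agent_id)
    logits = [sum(w * o for w, o in zip(row, obs)) + bias
              for row, bias in zip(W, b)]
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(l - m) for l in logits]
    total = sum(exps)
    return [x / total for x in exps]


hyper = make_hypernetwork(n_agents=3, embed_dim=4, obs_dim=5, n_actions=2)
obs = [0.5, -1.0, 0.25, 0.0, 1.0]
for agent_id in range(3):
    # Same observation, different generated parameters per agent.
    print(agent_id, action_probs(hyper, agent_id, obs))
```

In this sketch the only shared trainable quantities are the hypernetwork weights and the agent embeddings, so agents can share what they learn while still receiving distinct policy parameters. Whether a linear or a deeper hypernetwork is appropriate is a design choice the sketch leaves open.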
