MA | LG | Feb 5, 2025

Double Distillation Network for Multi-Agent Reinforcement Learning

arXiv:2502.03125v1 | 3 citations | h-index: 5
Originality: Incremental advance
AI Analysis

This work addresses coordination challenges in multi-agent systems for applications like robotics or gaming, but it appears incremental as it builds on existing CTDE frameworks with distillation techniques.

The paper tackles the problem of partial observability impairing collaborative policies in multi-agent reinforcement learning by introducing the Double Distillation Network (DDN), which uses two distillation modules to enhance coordination and exploration, resulting in significant performance improvements across multiple scenarios.

Multi-agent reinforcement learning typically employs a centralized training, decentralized execution (CTDE) framework to alleviate non-stationarity in the environment. However, partial observability during execution may cause agents to accumulate errors from the gap between global training and local execution, impairing the training of effective collaborative policies. To overcome this challenge, we introduce the Double Distillation Network (DDN), which incorporates two distillation modules aimed at making coordination more robust and supporting collaboration under constrained information. The external distillation module pairs a global guiding network with a local policy network and uses distillation to reconcile the gap between global training and local execution. In addition, the internal distillation module introduces intrinsic rewards, drawn from state information, to enhance the exploration capabilities of agents. Extensive experiments demonstrate that DDN significantly improves performance across multiple scenarios.
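
The two modules described in the abstract can be illustrated with a small sketch. This is not the paper's implementation: the abstract gives no loss functions or architectures, so everything below is an assumption made for illustration. The external module is sketched as a mean-squared-error distillation loss between a global "teacher" that sees the full state and a local "student" that sees only an observation. The internal module is sketched in the style of random network distillation (a swapped-in, well-known technique), where the intrinsic reward is the error of a trained predictor against a frozen random target. All names (`W_global`, `W_local`, `external_distill_loss`, `intrinsic_reward`, `predictor_step`) and dimensions are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical sizes, chosen only for the example.
STATE_DIM, OBS_DIM, N_ACTIONS, EMB_DIM = 8, 4, 3, 5


def linear(in_dim, out_dim):
    """Hypothetical one-layer network: a single random weight matrix."""
    return rng.normal(scale=0.1, size=(in_dim, out_dim))


# External distillation (assumed form): the global guiding network sees the
# full state, the local policy network sees only the agent's observation.
W_global = linear(STATE_DIM, N_ACTIONS)  # teacher, available during training
W_local = linear(OBS_DIM, N_ACTIONS)     # student, used at execution time


def external_distill_loss(state, obs):
    """MSE between teacher and student action values (an assumed loss)."""
    q_global = state @ W_global
    q_local = obs @ W_local
    return float(np.mean((q_global - q_local) ** 2))


# Internal distillation (assumed RND-style form): a predictor is trained to
# match a frozen random embedding of the state; its error is the reward.
W_target = linear(STATE_DIM, EMB_DIM)  # frozen random target
W_pred = linear(STATE_DIM, EMB_DIM)    # trained predictor


def intrinsic_reward(state):
    """Prediction error on the state embedding; larger for unfamiliar states."""
    err = state @ W_pred - state @ W_target
    return float(np.mean(err ** 2))


def predictor_step(state, lr=0.1):
    """One gradient-descent step on the predictor for a single state."""
    global W_pred
    err = state @ W_pred - state @ W_target       # shape (EMB_DIM,)
    grad = 2.0 * np.outer(state, err) / EMB_DIM   # d mean(err^2) / d W_pred
    W_pred = W_pred - lr * grad


# Usage: visiting the same state twice lowers its intrinsic reward.
state = np.ones(STATE_DIM)
obs = state[:OBS_DIM]  # assumption: the observation is a slice of the state
loss = external_distill_loss(state, obs)
before = intrinsic_reward(state)
predictor_step(state)
after = intrinsic_reward(state)
```

In this sketch the distillation loss would be minimized with respect to `W_local` only, so the student learns to approximate the teacher from partial information, and the shrinking intrinsic reward on revisited states is what pushes agents toward less familiar ones.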
