LG · AI · Jun 20, 2022

MASER: Multi-Agent Reinforcement Learning with Subgoals Generated from Experience Replay Buffer

arXiv:2206.10607v1 · 53 citations · h-index: 9
Originality: Highly original
AI Analysis

This paper addresses sparse rewards in cooperative multi-agent systems, a known bottleneck for researchers and practitioners. It represents an incremental improvement that introduces a novel method for that bottleneck.

The paper tackles cooperative multi-agent reinforcement learning with sparse rewards by proposing MASER, a method that generates subgoals from an experience replay buffer and designs intrinsic rewards around them. On the StarCraft II micromanagement benchmark, MASER is reported to significantly outperform state-of-the-art algorithms.

In this paper, we consider cooperative multi-agent reinforcement learning (MARL) with sparse rewards. To tackle this problem, we propose a novel method named MASER: MARL with subgoals generated from experience replay buffer. Under the widely used assumption of centralized training with decentralized execution and consistent Q-value decomposition for MARL, MASER automatically generates proper subgoals for multiple agents from the experience replay buffer by considering both the individual Q-value and the total Q-value. Then, MASER designs an individual intrinsic reward for each agent based on an actionable representation relevant to Q-learning, so that the agents reach their subgoals while maximizing the joint action value. Numerical results show that MASER significantly outperforms other state-of-the-art MARL algorithms on the StarCraft II micromanagement benchmark.
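
The two steps named in the abstract (picking subgoals from replayed experience using both Q-values, then rewarding agents for approaching them) can be illustrated with a minimal NumPy sketch. Everything specific here is an assumption made for illustration and is not taken from the paper: the weighted sum with a mixing weight `lam`, the function names `select_subgoals` and `intrinsic_reward`, the array shapes, and the use of a negative Euclidean distance between precomputed embeddings as a stand-in for the actionable representation.

```python
import numpy as np


def select_subgoals(obs, q_ind, q_tot, lam=0.5):
    """Pick one subgoal observation per agent from a replayed episode.

    Assumed shapes (hypothetical, for illustration only):
      obs:   (T, n_agents, obs_dim)  per-agent observations
      q_ind: (T, n_agents)           individual Q-values of the taken actions
      q_tot: (T,)                    total (joint) Q-value at each step
    ASSUMPTION: the two Q-values are combined by a weighted sum with weight lam.
    """
    # Broadcast the total Q-value across agents and mix it with individual Q.
    score = lam * q_ind + (1.0 - lam) * q_tot[:, None]  # (T, n_agents)
    best_t = score.argmax(axis=0)                       # best time step per agent
    agents = np.arange(obs.shape[1])
    return obs[best_t, agents]                          # (n_agents, obs_dim)


def intrinsic_reward(embed_next, embed_goal):
    """Negative Euclidean distance between each agent's next-state embedding
    and its subgoal embedding.

    ASSUMPTION: the embeddings are given; how they are produced is not shown.
    """
    return -np.linalg.norm(embed_next - embed_goal, axis=-1)


# Toy episode: 3 time steps, 2 agents, 2-dimensional observations.
obs = np.arange(12, dtype=float).reshape(3, 2, 2)
q_ind = np.array([[0.0, 1.0], [2.0, 0.0], [1.0, 0.5]])
q_tot = np.array([0.0, 0.0, 3.0])

subgoals_mixed = select_subgoals(obs, q_ind, q_tot, lam=0.5)
subgoals_individual = select_subgoals(obs, q_ind, q_tot, lam=1.0)
rewards = intrinsic_reward(np.array([[0.0, 0.0], [3.0, 4.0]]), np.zeros((2, 2)))
```

With `lam=1.0` each agent chooses the step where its own Q-value peaks, while a smaller `lam` pulls both agents toward the step with the highest total Q-value. That trade-off is the point of considering both quantities.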

Code Implementations: 1 repo