MA · LG · Jun 4, 2023

A Unified Framework for Factorizing Distributional Value Functions for Multi-Agent Reinforcement Learning

arXiv:2306.02430v1 · 3 citations · h-index: 32
Originality: Incremental advance
AI Analysis

This work addresses the stochasticity that partial observability and changing agent policies introduce in cooperative multi-agent reinforcement learning. It appears incremental, since it builds on existing value function factorization methods.

The authors tackle high stochasticity in cooperative multi-agent reinforcement learning by proposing DFAC, a unified framework that integrates distributional RL with value function factorization. They report that DFAC outperforms baselines on the Super Hard maps of the StarCraft Multi-Agent Challenge and on six self-designed Ultra Hard maps.

In fully cooperative multi-agent reinforcement learning (MARL) settings, environments are highly stochastic due to the partial observability of each agent and the continuously changing policies of other agents. To address these issues, we propose a unified framework, called DFAC, for integrating distributional RL with value function factorization methods. This framework generalizes expected value function factorization methods to enable the factorization of return distributions. To validate DFAC, we first demonstrate its ability to factorize the value functions of a simple matrix game with stochastic rewards. We then perform experiments on all Super Hard maps of the StarCraft Multi-Agent Challenge and six self-designed Ultra Hard maps, showing that DFAC outperforms a number of baselines.
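
As a rough illustration of what factorizing a return distribution can mean, the sketch below represents each agent's return distribution by its quantile values at a shared set of quantile fractions, then builds the joint distribution by summing those values fraction by fraction. The additive rule is an illustrative assumption in the spirit of additive (VDN-style) value factorization, not the DFAC construction itself, which the abstract does not spell out. The function names and the sample numbers are hypothetical.

```python
from statistics import fmean


def joint_quantiles(agent_quantiles):
    """Sum per-agent quantile values at each shared quantile fraction.

    Assumption (not taken from the abstract): the joint quantile function
    is the plain sum of the per-agent quantile functions.
    """
    n = len(agent_quantiles[0])
    if any(len(q) != n for q in agent_quantiles):
        raise ValueError("all agents must use the same number of quantiles")
    return [sum(q[i] for q in agent_quantiles) for i in range(n)]


def expected_value(quantiles):
    """Approximate the mean of a distribution from evenly spaced quantile values."""
    return fmean(quantiles)


# Hypothetical per-agent quantile values (each list is non-decreasing).
agent_a = [0.0, 1.0, 2.0, 3.0]
agent_b = [1.0, 1.0, 2.0, 4.0]

z_joint = joint_quantiles([agent_a, agent_b])
print(z_joint)
print(expected_value(z_joint), expected_value(agent_a) + expected_value(agent_b))
```

Under this additive assumption, the mean of the joint distribution equals the sum of the per-agent means, so the expected-value factorization is recovered as a special case while the spread of each agent's returns is kept.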

Code Implementations: 1 repo