LG · AI · MA · ML · Oct 3, 2019

Reducing Overestimation Bias in Multi-Agent Domains Using Double Centralized Critics

arXiv:1910.01465v2 · 167 citations
AI Analysis

This paper addresses inefficient policy learning in multi-agent domains, particularly in mixed cooperative-competitive tasks, with potential applications to high-dimensional robotic tasks.

The paper tackles the problem of value function overestimation bias in multi-agent reinforcement learning by proposing an approach using double centralized critics, showing a significant advantage over current methods on six mixed cooperative-competitive tasks.

Many real-world tasks require multiple agents to work together. Multi-agent reinforcement learning (RL) methods have been proposed in recent years to solve these tasks, but current methods often fail to efficiently learn policies. We thus investigate the presence of a common weakness in single-agent RL, namely value function overestimation bias, in the multi-agent setting. Based on our findings, we propose an approach that reduces this bias by using double centralized critics. We evaluate it on six mixed cooperative-competitive tasks, showing a significant advantage over current methods. Finally, we investigate the application of multi-agent methods to high-dimensional robotic tasks and show that our approach can be used to learn decentralized policies in this domain.
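
The abstract does not spell out the update rule, so the following is only a sketch of one way two centralized critics could be used to curb overestimation. It assumes the clipped double Q-learning rule known from single-agent RL, where the TD target takes the smaller of two critic estimates. The function names, the toy linear critics (standing in for neural networks), and the numbers are all hypothetical and are not taken from the paper.

```python
from typing import Sequence


def centralized_q(
    weights: Sequence[float],
    joint_obs: Sequence[float],
    joint_actions: Sequence[float],
) -> float:
    """Toy linear critic (hypothetical stand-in for a neural network).

    "Centralized" here means the critic sees the observations and actions
    of all agents, concatenated into a single feature vector.
    """
    features = list(joint_obs) + list(joint_actions)
    return sum(w * f for w, f in zip(weights, features))


def double_critic_target(
    reward: float,
    done: bool,
    q1_next: float,
    q2_next: float,
    gamma: float = 0.95,
) -> float:
    """TD target that bootstraps from the smaller of two critic estimates.

    Assumption: this is the clipped double Q-learning rule; taking the
    minimum counteracts the upward bias of a single noisy estimate.
    """
    q_next = min(q1_next, q2_next)
    not_done = 0.0 if done else 1.0
    return reward + gamma * not_done * q_next


# Two agents, one observation value and one action value each (made-up data).
next_joint_obs = [0.5, -1.0]
next_joint_actions = [1.0, 0.0]

# Two independently parameterized target critics (made-up weights).
q1 = centralized_q([1.0, 0.0, 2.0, 0.0], next_joint_obs, next_joint_actions)
q2 = centralized_q([0.0, -1.0, 1.0, 1.0], next_joint_obs, next_joint_actions)

target = double_critic_target(reward=1.0, done=False, q1_next=q1, q2_next=q2, gamma=0.5)
```

In this toy setup the two critics disagree (2.5 versus 2.0), and the target bootstraps from the lower value. At execution time only each agent's own policy would be needed, which is what keeps the learned policies decentralized.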

Code Implementations · 3 repos
