MA · AI · Feb 16, 2020

R-MADDPG for Partially Observable Environments and Limited Communication

arXiv:2002.06684v2 · 96 citations
AI Analysis

This paper addresses coordination challenges in multiagent systems such as self-driving cars, but the contribution is incremental: it adds recurrency to existing MARL methods to target specific bottlenecks.

The paper tackles multiagent coordination in partially observable environments with limited communication. It introduces R-MADDPG, a deep recurrent actor-critic framework that learns time dependencies so agents can share missing observations and develop communication patterns, which improves performance and resource use.

There are several real-world tasks that would benefit from applying multiagent reinforcement learning (MARL) algorithms, including the coordination among self-driving cars. The real world has challenging conditions for multiagent learning systems, such as its partially observable and nonstationary nature. Moreover, if agents must share a limited resource (e.g., network bandwidth), they must all learn how to coordinate resource use. This paper introduces a deep recurrent multiagent actor-critic framework (R-MADDPG) for handling multiagent coordination under partially observable settings and limited communication. We investigate the effects of recurrency on the performance and communication use of a team of agents. We demonstrate that the resulting framework learns time dependencies for sharing missing observations, handling resource limitations, and developing different communication patterns among agents.
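The idea of a recurrent actor acting under a communication budget can be illustrated with a small sketch. This is not the paper's implementation: the class `RecurrentActor`, the function `run_episode`, the Elman-style update, and the untrained random weights are all assumptions made for illustration. It only shows how a hidden state carries information across timesteps and how a shared message budget caps how often an agent may send.

```python
import math
import random


class RecurrentActor:
    """Toy Elman-style recurrent actor (hypothetical; weights are random, not trained)."""

    def __init__(self, obs_dim, hidden_dim, seed=0):
        rng = random.Random(seed)

        def mat(rows, cols):
            return [[rng.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]

        self.w_in = mat(hidden_dim, obs_dim)      # observation -> hidden
        self.w_rec = mat(hidden_dim, hidden_dim)  # previous hidden -> hidden
        self.w_move = mat(1, hidden_dim)[0]       # hidden -> movement action
        self.w_msg = mat(1, hidden_dim)[0]        # hidden -> "send message" score
        self.hidden = [0.0] * hidden_dim

    def step(self, obs, budget):
        # Recurrent update: h_t = tanh(W_in o_t + W_rec h_{t-1}).
        new_hidden = []
        for i in range(len(self.hidden)):
            s = sum(w * o for w, o in zip(self.w_in[i], obs))
            s += sum(w * h for w, h in zip(self.w_rec[i], self.hidden))
            new_hidden.append(math.tanh(s))
        self.hidden = new_hidden

        # Movement action squashed into [-1, 1].
        move = math.tanh(sum(w * h for w, h in zip(self.w_move, self.hidden)))
        # The agent may only send if some budget remains.
        wants_to_send = sum(w * h for w, h in zip(self.w_msg, self.hidden)) > 0.0
        send = wants_to_send and budget > 0
        return move, send


def run_episode(actor, observations, budget):
    """Roll the actor over a sequence of observations, spending budget on each message."""
    sent = 0
    log = []
    for obs in observations:
        move, send = actor.step(obs, budget)
        if send:
            budget -= 1
            sent += 1
        log.append((move, send))
    return log, sent


if __name__ == "__main__":
    actor = RecurrentActor(obs_dim=2, hidden_dim=4, seed=1)
    log, sent = run_episode(actor, [[1.0, 0.0]] * 6, budget=2)
    print(f"messages sent: {sent} (budget 2)")
```

Because the hidden state persists between calls to `step`, the same observation can yield different actions at different timesteps, which is the property a recurrent policy relies on when individual observations are incomplete. In an actual actor-critic setup the weights would be learned, typically with a recurrent critic as well.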

Code Implementations: 1 repo
