LG · RO · Aug 9, 2021

Safe Deep Reinforcement Learning for Multi-Agent Systems with Continuous Action Spaces

arXiv:2108.03952v21 citations
Originality: Incremental advance
AI Analysis

This work addresses safety-critical constraints for multi-agent systems in real-world applications, representing an incremental extension of single-agent safety methods to multi-agent settings.

The paper tackles the problem of ensuring safety in multi-agent deep reinforcement learning with continuous action spaces by enhancing the MADDPG framework with a safety layer and soft constraints, resulting in a dramatic decrease in constraint violations during learning.

Multi-agent control problems constitute an interesting area of application for deep reinforcement learning models with continuous action spaces. Such real-world applications, however, typically come with critical safety constraints that must not be violated. In order to ensure safety, we enhance the well-known multi-agent deep deterministic policy gradient (MADDPG) framework by adding a safety layer to the deep policy network. In particular, we extend the idea of linearizing the single-step transition dynamics, as was done for single-agent systems in Safe DDPG (Dalal et al., 2018), to multi-agent settings. We additionally propose to circumvent infeasibility problems in the action correction step using soft constraints (Kerrigan & Maciejowski, 2000). Results from the theory of exact penalty functions can be used to guarantee constraint satisfaction of the soft constraints under mild assumptions. We empirically find that the soft formulation achieves a dramatic decrease in constraint violations, making safety available even during the learning procedure.
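The action-correction step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation, and it rests on several assumptions: there is only one linearized constraint, of the form c + g·a ≤ limit; the action vector `action` may be read as the concatenated actions of all agents; and the names `correct_action`, `g`, `c`, `limit`, and `rho` are hypothetical. With a single constraint, the soft problem min 0.5·||a − action||² + rho·max(0, c + g·a − limit) has a closed-form solution: the multiplier of the hard problem, clipped at the penalty weight `rho`. When `rho` exceeds that multiplier, the soft and hard solutions coincide, which is the exact-penalty property the abstract refers to.

```python
def correct_action(action, g, c, limit, rho):
    """Soft-constrained correction for ONE linearized constraint (illustrative).

    Minimizes 0.5 * ||a - action||^2 + rho * max(0, c + g.a - limit).
    All argument names are assumptions made for this sketch.
    Returns the corrected action and the multiplier that was applied.
    """
    # Predicted constraint violation if the proposed action is executed.
    violation = c + sum(gi * ai for gi, ai in zip(g, action)) - limit
    g_sq = sum(gi * gi for gi in g)

    # Already safe (or the action has no influence): leave the action unchanged.
    if violation <= 0.0 or g_sq == 0.0:
        return list(action), 0.0

    # Multiplier of the hard-constrained projection, clipped at rho.
    lam = min(violation / g_sq, rho)
    corrected = [ai - lam * gi for ai, gi in zip(action, g)]
    return corrected, lam


if __name__ == "__main__":
    proposed = [1.0, 0.0]
    # Large penalty: the correction lands exactly on the constraint boundary.
    print(correct_action(proposed, g=[1.0, 0.0], c=0.5, limit=1.0, rho=10.0))
    # Small penalty: the correction is partial and some violation remains.
    print(correct_action(proposed, g=[1.0, 0.0], c=0.5, limit=1.0, rho=0.2))
```

With several constraints the correction generally no longer has this closed form and would be posed as a quadratic program with slack variables; the single-constraint case is used here only because it can be checked by hand.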

Code Implementations: 1 repo
