LG RO

Aug 9, 2021

Safe Deep Reinforcement Learning for Multi-Agent Systems with Continuous Action Spaces

Ziyad Sheebaelhamd, Konstantinos Zisis, Athina Nisioti, Dimitris Gkouletsos, Dario Pavllo, Jonas Kohler

arXiv:2108.03952v23.11 citations

Originality: Incremental advance

AI Analysis

This work addresses safety-critical constraints in real-world multi-agent systems by extending single-agent safety methods to the multi-agent setting, which makes it an incremental advance.

The paper tackles the problem of ensuring safety in multi-agent deep reinforcement learning with continuous action spaces by enhancing the MADDPG framework with a safety layer and soft constraints, resulting in a dramatic decrease in constraint violations during learning.

Multi-agent control problems constitute an interesting area of application for deep reinforcement learning models with continuous action spaces. Such real-world applications, however, typically come with critical safety constraints that must not be violated. In order to ensure safety, we enhance the well-known multi-agent deep deterministic policy gradient (MADDPG) framework by adding a safety layer to the deep policy network. In particular, we extend the idea of linearizing the single-step transition dynamics, as was done for single-agent systems in Safe DDPG (Dalal et al., 2018), to multi-agent settings. We additionally propose to circumvent infeasibility problems in the action correction step using soft constraints (Kerrigan & Maciejowski, 2000). Results from the theory of exact penalty functions can be used to guarantee constraint satisfaction of the soft constraints under mild assumptions. We empirically find that the soft formulation achieves a dramatic decrease in constraint violations, making safety available even during the learning procedure.
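
The action correction step described above can be illustrated with a minimal sketch. It is not the paper's implementation. It assumes a single linearized constraint of the form g·a + c <= limit and a linear penalty rho on a non-negative slack variable, whereas the multi-agent setting in the abstract would involve several constraints and a QP solver. The function name `soft_correct` and all parameter names are hypothetical. Under these assumptions the soft-constrained projection has a closed form: the multiplier of the hard problem, clipped to the interval [0, rho].

```python
import numpy as np


def soft_correct(action, g, c, limit, rho):
    """Project `action` onto a single linearized safety constraint, softly.

    Solves (under the assumptions stated in the text, not the paper's QP):
        min_{a, s}  0.5 * ||a - action||^2 + rho * s
        s.t.        g @ a + c <= limit + s,   s >= 0
    Returns the corrected action and the remaining slack s.
    """
    action = np.asarray(action, dtype=float)
    g = np.asarray(g, dtype=float)

    violation = float(g @ action + c - limit)
    gg = float(g @ g)

    # Already safe, or the constraint does not depend on the action:
    # leave the action unchanged and report any violation as slack.
    if violation <= 0.0 or gg < 1e-12:
        return action.copy(), max(violation, 0.0)

    # Multiplier of the hard-constrained problem, capped by the penalty weight.
    lam = min(violation / gg, rho)
    corrected = action - lam * g

    # Slack is whatever violation remains after the capped correction.
    slack = max(float(g @ corrected + c - limit), 0.0)
    return corrected, slack
```

When `rho` is at least as large as the hard problem's multiplier, the returned slack is zero and the result matches the hard-constrained projection. This is the single-constraint version of the exact-penalty argument the abstract refers to. A smaller `rho` yields a partially corrected action with positive slack instead of an infeasible problem.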
