LG | AI | Feb 3, 2024

A Survey of Constraint Formulations in Safe Reinforcement Learning

arXiv:2402.02025v2 | 60 citations | h-index: 18 | IJCAI
Originality: Synthesis-oriented
AI Analysis

This is an incremental survey paper that organizes existing knowledge for researchers in safe RL.

The paper addresses the lack of systematic understanding in safe reinforcement learning by providing a comprehensive review of constraint formulations and their interrelations, along with curated algorithms and theoretical underpinnings.

Safety is critical when applying reinforcement learning (RL) to real-world problems. As a result, safe RL has emerged as a fundamental and powerful paradigm for optimizing an agent's policy while incorporating notions of safety. A prevalent safe RL approach is based on a constrained criterion, which seeks to maximize the expected cumulative reward subject to specific safety constraints. Despite recent efforts to enhance safety in RL, a systematic understanding of the field remains difficult. This challenge stems from the diversity of constraint representations and the limited exploration of their interrelations. To bridge this knowledge gap, we present a comprehensive review of representative constraint formulations, along with a curated selection of algorithms designed specifically for each formulation. In addition, we elucidate the theoretical underpinnings that reveal the mathematical mutual relations among common problem formulations. We conclude with a discussion of the current state and future directions of safe reinforcement learning research.
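
For orientation, the constrained criterion mentioned in the abstract can be sketched as follows. This is one common instance (an expected cumulative cost constraint), not necessarily the notation used in the paper. All symbols are assumptions introduced here for illustration: a policy \pi, a discount factor \gamma in [0, 1), a reward function r, a safety cost function c, and a safety threshold d.

```latex
% Illustrative only: symbols \pi, \gamma, r, c, d are assumed, not taken from the paper.
\max_{\pi} \; \mathbb{E}_{\pi}\!\left[ \sum_{t=0}^{\infty} \gamma^{t}\, r(s_t, a_t) \right]
\quad \text{subject to} \quad
\mathbb{E}_{\pi}\!\left[ \sum_{t=0}^{\infty} \gamma^{t}\, c(s_t, a_t) \right] \le d
```

Other constraint formulations would keep the same objective and replace the inequality on the right with a different safety condition.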
