LG RO Oct 23, 2025

Safety Assessment in Reinforcement Learning via Model Predictive Control

arXiv:2510.20955v1
h-index: 1
Originality: Incremental advance
AI Analysis

This paper addresses safety concerns in RL for control applications, but the advance is incremental: it builds on existing methods such as model-predictive control and PPO.

The paper tackles safety in reinforcement learning by using model-predictive control to check whether each action proposed by the policy is reversible, which prevents unsafe actions without requiring explicit safety specifications. In the experiments, the method aborts before every unsafe action while achieving training progress comparable to a PPO baseline that is allowed to violate safety.

Model-free reinforcement learning approaches are promising for control but typically lack formal safety guarantees. Existing methods to shield or otherwise provide these guarantees often rely on detailed knowledge of the safety specifications. Instead, this work's insight is that many difficult-to-specify safety issues are best characterized by invariance. Accordingly, we propose to leverage reversibility as a method for preventing these safety issues throughout the training process. Our method uses model-predictive path integral control to check the safety of an action proposed by a learned policy throughout training. A key advantage of this approach is that it only requires the ability to query the black-box dynamics, not explicit knowledge of the dynamics or safety constraints. Experimental results demonstrate that the proposed algorithm successfully aborts before all unsafe actions, while still achieving comparable training progress to a baseline PPO approach that is allowed to violate safety.
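The reversibility check described in the abstract can be illustrated with a minimal sketch. This is not the authors' algorithm: it is a simplified MPPI-style sampler, and the toy dynamics `step`, the function `is_reversible`, and all parameter values are hypothetical choices made for illustration. The only thing it assumes about the system is what the abstract states, namely that the dynamics can be queried as a black box.

```python
import numpy as np

def step(x, a):
    """Hypothetical black-box dynamics: a 1-D point that gets stuck once x >= 1."""
    return np.where(x >= 1.0, x, x + 0.1 * np.clip(a, -1.0, 1.0))

def is_reversible(step_fn, x0, action, horizon=10, samples=256,
                  iters=3, sigma=1.0, lam=0.1, tol=0.05, seed=0):
    """Return True if some sampled plan brings the system back near x0
    after `action` is applied. Only queries step_fn; no model knowledge."""
    rng = np.random.default_rng(seed)
    x_after = step_fn(np.asarray(x0, dtype=float), action)
    u = np.zeros(horizon)          # nominal return plan, refined each iteration
    best = np.inf
    for _ in range(iters):
        eps = rng.normal(0.0, sigma, size=(samples, horizon))
        x = np.full(samples, float(x_after))
        cost = np.abs(x - x0)      # closest approach to x0 along each rollout
        for t in range(horizon):
            x = step_fn(x, u[t] + eps[:, t])
            cost = np.minimum(cost, np.abs(x - x0))
        best = min(best, float(cost.min()))
        # Path-integral style update: exponentially weight low-cost perturbations.
        w = np.exp(-(cost - cost.min()) / lam)
        u = u + (w / w.sum()) @ eps
    return best <= tol

print(is_reversible(step, 0.5, 1.0))    # moving away from the absorbing region
print(is_reversible(step, 0.95, 1.0))   # stepping into the absorbing region
```

In a training loop, a check like this would run on each action the policy proposes, and the episode would be aborted whenever the check fails. A sampling-based check of this kind can only fail to find a return path; it cannot prove that none exists, so the tolerance, horizon, and sample count trade off conservatism against compute.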

