LG, AI, RO, SY · Nov 26, 2025

Predictive Safety Shield for Dyna-Q Reinforcement Learning

arXiv:2511.21531v1 · ECC
Originality: Incremental advance
AI Analysis

This work addresses safety guarantees for reinforcement learning in real-world tasks. It appears incremental because it extends existing safety shields with a predictive component.

The authors propose a predictive safety shield for model-based reinforcement learning agents in discrete spaces. The shield improves performance while maintaining hard safety guarantees. In gridworld experiments, even short prediction horizons can be sufficient to identify the optimal path, and the approach is robust to distribution shifts without additional training.

Obtaining safety guarantees for reinforcement learning is a major challenge to achieve applicability for real-world tasks. Safety shields extend standard reinforcement learning and achieve hard safety guarantees. However, existing safety shields commonly use random sampling of safe actions or a fixed fallback controller, therefore disregarding future performance implications of different safe actions. In this work, we propose a predictive safety shield for model-based reinforcement learning agents in discrete space. Our safety shield updates the Q-function locally based on safe predictions, which originate from a safe simulation of the environment model. This shielding approach improves performance while maintaining hard safety guarantees. Our experiments on gridworld environments demonstrate that even short prediction horizons can be sufficient to identify the optimal path. We observe that our approach is robust to distribution shifts, e.g., between simulation and reality, without requiring additional training.
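
The idea in the abstract (simulate only safe actions over a short horizon, update the Q-function locally, then pick the best safe action) can be illustrated with a minimal sketch. This is not the authors' implementation: the 3x4 grid, the reward values, the deterministic model, and every name below (`step`, `safe_actions`, `safe_value`, `shielded_action`) are assumptions chosen for illustration only.

```python
from typing import Dict, List, Tuple

State = Tuple[int, int]
QTable = Dict[Tuple[State, str], float]

# Hypothetical gridworld; layout and rewards are illustrative assumptions.
ACTIONS: Dict[str, Tuple[int, int]] = {
    "up": (-1, 0), "down": (1, 0), "left": (0, -1), "right": (0, 1),
}
ROWS, COLS = 3, 4
UNSAFE = {(1, 1), (1, 2)}
GOAL: State = (0, 3)
GAMMA = 0.9


def step(state: State, action: str) -> Tuple[State, float]:
    """Deterministic environment model: returns (next_state, reward).

    Moving off the grid leaves the agent where it is.
    """
    dr, dc = ACTIONS[action]
    r, c = state[0] + dr, state[1] + dc
    if not (0 <= r < ROWS and 0 <= c < COLS):
        r, c = state
    nxt = (r, c)
    return nxt, (1.0 if nxt == GOAL else -0.01)


def safe_actions(state: State) -> List[str]:
    """Actions whose predicted successor is not an unsafe cell."""
    return [a for a in ACTIONS if step(state, a)[0] not in UNSAFE]


def safe_value(Q: QTable, state: State, horizon: int) -> float:
    """Best predicted return from `state` using safe actions only.

    Looks `horizon` steps ahead in the model and bootstraps from Q at the leaves.
    """
    if state == GOAL:
        return 0.0
    acts = safe_actions(state)
    if not acts:
        return float("-inf")  # dead end: no safe continuation exists
    if horizon <= 0:
        return max(Q.get((state, a), 0.0) for a in acts)
    values = []
    for a in acts:
        nxt, reward = step(state, a)
        values.append(reward + GAMMA * safe_value(Q, nxt, horizon - 1))
    return max(values)


def shielded_action(Q: QTable, state: State, horizon: int) -> str:
    """Update Q locally from safe predictions, then return the best safe action."""
    acts = safe_actions(state)
    if not acts:
        raise RuntimeError("no safe action available in this state")
    for a in acts:
        nxt, reward = step(state, a)
        Q[(state, a)] = reward + GAMMA * safe_value(Q, nxt, horizon - 1)
    return max(acts, key=lambda a: Q[(state, a)])
```

In this sketch, calling `shielded_action({}, (1, 0), 4)` on an untrained Q-table already prefers the short safe route along the top row, which mirrors the abstract's observation that short horizons can suffice. Because the unsafe action is filtered out before the maximisation, an inflated Q-value for it cannot cause it to be selected.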

