Fuz-RL: A Fuzzy-Guided Robust Framework for Safe Reinforcement Learning under Uncertainty
This work targets safety-critical real-world applications such as robotics and autonomous systems, offering an interpretable and robust method for safe RL under uncertainty. Its contribution appears incremental, however, because it builds on existing safe RL baselines.
The paper tackles safe reinforcement learning under multiple uncertainty sources. It proposes Fuz-RL, a fuzzy-guided robust framework built on a novel fuzzy Bellman operator that uses Choquet integrals, and reports significant improvements in safety and control performance on benchmark scenarios.
Safe Reinforcement Learning (RL) is crucial for achieving high performance while ensuring safety in real-world applications. However, the complex interplay of multiple uncertainty sources in real environments poses significant challenges for interpretable risk assessment and robust decision-making. To address these challenges, we propose Fuz-RL, a fuzzy measure-guided robust framework for safe RL. Specifically, our framework develops a novel fuzzy Bellman operator for estimating robust value functions using Choquet integrals. Theoretically, we prove that solving the Fuz-RL problem (in Constrained Markov Decision Process (CMDP) form) is equivalent to solving distributionally robust safe RL problems (in robust CMDP form), effectively avoiding min-max optimization. Empirical analyses on safe-control-gym and safety-gymnasium scenarios demonstrate that Fuz-RL effectively integrates with existing safe RL baselines in a model-free manner, significantly improving both safety and control performance under various types of uncertainties in observation, action, and dynamics.
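To make the role of the Choquet integral concrete, the following is a minimal sketch of a discrete Choquet integral over a finite set of candidate next-state values. It is not the paper's fuzzy Bellman operator, whose exact form is not given here. The names `choquet_integral`, `additive_measure`, and `pessimistic_measure` are hypothetical, and the squared-proportion measure is just one assumed example of a non-additive fuzzy measure.

```python
from typing import Callable, FrozenSet, Sequence

# A fuzzy measure maps a subset of outcome indices to a weight in [0, 1].
# Assumed here: measure(empty set) = 0, measure(full set) = 1, and monotonicity.
Measure = Callable[[FrozenSet[int]], float]


def choquet_integral(values: Sequence[float], measure: Measure) -> float:
    """Discrete Choquet integral of `values` w.r.t. a normalized fuzzy measure."""
    # Visit outcomes from smallest to largest value.
    order = sorted(range(len(values)), key=lambda i: values[i])
    total, prev = 0.0, 0.0
    for rank, idx in enumerate(order):
        # Outcomes whose value is at least values[idx].
        upper_set = frozenset(order[rank:])
        # Weight each increment in value by the measure of its upper set.
        total += (values[idx] - prev) * measure(upper_set)
        prev = values[idx]
    return total


def additive_measure(n: int) -> Measure:
    """Uniform probability measure; the Choquet integral reduces to the mean."""
    return lambda subset: len(subset) / n


def pessimistic_measure(n: int) -> Measure:
    """Hypothetical convex distortion; it underweights high-value outcomes."""
    return lambda subset: (len(subset) / n) ** 2


# Hypothetical value estimates for three possible next states.
next_values = [1.0, 2.0, 4.0]
expected = choquet_integral(next_values, additive_measure(3))
robust = choquet_integral(next_values, pessimistic_measure(3))
```

With the additive measure the result equals the ordinary mean (7/3), while the distorted measure yields a lower, more conservative estimate (5/3). This illustrates how a non-additive measure can encode pessimism about uncertain outcomes in a single aggregation step, without an explicit inner minimization.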