Safe Reinforcement Learning via Confidence-Based Filters
This work addresses safety challenges in deploying RL to real-world systems. It is an incremental improvement that integrates control-theoretic methods with standard RL techniques.
The paper tackles the problem of ensuring safety in reinforcement learning (RL) for real-world systems. It develops confidence-based safety filters that certify state safety constraints for nominal policies, provides formal safety guarantees, and demonstrates the approach's effectiveness empirically.
Ensuring safety is a crucial challenge when deploying reinforcement learning (RL) to real-world systems. We develop confidence-based safety filters, a control-theoretic approach for certifying state safety constraints for nominal policies learned via standard RL techniques, based on probabilistic dynamics models. Our approach is based on a reformulation of state constraints in terms of cost functions, reducing safety verification to a standard RL task. By exploiting the concept of hallucinating inputs, we extend this formulation to determine a "backup" policy that is safe for the unknown system with high probability. Finally, the nominal policy is minimally adjusted at every time step during a roll-out towards the backup policy, such that safe recovery can be guaranteed afterwards. We provide formal safety guarantees, and empirically demonstrate the effectiveness of our approach.
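The filtering step described above can be illustrated with a minimal sketch on a toy one-dimensional system. Everything in it is an assumption made for illustration and is not taken from the paper: the model `model` (mean and standard deviation of the next state), the constants `X_MAX`, `U_MAX`, and `BETA`, the linear `backup_policy`, and the grid search over an interpolation weight. The state constraint is written as a cost that is 1 when violated and 0 otherwise, the hallucinated input `eta` in [-1, 1] selects a next state inside the model's confidence interval, and the filter returns the action closest to the nominal one (along the line towards the backup action) from which the backup policy has zero worst-case cost. Propagating only interval endpoints is adequate for this monotone toy system but is not a general verification method.

```python
import numpy as np

X_MAX = 1.0   # hypothetical state constraint: |x| <= X_MAX
U_MAX = 1.0   # hypothetical input limit
BETA = 1.0    # hypothetical confidence-interval scaling


def model(x, u):
    """Hypothetical probabilistic model: mean and std of the next state."""
    return x + 0.5 * u, 0.05 * abs(u)


def cost(x):
    """State constraint expressed as a cost: 1 if violated, 0 otherwise."""
    return float(abs(x) > X_MAX)


def backup_policy(x):
    """Hypothetical backup policy that steers the state towards the origin."""
    return float(np.clip(-x, -U_MAX, U_MAX))


def step_set(lo, hi):
    """Propagate a state interval one step under the backup policy,
    taking the extreme hallucinated inputs eta = -1 and eta = +1."""
    nxt = []
    for x in (lo, hi):
        mean, std = model(x, backup_policy(x))
        for eta in (-1.0, 1.0):
            nxt.append(mean + BETA * std * eta)
    return min(nxt), max(nxt)


def backup_is_safe(lo, hi, horizon=20):
    """True if the worst-case cumulative cost of the backup policy,
    started anywhere in [lo, hi], is zero over the horizon."""
    total = cost(lo) + cost(hi)
    for _ in range(horizon):
        lo, hi = step_set(lo, hi)
        total += cost(lo) + cost(hi)
    return total == 0.0


def safety_filter(x, u_nom, n_grid=21):
    """Return the action closest to u_nom, on the segment towards the
    backup action, whose successor interval is certified backup-safe."""
    u_bak = backup_policy(x)
    for lam in np.linspace(0.0, 1.0, n_grid):
        u = (1.0 - lam) * u_nom + lam * u_bak
        mean, std = model(x, u)
        if backup_is_safe(mean - BETA * std, mean + BETA * std):
            return float(u)
    return u_bak  # fall back to the backup action itself


if __name__ == "__main__":
    print(safety_filter(0.0, 0.5))  # far from the constraint
    print(safety_filter(0.9, 1.0))  # close to the constraint
```

In this sketch the nominal action is left unchanged when the state is far from the constraint and is pulled towards the backup action only as much as the grid resolution requires near the boundary. A finer grid or a continuous optimizer would give a smaller adjustment at higher computational cost.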