Shields to Guarantee Probabilistic Safety in MDPs
For developers of autonomous systems requiring probabilistic safety guarantees, this work provides a formal framework and practical shield constructions, though it is an incremental extension of existing shielding methods.
This paper extends classical shielding to probabilistic safety in MDPs, showing that strong guarantees on safety and permissiveness cannot be preserved, and introduces new shield constructions with strong safety guarantees that are computationally feasible.
Shielding is a prominent model-based technique to ensure safety of autonomous agents. Classical shielding aims to ensure that nothing bad ever happens and comes with strong guarantees about safety and maximal permissiveness. However, shielding systems for probabilistic safety, where something bad is allowed to happen with an acceptable probability, has proven to be more intricate. This paper presents a formal framework that conservatively extends classical shields to probabilistic safety. In this framework, we (i) demonstrate the impossibility of preserving the strong guarantees on safety and permissiveness, (ii) provide natural shields with weaker guarantees, and (iii) introduce offline and online shield constructions ensuring strong safety guarantees. The empirical evaluation highlights the practical advantages of the new shields, as well as their computational feasibility.
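To make the notion of a shield for probabilistic safety concrete, the following is a minimal sketch of a generic threshold-based shield on a toy MDP. It is not the paper's offline or online construction. The toy MDP, the names `min_risk`, `action_risk`, and `threshold_shield`, and the threshold value are all assumptions introduced here for illustration. The sketch computes, by value iteration, the minimal probability of ever reaching an unsafe state, and then permits in each state only those actions whose resulting risk stays within the threshold.

```python
from typing import Dict, List, Set, Tuple

# Hypothetical toy MDP (an assumption for illustration, not from the paper):
# transitions[state][action] = list of (probability, next_state).
# States with no actions are absorbing.
Transitions = Dict[str, Dict[str, List[Tuple[float, str]]]]

TOY_MDP: Transitions = {
    "start": {
        "shortcut": [(0.9, "safe"), (0.1, "crash")],
        "detour": [(1.0, "risky")],
    },
    "risky": {
        "careful": [(0.98, "safe"), (0.02, "crash")],
        "fast": [(0.7, "safe"), (0.3, "crash")],
    },
    "safe": {},
    "crash": {},
}
UNSAFE: Set[str] = {"crash"}


def min_risk(transitions: Transitions, unsafe: Set[str],
             max_iterations: int = 10_000, tol: float = 1e-12) -> Dict[str, float]:
    """Minimal probability of ever reaching an unsafe state, per state.

    Value iteration started from 0 on all non-unsafe states.
    """
    risk = {s: (1.0 if s in unsafe else 0.0) for s in transitions}
    for _ in range(max_iterations):
        change = 0.0
        for state, actions in transitions.items():
            if state in unsafe or not actions:
                continue  # unsafe and absorbing states keep their value
            best = min(sum(p * risk[t] for p, t in successors)
                       for successors in actions.values())
            change = max(change, abs(best - risk[state]))
            risk[state] = best
        if change < tol:
            break
    return risk


def action_risk(transitions: Transitions, risk: Dict[str, float],
                state: str, action: str) -> float:
    """Risk of taking `action` in `state`, then acting as safely as possible."""
    return sum(p * risk[t] for p, t in transitions[state][action])


def threshold_shield(transitions: Transitions, unsafe: Set[str],
                     threshold: float) -> Dict[str, List[str]]:
    """Per state, the actions whose risk does not exceed `threshold`."""
    risk = min_risk(transitions, unsafe)
    return {
        state: [a for a in actions
                if action_risk(transitions, risk, state, a) <= threshold + 1e-9]
        for state, actions in transitions.items()
    }


if __name__ == "__main__":
    for state, allowed in threshold_shield(TOY_MDP, UNSAFE, threshold=0.05).items():
        print(state, allowed)
```

Note that this per-state check assumes the agent behaves as safely as possible after each step. A shielded agent is free to pick any permitted action, so risk can accumulate along a run, which gives one intuition for why natural shields of this kind may come with weaker guarantees than their classical counterparts.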