LG, ML | Feb 2, 2019

Certified Reinforcement Learning with Logic Guidance

Hosein Hasanbeig, Daniel Kroening, Alessandro Abate

arXiv:1902.00778v4 | 18.068 citations | Has Code

Originality: Incremental advance

AI Analysis

This paper addresses the need for formal safety guarantees when RL is applied to control problems. The advance is incremental, since it builds on existing LTL and automaton-based methods.

The paper tackles the problem of applying reinforcement learning in safety-critical domains. It proposes a model-free RL algorithm that uses Linear Temporal Logic (LTL) to specify goals for unknown continuous-state/action MDPs. Under certain assumptions, the algorithm is guaranteed to synthesise control policies that satisfy the LTL specification with maximal probability.

Reinforcement Learning (RL) is a widely employed machine learning architecture that has been applied to a variety of control problems. However, applications in safety-critical domains require a systematic and formal approach to specifying requirements as tasks or goals. We propose a model-free RL algorithm that enables the use of Linear Temporal Logic (LTL) to formulate a goal for unknown continuous-state/action Markov Decision Processes (MDPs). The given LTL property is translated into a Limit-Deterministic Generalised Büchi Automaton (LDGBA), which is then used to shape a synchronous reward function on-the-fly. Under certain assumptions, the algorithm is guaranteed to synthesise a control policy whose traces satisfy the LTL specification with maximal probability.
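
The idea of shaping a reward from an automaton that runs in sync with the environment can be illustrated with a minimal Python sketch. This is not the authors' implementation. It assumes a hand-written deterministic generalised Büchi automaton for the property "visit a and visit b infinitely often" instead of an LDGBA translated from LTL, it replaces the continuous MDP with a plain sequence of state labels, and the names `ToyAutomaton` and `RewardShaper` are hypothetical. The reward rule (pay once per accepting set, then reset when every set has been visited) is one simple way to encourage repeated visits to all accepting sets and is stated here as an assumption.

```python
from dataclasses import dataclass


@dataclass
class ToyAutomaton:
    """Hand-written generalised Buchi automaton (hypothetical toy example)."""
    initial: int
    transitions: dict     # (automaton state, label) -> next automaton state
    accepting_sets: list  # list of sets of automaton states

    def step(self, q, label):
        # Assumption: unknown (state, label) pairs self-loop in this toy.
        return self.transitions.get((q, label), q)


class RewardShaper:
    """Tracks the automaton state alongside the environment and emits rewards."""

    def __init__(self, automaton, reward=1.0):
        self.aut = automaton
        self.reward = reward
        self.q = automaton.initial
        # Indices of accepting sets not yet visited in the current round.
        self.pending = list(range(len(automaton.accepting_sets)))

    def update(self, label):
        """Advance the automaton on the observed label and return the reward."""
        self.q = self.aut.step(self.q, label)
        hit = [i for i in self.pending if self.q in self.aut.accepting_sets[i]]
        if not hit:
            return 0.0
        self.pending = [i for i in self.pending if i not in hit]
        if not self.pending:
            # Every accepting set was visited: start a new round.
            self.pending = list(range(len(self.aut.accepting_sets)))
        return self.reward


# Automaton state 1 means "just saw a", state 2 means "just saw b", 0 otherwise.
transitions = {
    (q, label): target
    for q in range(3)
    for label, target in [("a", 1), ("b", 2), ("none", 0)]
}
automaton = ToyAutomaton(initial=0, transitions=transitions,
                         accepting_sets=[{1}, {2}])

shaper = RewardShaper(automaton)
trace = ["a", "a", "none", "b", "a"]  # labels of the states an agent visited
rewards = [shaper.update(label) for label in trace]
print(rewards)  # [1.0, 0.0, 0.0, 1.0, 1.0]
```

In an actual RL loop, `update` would be called once per environment step with the label of the new state, and the learner would observe the pair (environment state, automaton state) so that its policy can depend on progress towards the specification. Note that revisiting "a" earns nothing until "b" has also been seen.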
