Robust Satisfaction of Temporal Logic Specifications via Reinforcement Learning
This work addresses robust task completion for autonomous systems under uncertainty. The contribution is incremental: it builds on existing reinforcement learning and temporal logic methods.
The paper tackles the problem of steering systems with unknown stochastic dynamics to satisfy complex temporal logic specifications. It presents reinforcement learning algorithms for two objectives: maximizing the probability of satisfaction and maximizing the expected robustness. In simulation, robustness maximization outperforms probability maximization on both metrics.
We consider the problem of steering a system with unknown, stochastic dynamics to satisfy a rich, temporally layered task given as a signal temporal logic formula. We represent the system as a Markov decision process in which the states are built from a partition of the state space and the transition probabilities are unknown. We present provably convergent reinforcement learning algorithms to maximize the probability of satisfying a given formula and to maximize the average expected robustness, i.e., a measure of how strongly the formula is satisfied. We demonstrate via a pair of robot navigation simulation case studies that reinforcement learning with robustness maximization performs better than probability maximization in terms of both probability of satisfaction and expected robustness.
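To make the two objectives concrete, the following is a minimal sketch of how they might be estimated from sampled trajectories. It assumes the standard quantitative semantics for signal temporal logic (maximum over time for "eventually", minimum over time for "always") applied to a scalar signal, and it treats strictly positive robustness as satisfaction. The function names, the threshold, and the trajectory data are hypothetical and chosen only for illustration; the paper's exact objective and formula fragment may differ.

```python
from typing import Callable, List, Sequence, Tuple


def rob_eventually(signal: Sequence[float], threshold: float) -> float:
    """Robustness of 'eventually (x > threshold)': best margin over time."""
    return max(x - threshold for x in signal)


def rob_always(signal: Sequence[float], threshold: float) -> float:
    """Robustness of 'always (x > threshold)': worst margin over time."""
    return min(x - threshold for x in signal)


def empirical_objectives(
    trajectories: List[Sequence[float]],
    robustness_fn: Callable[[Sequence[float]], float],
) -> Tuple[float, float]:
    """Return (fraction of trajectories satisfying the formula, mean robustness).

    Assumption: a trajectory counts as satisfying when its robustness is
    strictly positive; zero is treated as not satisfied.
    """
    values = [robustness_fn(traj) for traj in trajectories]
    prob_sat = sum(v > 0 for v in values) / len(values)
    mean_rob = sum(values) / len(values)
    return prob_sat, mean_rob


# Hypothetical sampled trajectories of a scalar signal x.
trajectories = [
    [0.0, 1.0, 3.0],
    [0.0, 0.5, 1.0],
    [2.0, 2.5, 4.0],
    [1.0, 1.5, 1.0],
]

# Formula (assumed for illustration): eventually (x > 2.0).
prob_sat, mean_rob = empirical_objectives(
    trajectories, lambda traj: rob_eventually(traj, 2.0)
)
print(prob_sat, mean_rob)
```

The two numbers can rank policies differently: a policy that barely satisfies the formula on most runs scores well on probability but poorly on mean robustness, which is one way to read the distinction the abstract draws between the two learning objectives.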