Funnel-based Reward Shaping for Signal Temporal Logic Tasks in Reinforcement Learning
This work addresses a specific problem in reinforcement learning for formal verification: it offers a tractable solution for robust STL satisfaction in continuous domains, and its contribution is incremental relative to prior methods.
The paper tackled the challenge of ensuring robust satisfaction of Signal Temporal Logic (STL) specifications in continuous state spaces while maintaining tractability. It proposed a funnel-based reinforcement learning algorithm and demonstrated its utility on several STL tasks across different environments.
Signal Temporal Logic (STL) is a powerful framework for describing the complex temporal and logical behaviour of dynamical systems. Numerous studies have attempted to employ reinforcement learning to learn a controller that enforces STL specifications; however, they have been unable to effectively tackle the challenges of ensuring robust satisfaction in continuous state spaces and maintaining tractability. In this paper, leveraging the concept of funnel functions, we propose a tractable reinforcement learning algorithm to learn a time-dependent policy for robust satisfaction of STL specifications in continuous state spaces. We demonstrate the utility of our approach on several STL tasks using different environments.
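To make the funnel idea concrete, the following is a minimal sketch of how a funnel function could shape a reward. The abstract does not specify the funnel's form or the reward, so everything here is an assumption for illustration: the exponentially shrinking funnel (a common choice in funnel-based control), the predicate value `h_value`, and the names `funnel`, `funnel_reward`, `rho_max`, `gamma0`, `gamma_inf`, and `decay` are all hypothetical and not taken from the paper.

```python
import math


def funnel(t, gamma0, gamma_inf, decay):
    """Assumed funnel width: shrinks exponentially from gamma0 (at t = 0)
    towards gamma_inf as t grows."""
    return (gamma0 - gamma_inf) * math.exp(-decay * t) + gamma_inf


def funnel_reward(h_value, t, rho_max, gamma0, gamma_inf, decay):
    """Hypothetical shaped reward: the margin by which the predicate value
    h(x_t) clears the time-varying lower bound rho_max - funnel(t).

    Positive means the state is above the lower funnel bound at time t;
    negative means it has fallen below it.
    """
    lower_bound = rho_max - funnel(t, gamma0, gamma_inf, decay)
    return h_value - lower_bound


if __name__ == "__main__":
    # Illustrative parameters only. With gamma_inf < rho_max, the lower
    # bound eventually becomes positive, so staying above it implies h > 0.
    params = dict(rho_max=1.0, gamma0=5.0, gamma_inf=0.5, decay=1.0)
    for t in (0.0, 1.0, 5.0, 10.0):
        print(t, funnel_reward(0.8, t, **params))
```

Because the bound depends on `t`, a policy trained on such a reward would need time as part of its input, which is consistent with the time-dependent policy mentioned in the abstract. How the paper actually combines funnels for nested temporal operators is not stated here and is left open.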