LG AI FL SY
Apr 25, 2023

Fulfilling Formal Specifications ASAP by Model-free Reinforcement Learning

Mengyu Liu, Pengyuan Lu, Xin Chen, Fanxin Kong, Oleg Sokolsky, Insup Lee

arXiv:2304.12508v12.03 citations
h-index: 68

Originality: Incremental advance

AI Analysis

This work addresses the challenge of efficient specification fulfillment in reinforcement learning, though it appears incremental because it builds on existing actor-critic methods.

The paper studies how to encourage agents to fulfill formal specifications as soon as possible within a model-free reinforcement learning framework. It reports finding sufficiently fast trajectories for up to 97% of test cases and outperforming baselines.

We propose a model-free reinforcement learning solution, the ASAP-Phi framework, that encourages an agent to fulfill a formal specification as soon as possible (ASAP). The framework uses a piecewise reward function that assigns a quantitative semantic reward to traces that do not satisfy the specification and a high constant reward to the remaining traces. It then trains an agent with an actor-critic-based algorithm, such as soft actor-critic (SAC) or deep deterministic policy gradient (DDPG). Moreover, we prove that ASAP-Phi produces policies that prioritize fulfilling a specification ASAP. We run extensive experiments, including ablation studies, on state-of-the-art benchmarks. Results show that our framework finds sufficiently fast trajectories for up to 97% of test cases and outperforms the baselines.
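
The piecewise reward described in the abstract can be illustrated with a minimal sketch. It is not the paper's implementation. It assumes a one-dimensional "eventually reach the goal" specification, uses the standard robustness of an "eventually" formula (the maximum over time of the predicate's margin) as the quantitative semantic reward, and picks an arbitrary constant; the names `reach_robustness`, `piecewise_reward`, and `HIGH_REWARD` are hypothetical.

```python
from typing import Sequence

# Hypothetical constant; the abstract only says "a high constant reward".
HIGH_REWARD = 100.0


def reach_robustness(trace: Sequence[float], goal: float, eps: float) -> float:
    """Robustness of 'eventually |x - goal| < eps' over a 1-D trace.

    Assumed specification for illustration: the value is positive when
    some state in the trace lies within eps of the goal, and negative
    otherwise, with magnitude equal to the closest miss.
    """
    return max(eps - abs(x - goal) for x in trace)


def piecewise_reward(
    trace: Sequence[float],
    goal: float,
    eps: float,
    high_reward: float = HIGH_REWARD,
) -> float:
    """Return the robustness while the specification is unsatisfied,
    and a high constant once the trace satisfies it."""
    rho = reach_robustness(trace, goal, eps)
    if rho > 0:
        return high_reward  # specification satisfied
    return rho  # not satisfied: quantitative semantic reward


# A trace that has not reached the goal gets its (negative) robustness;
# one that has reached it gets the constant.
unsatisfied = piecewise_reward([0.0, 0.5], goal=1.0, eps=0.25)
satisfied = piecewise_reward([0.0, 0.5, 1.0], goal=1.0, eps=0.25)
```

In an actor-critic setup such as SAC or DDPG, a function like this would stand in for the environment's reward signal; how the reward is assigned per step and how the constant is chosen are details the abstract does not specify.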
