Lifelong Reinforcement Learning with Temporal Logic Formulas and Reward Machines
This work addresses continual learning in AI agents, offering an incremental, domain-specific improvement for reinforcement learning on tasks given as logical specifications.
The paper tackles lifelong reinforcement learning by reusing previously learned knowledge to accelerate the learning of logically specified tasks. Through task decomposition and knowledge transfer, LSRM outperforms methods that learn each task from scratch.
Continuously learning new tasks using high-level ideas or knowledge is a key capability of humans. In this paper, we propose Lifelong reinforcement learning with Sequential linear temporal logic formulas and Reward Machines (LSRM), which enables an agent to leverage previously learned knowledge to speed up the learning of logically specified tasks. To allow more flexible task specification, we first introduce Sequential Linear Temporal Logic (SLTL), a supplement to the existing Linear Temporal Logic (LTL) formal language. We then use Reward Machines (RM) to exploit structured reward functions for tasks encoded with high-level events, and propose automatic extension of RMs and efficient knowledge transfer across tasks for continual learning over the agent's lifetime. Experimental results show that LSRM outperforms methods that learn the target tasks from scratch, by taking advantage of task decomposition using SLTL and knowledge transfer over RMs during the lifelong learning process.
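To make the notion of a reward machine concrete, here is a minimal Python sketch of a finite-state machine whose transitions fire on high-level events and emit rewards. It is an illustration only, not the LSRM implementation: the class name `RewardMachine`, the event labels `coffee` and `office`, and the reward values are all assumptions. The translation from SLTL formulas, the automatic extension of RMs, and the knowledge transfer described in the abstract are not modeled here.

```python
from dataclasses import dataclass, field
from typing import Dict, Iterable, Set, Tuple


@dataclass
class RewardMachine:
    """Finite-state machine over high-level events (illustrative, hypothetical API)."""
    initial_state: int
    terminal_states: Set[int]
    # Maps (state, event) to (next_state, reward).
    transitions: Dict[Tuple[int, str], Tuple[int, float]] = field(default_factory=dict)

    def step(self, state: int, event: str) -> Tuple[int, float]:
        # An event with no matching transition leaves the state unchanged
        # and yields zero reward.
        return self.transitions.get((state, event), (state, 0.0))

    def run(self, events: Iterable[str]) -> Tuple[int, float]:
        # Feed a sequence of events through the machine, accumulating reward
        # and stopping once a terminal state has been reached.
        state, total = self.initial_state, 0.0
        for event in events:
            if state in self.terminal_states:
                break
            state, reward = self.step(state, event)
            total += reward
        return state, total


# Hypothetical sequential task: observe "coffee" first, then "office".
coffee_then_office = RewardMachine(
    initial_state=0,
    terminal_states={2},
    transitions={
        (0, "coffee"): (1, 0.0),  # first sub-goal reached, no reward yet
        (1, "office"): (2, 1.0),  # second sub-goal completes the task
    },
)

final_state, total_reward = coffee_then_office.run(["office", "coffee", "office"])
print(final_state, total_reward)
```

In this sketch the first `office` event is ignored because the machine is still waiting for `coffee`, which is how the machine state encodes the ordering of sub-goals. Because each machine state corresponds to a remaining sub-task, one plausible reading of the abstract is that knowledge learned for a state can be reused when a later task contains the same sub-task.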