RO LG | Dec 13, 2024

Reward Machine Inference for Robotic Manipulation

Mattijs Baert, Sam Leroux, Pieter Simoens

arXiv:2412.10096v17.14 citations | h-index: 17

Originality: Incremental advance

AI Analysis

This work addresses the challenge of automating reward specification in reinforcement learning for robotic manipulation. The method reduces manual reward engineering, but the advance is incremental because it builds on existing RM and LfD techniques.

The paper tackles the problem of learning Reward Machines (RMs) for robotic manipulation tasks. It introduces a novel Learning from Demonstrations (LfD) approach that infers the RM structure and the high-level events directly from visual demonstrations, without predefined propositions or prior knowledge of the sparse rewards. The authors report that the inferred RM accurately captures the task structure and lets an RL agent learn an effective policy.

Learning from Demonstrations (LfD) and Reinforcement Learning (RL) have enabled robot agents to accomplish complex tasks. Reward Machines (RMs) enhance RL's capability to train policies over extended time horizons by structuring high-level task information. In this work, we introduce a novel LfD approach for learning RMs directly from visual demonstrations of robotic manipulation tasks. Unlike previous methods, our approach requires no predefined propositions or prior knowledge of the underlying sparse reward signals. Instead, it jointly learns the RM structure and identifies key high-level events that drive transitions between RM states. We validate our method on vision-based manipulation tasks, showing that the inferred RM accurately captures task structure and enables an RL agent to effectively learn an optimal policy.
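To make the abstract's terminology concrete, the following is a minimal sketch of what a reward machine is as a data structure: a finite-state machine whose transitions fire on high-level events and emit rewards. It is not the authors' inference method; the class name `RewardMachine`, the event names `block_grasped` and `block_placed`, and the reward values are all hypothetical and chosen only for illustration. In the paper's setting, both the transition structure and the events would be inferred from visual demonstrations rather than written by hand as they are here.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Set, Tuple


@dataclass
class RewardMachine:
    """Finite-state machine whose transitions fire on high-level events."""

    initial_state: int
    terminal_states: Set[int]
    # Maps (state, event) -> (next_state, reward).
    transitions: Dict[Tuple[int, str], Tuple[int, float]] = field(default_factory=dict)

    def step(self, state: int, event: str) -> Tuple[int, float]:
        # An event with no outgoing edge leaves the state unchanged, reward 0.
        return self.transitions.get((state, event), (state, 0.0))

    def run(self, events: List[str]) -> Tuple[int, float]:
        # Feed a sequence of events through the machine and sum the rewards.
        state, total = self.initial_state, 0.0
        for event in events:
            if state in self.terminal_states:
                break
            state, reward = self.step(state, event)
            total += reward
        return state, total


# Hypothetical two-step task: grasp a block, then place it (assumed events).
rm = RewardMachine(
    initial_state=0,
    terminal_states={2},
    transitions={
        (0, "block_grasped"): (1, 0.0),
        (1, "block_placed"): (2, 1.0),
    },
)

# "block_placed" is ignored in state 0 because the block is not yet grasped.
final_state, total_reward = rm.run(["block_placed", "block_grasped", "block_placed"])
```

The RM state acts as a compact memory of task progress, which is why an RL agent conditioned on it can handle long-horizon tasks whose reward depends on the order of events.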
