LG AI · Mar 3

Reinforcement Learning with Symbolic Reward Machines

arXiv:2603.03068v1 · 1.4 · h-index: 21

Originality: Highly original

AI Analysis

This work addresses the limited applicability of Reward Machines in widely adopted RL frameworks, a problem for researchers and practitioners working with Reinforcement Learning, and provides an incremental solution.

The authors address the manual user input that Reward Machines (RMs) require in Reinforcement Learning: a labeling function written by hand for each environment and task. Their proposed Symbolic Reward Machines (SRMs) outperform the baseline RL approaches and generate the same results as existing RM methods.

Reward Machines (RMs) are an established mechanism in Reinforcement Learning (RL) to represent and learn sparse, temporally extended tasks with non-Markovian rewards. RMs rely on high-level information in the form of labels that are emitted by the environment alongside the observation. However, this concept requires manual user input for each environment and task. The user has to create a suitable labeling function that computes the labels. These limitations lead to poor applicability in widely adopted RL frameworks. We propose Symbolic Reward Machines (SRMs) together with the learning algorithms QSRM and LSRM to overcome the limitations of RMs. SRMs consume only the standard output of the environment and process the observation directly through guards that are represented by symbolic formulas. In our evaluation, our SRM methods outperform the baseline RL approaches and generate the same results as the existing RM methods. At the same time, our methods adhere to the widely used environment definition and provide interpretable representations of the task to the user.
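To make the idea of guards over raw observations concrete, here is a minimal Python sketch. It is not the authors' implementation and does not reproduce QSRM or LSRM. The class name SymbolicRewardMachine, its step and reset methods, the helper build_example_srm, and the toy one-dimensional task are all hypothetical, chosen only for illustration. Plain Python predicates stand in for the symbolic formulas described in the abstract.

```python
from dataclasses import dataclass
from typing import Callable, Dict, FrozenSet, List, Sequence, Tuple

Observation = Sequence[float]
Guard = Callable[[Observation], bool]  # stand-in for a symbolic formula (assumption)


@dataclass
class SymbolicRewardMachine:
    """Hypothetical finite-state machine whose transitions are guarded by
    predicates evaluated directly on the environment observation."""

    initial_state: int
    terminal_states: FrozenSet[int]
    # state -> ordered list of (guard, next_state, reward)
    transitions: Dict[int, List[Tuple[Guard, int, float]]]

    def __post_init__(self) -> None:
        self.state = self.initial_state

    def reset(self) -> int:
        self.state = self.initial_state
        return self.state

    def step(self, obs: Observation) -> Tuple[int, float, bool]:
        """Advance on one observation; return (machine state, reward, done)."""
        for guard, next_state, reward in self.transitions.get(self.state, []):
            if guard(obs):  # first satisfied guard wins
                self.state = next_state
                return self.state, reward, self.state in self.terminal_states
        # No guard satisfied: stay in the current state with zero reward.
        return self.state, 0.0, self.state in self.terminal_states


def build_example_srm() -> SymbolicRewardMachine:
    """Toy task (assumption): first reach x >= 5, then return to x <= 0."""
    return SymbolicRewardMachine(
        initial_state=0,
        terminal_states=frozenset({2}),
        transitions={
            0: [(lambda obs: obs[0] >= 5.0, 1, 0.0)],
            1: [(lambda obs: obs[0] <= 0.0, 2, 1.0)],
        },
    )


if __name__ == "__main__":
    srm = build_example_srm()
    for obs in ([0.0], [3.0], [5.0], [2.0], [0.0]):
        print(obs, srm.step(obs))
```

In this sketch the same observation, x = 0, yields zero reward at the start of the episode and a reward of 1 after x >= 5 has been visited, which illustrates the non-Markovian reward the abstract refers to. No separate labeling function is involved: the guards read the observation itself.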
