LG | AI | Dec 21, 2024

Subgoal Discovery Using a Free Energy Paradigm and State Aggregations

Amirhossein Mesbah, Reshad Hosseini, Seyed Pooya Shariatpanahi, Majid Nili Ahmadabadi

arXiv:2412.16687v24.61 citations | h-index: 27

Originality: Incremental advance

AI Analysis

The paper addresses sample inefficiency and reward-shaping difficulties in RL for navigation tasks, but the advance is incremental because it builds on existing hierarchical and goal-conditioned methods.

The paper tackles subgoal discovery in reinforcement learning by proposing a free energy paradigm that uses state unpredictability to identify subgoals, and it reports robust subgoal discovery in navigation tasks without prior knowledge of the task.

Reinforcement learning (RL) plays a major role in solving complex sequential decision-making tasks. Hierarchical and goal-conditioned RL are promising methods for dealing with two major problems in RL, namely sample inefficiency and difficulties in reward shaping. These methods address both problems by decomposing a task into simpler subtasks and by temporally abstracting it in the action space. A key component of task decomposition in these methods is subgoal discovery. Subgoal states can be used to define hierarchies of actions and to decompose complex tasks. Under the assumption that subgoal states are more unpredictable, we propose a free energy paradigm to discover them. This is achieved by using free energy to select between two spaces, the main space and an aggregation space. The model changes from neighboring states to a given state indicate how unpredictable that state is, and are therefore used in this paper for subgoal discovery. Our empirical results on navigation tasks such as grid-world environments show that the proposed method can discover subgoals without prior knowledge of the task and is robust to the stochasticity of environments.
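The selection between the main space and an aggregation space can be illustrated with a toy sketch. This is not the authors' algorithm: the abstract does not give the free energy formula, so the form used here (average negative log-likelihood plus a fixed complexity cost for keeping a state-specific model) is an assumption, as are the function names, the `complexity_cost` value, and the made-up transition data. A state whose own model wins while its neighbors are well described by the pooled model is treated as a subgoal candidate.

```python
import math
from collections import Counter

def empirical(outcomes):
    """Empirical distribution over observed transition outcomes."""
    counts = Counter(outcomes)
    return {o: c / len(outcomes) for o, c in counts.items()}

def avg_nll(outcomes, probs, eps=1e-9):
    """Average negative log-likelihood of outcomes under probs.
    Outcomes the model never saw get probability eps."""
    return -sum(math.log(probs.get(o, eps)) for o in outcomes) / len(outcomes)

def select_space(state_outcomes, neighbor_outcomes, complexity_cost=0.1):
    """Pick the space with the lower toy free energy for one state.
    ASSUMPTION: free energy = fit term + fixed cost for a separate model."""
    # Main space: the state keeps its own transition model and pays for it.
    f_main = avg_nll(state_outcomes, empirical(state_outcomes)) + complexity_cost
    # Aggregation space: the state reuses the model pooled over its neighbors.
    f_agg = avg_nll(state_outcomes, empirical(neighbor_outcomes))
    return "main" if f_main < f_agg else "aggregation"

# Hypothetical outcomes of the action "move east" in a grid world.
neighbors = ["E"] * 18 + ["stay"] * 2              # pooled over corridor cells
corridor = ["E"] * 9 + ["stay"]                    # behaves like its neighbors
doorway = ["E"] * 4 + ["N"] * 3 + ["S"] * 3        # behaves differently

choices = {
    "corridor": select_space(corridor, neighbors),
    "doorway": select_space(doorway, neighbors),
}
# States where the selected model changes relative to the neighborhood.
subgoal_candidates = [s for s, c in choices.items() if c == "main"]
print(subgoal_candidates)
```

In this sketch the corridor cell stays in the aggregation space because the pooled model explains it as well as its own model does, while the doorway cell switches to the main space because the pooled model assigns almost no probability to its observed transitions.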
