LG AI Oct 15, 2024

Unveiling Options with Neural Decomposition

arXiv:2410.11262v1, h-index: 1
Originality: Incremental advance
AI Analysis

This paper addresses the generalization challenge for reinforcement learning agents, but the advance is incremental because it builds on existing methods for policy decomposition.

The paper tackles the lack of generalization to related tasks in reinforcement learning agents. It decomposes neural network policies into reusable sub-policies that serve as options, and empirical results in grid-world domains show that these options accelerate learning on similar tasks.

In reinforcement learning, agents often learn policies for specific tasks without the ability to generalize this knowledge to related tasks. This paper introduces an algorithm that attempts to address this limitation by decomposing neural networks encoding policies for Markov Decision Processes into reusable sub-policies, which are used to synthesize temporally extended actions, or options. We consider neural networks with piecewise linear activation functions, so that they can be mapped to an equivalent tree that is similar to oblique decision trees. Since each node in such a tree serves as a function of the input of the tree, each sub-tree is a sub-policy of the main policy. We turn each of these sub-policies into options by wrapping it with while-loops of varied number of iterations. Given the large number of options, we propose a selection mechanism based on minimizing the Levin loss for a uniform policy on these options. Empirical results in two grid-world domains where exploration can be difficult confirm that our method can identify useful options, thereby accelerating the learning process on similar but different tasks.
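The decomposition described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the tiny network, its weights, the `step` function, and all names below are hypothetical, and the Levin-loss selection step is left out. The sketch assumes one ReLU hidden layer with two units, reads the policy as a tree whose nodes test the sign of each pre-activation, treats each sub-tree as a sub-policy by forcing the branches above it, and wraps each sub-policy in a while-loop to form an option.

```python
from itertools import product

# Hypothetical policy network: 2 inputs, 2 ReLU hidden units, 2 action logits.
W1 = [[1.0, -1.0], [0.5, 1.0]]   # one row of weights per hidden unit
B1 = [0.0, -1.0]
W2 = [[1.0, -2.0], [-1.0, 1.0]]  # one row of weights per action
B2 = [0.0, 0.5]


def preactivations(x):
    """Oblique tests of the tree: one linear function of the input per hidden unit."""
    return [sum(w * xi for w, xi in zip(row, x)) + b for row, b in zip(W1, B1)]


def argmax(values):
    return max(range(len(values)), key=values.__getitem__)


def network_policy(x):
    """Ordinary forward pass with ReLU, used as the reference policy."""
    h = [max(0.0, z) for z in preactivations(x)]
    return argmax([sum(w * hi for w, hi in zip(row, h)) + b for row, b in zip(W2, B2)])


def sub_policy(forced=()):
    """Sub-tree reached by taking the branches in `forced` for the first hidden units.

    Units covered by `forced` are treated as active (True) or inactive (False)
    regardless of the input; the remaining units still apply their own test z > 0.
    With forced=() this is the whole tree, which matches the network.
    """
    def act(x):
        z = preactivations(x)
        pattern = list(forced) + [zi > 0 for zi in z[len(forced):]]
        h = [zi if on else 0.0 for zi, on in zip(z, pattern)]
        return argmax([sum(w * hi for w, hi in zip(row, h)) + b for row, b in zip(W2, B2)])
    return act


def make_option(policy, repeats):
    """Wrap a sub-policy in a while-loop that runs for a fixed number of steps."""
    def run(state, step):
        n = 0
        while n < repeats:
            state = step(state, policy(state))
            n += 1
        return state
    return run


def step(state, action):
    """Hypothetical environment: action 0 moves along x, action 1 along y."""
    x, y = state
    return (x + 1.0, y) if action == 0 else (x, y + 1.0)


# Every sub-tree (root, two inner nodes, four leaves) becomes a sub-policy,
# and every sub-policy yields one option per loop length.
sub_policies = [sub_policy(f) for depth in range(3)
                for f in product([True, False], repeat=depth)]
options = [make_option(p, k) for p in sub_policies for k in (1, 2, 3)]
```

Even this two-unit network yields 7 sub-policies and 21 candidate options, which hints at why the paper needs a selection mechanism once the network grows.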

