LG · AI · CL · Jul 29, 2019

Semantic RL with Action Grammars: Data-Efficient Learning of Hierarchical Task Abstractions

arXiv:1907.12477v2 · 5 citations
AI Analysis

This paper addresses the need for more data-efficient and interpretable hierarchical task abstractions in reinforcement learning, though the contribution appears incremental because it builds on existing grammar induction ideas.

The paper targets two weaknesses of hierarchical reinforcement learning: the need for manual sub-task specification and the lack of interpretability. It proposes a cognitive-inspired architecture that uses grammar induction to identify sub-goal policies. The resulting action grammars unify symbolic and connectionist approaches and facilitate efficient imitation, transfer, and online learning.

Hierarchical Reinforcement Learning algorithms have successfully been applied to temporal credit assignment problems with sparse reward signals. However, state-of-the-art algorithms require manual specification of sub-task structures, depend on a sample-inefficient exploration phase, or lack semantic interpretability. Humans, on the other hand, efficiently detect hierarchical sub-structures induced by their surroundings. It has been argued that this inference process universally applies to language, logical reasoning, and motor control. Therefore, we propose a cognitive-inspired Reinforcement Learning architecture which uses grammar induction to identify sub-goal policies. By treating an on-policy trajectory as a sentence sampled from the policy-conditioned language of the environment, we identify hierarchical constituents with the help of unsupervised grammatical inference. The resulting set of temporal abstractions is called an action grammar (Pastra & Aloimonos, 2012) and unifies symbolic and connectionist approaches to Reinforcement Learning. It can be used to facilitate efficient imitation, transfer, and online learning.
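The idea of reading a trajectory as a sentence and extracting its repeated constituents can be illustrated with a small sketch. The code below is an assumption-laden illustration, not the authors' method: it uses a simple Re-Pair-style digram replacement as a stand-in for the unsupervised grammar induction step, and the names `induce_action_grammar`, `expand`, and the `M0`, `M1`, ... macro symbols are hypothetical. Each induced rule can be read as a candidate temporal abstraction (macro-action) over primitive actions.

```python
from collections import Counter


def induce_action_grammar(trajectory, min_count=2):
    """Compress an action sequence by repeatedly replacing its most
    frequent adjacent pair with a new nonterminal (Re-Pair style).

    Illustrative stand-in for grammar induction; the paper's actual
    algorithm may differ. Returns the compressed sequence and the rules.
    """
    seq = list(trajectory)
    rules = {}  # nonterminal -> (left symbol, right symbol)
    next_id = 0
    while True:
        # Count adjacent pairs (overlapping occurrences are counted too).
        pairs = Counter(zip(seq, seq[1:]))
        if not pairs:
            break
        pair, count = pairs.most_common(1)[0]
        if count < min_count:
            break
        symbol = f"M{next_id}"  # hypothetical macro-action name
        next_id += 1
        rules[symbol] = pair
        # Rewrite the sequence left to right, without overlapping matches.
        rewritten = []
        i = 0
        while i < len(seq):
            if i + 1 < len(seq) and (seq[i], seq[i + 1]) == pair:
                rewritten.append(symbol)
                i += 2
            else:
                rewritten.append(seq[i])
                i += 1
        seq = rewritten
    return seq, rules


def expand(symbol, rules):
    """Unroll a symbol into the primitive actions it stands for."""
    if symbol not in rules:
        return [symbol]
    left, right = rules[symbol]
    return expand(left, rules) + expand(right, rules)
```

For a toy trajectory such as `["up", "right", "up", "right", "up", "right", "down"]`, the sketch first abstracts the repeated `("up", "right")` pair into `M0` and then the repeated `M0 M0` into `M1`, so the hierarchy of rules mirrors the nested structure of the behaviour. Expanding the compressed sequence with `expand` recovers the original trajectory.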

Code Implementations: 1 repo
