AI

Dec 8, 2016

Hierarchy through Composition with Linearly Solvable Markov Decision Processes

Andrew M. Saxe, Adam Earle, Benjamin Rosman

arXiv:1612.02757v1

4.51 citations

Originality: Highly original

AI Analysis

This paper addresses the scalability of reinforcement learning, offering a novel alternative to hierarchies that execute actions serially.

The paper tackles the scalability of reinforcement learning by proposing a hierarchical architecture based on concurrent execution of actions. It uses linearly solvable Markov decision processes (LMDPs) to compose macro-actions in parallel, yielding a framework that supports deep hierarchies abstracted in space and time.

Hierarchical architectures are critical to the scalability of reinforcement learning methods. Current hierarchical frameworks execute actions serially, with macro-actions comprising sequences of primitive actions. We propose a novel alternative to these control hierarchies based on concurrent execution of many actions in parallel. Our scheme uses the concurrent compositionality provided by the linearly solvable Markov decision process (LMDP) framework, which naturally enables a learning agent to draw on several macro-actions simultaneously to solve new tasks. We introduce the Multitask LMDP module, which maintains a parallel distributed representation of tasks and may be stacked to form deep hierarchies abstracted in space and time.
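The concurrent compositionality mentioned in the abstract can be illustrated with a small numerical sketch. It assumes the standard first-exit LMDP formulation, in which the interior desirability satisfies z_i = (I - M)^{-1} N z_b and is therefore linear in the boundary desirability z_b. The five-state corridor, the reward values, the temperature `lam`, and the names `solve_lmdp`, `Z_b`, and `w` are illustrative assumptions, not taken from the paper.

```python
import numpy as np

def solve_lmdp(P_ii, P_ib, r_i, z_b, lam=1.0):
    """Interior desirability of a first-exit LMDP: z_i = (I - M)^-1 N z_b.

    P_ii: passive dynamics interior -> interior, shape (n, n)
    P_ib: passive dynamics interior -> boundary, shape (n, m)
    r_i:  interior rewards (negative costs), shape (n,)
    z_b:  boundary desirability, shape (m,) or (m, k) for k tasks
    """
    G = np.diag(np.exp(r_i / lam))
    M = G @ P_ii
    N = G @ P_ib
    return np.linalg.solve(np.eye(len(r_i)) - M, N @ z_b)

# Toy corridor (assumed): n interior states, one exit at each end.
n = 5
P_ii = np.zeros((n, n))
P_ib = np.zeros((n, 2))
for s in range(n):
    if s > 0:
        P_ii[s, s - 1] = 0.5   # random-walk step left
    else:
        P_ib[s, 0] = 0.5       # leave through the left exit
    if s < n - 1:
        P_ii[s, s + 1] = 0.5   # random-walk step right
    else:
        P_ib[s, 1] = 0.5       # leave through the right exit
r_i = np.full(n, -0.1)         # small per-step cost

# Task basis: one column per task, one row per boundary state.
# Task 0 prefers the left exit, task 1 prefers the right exit.
Z_b = np.array([[1.0, 0.05],
                [0.05, 1.0]])
Z_i = solve_lmdp(P_ii, P_ib, r_i, Z_b)   # solves both tasks in one call

# A new task whose boundary desirability is a blend of the basis tasks.
w = np.array([0.3, 0.7])
z_direct = solve_lmdp(P_ii, P_ib, r_i, Z_b @ w)  # solve from scratch
z_composed = Z_i @ w                             # reuse the solved basis

print(np.allclose(z_direct, z_composed))
```

Because the solution is linear in the boundary desirability, the blended task needs no new solve: the same weights applied to the stored basis solutions give the same result as solving the blended task directly.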
