LG | AI | Mar 19, 2021

Learning Task Decomposition with Ordered Memory Policy Network

arXiv:2103.10972v1 | 17 citations
Originality: Incremental advance
AI Analysis

This paper addresses task decomposition in AI, which enables better learning and generalization in hierarchical tasks. The contribution appears incremental, since it builds on existing methods for hierarchical learning.

The paper tackles the problem of discovering subtask hierarchies from demonstrations of complex tasks. It proposes the Ordered Memory Policy Network (OMPN), which achieves higher task decomposition performance than baselines in experiments on the Craft and Dial domains.

Many complex real-world tasks are composed of several levels of sub-tasks. Humans leverage these hierarchical structures to accelerate the learning process and achieve better generalization. In this work, we study the inductive bias and propose Ordered Memory Policy Network (OMPN) to discover subtask hierarchy by learning from demonstration. The discovered subtask hierarchy could be used to perform task decomposition, recovering the subtask boundaries in an unstructured demonstration. Experiments on Craft and Dial demonstrate that our model can achieve higher task decomposition performance under both unsupervised and weakly supervised settings, compared with strong baselines. OMPN can also be directly applied to partially observable environments and still achieve higher task decomposition performance. Our visualization further confirms that the subtask hierarchy can emerge in our model.
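
To make "recovering the subtask boundaries" concrete, here is a minimal sketch of how a predicted segmentation of a demonstration might be scored against a reference one. It is an illustration only and not the paper's evaluation code: the function names, the tolerance window, the greedy matching rule, and the toy label sequence are all assumptions introduced here.

```python
from typing import List, Sequence


def subtask_boundaries(labels: Sequence[int]) -> List[int]:
    """Return the time steps at which the per-step subtask label changes."""
    return [t for t in range(1, len(labels)) if labels[t] != labels[t - 1]]


def boundary_f1(predicted: Sequence[int], truth: Sequence[int], tolerance: int = 1) -> float:
    """F1 between predicted and reference boundaries.

    A predicted boundary counts as a hit if it lies within `tolerance` steps
    of a reference boundary that has not been matched yet (greedy matching,
    an assumption of this sketch).
    """
    if not predicted or not truth:
        return 0.0
    unmatched = list(truth)
    hits = 0
    for p in predicted:
        match = next((g for g in unmatched if abs(g - p) <= tolerance), None)
        if match is not None:
            unmatched.remove(match)  # each reference boundary is used at most once
            hits += 1
    if hits == 0:
        return 0.0
    precision = hits / len(predicted)
    recall = hits / len(truth)
    return 2 * precision * recall / (precision + recall)


# Hypothetical demonstration: one subtask id per time step.
true_labels = [0, 0, 0, 1, 1, 2, 2, 2]
true_bounds = subtask_boundaries(true_labels)  # label changes at steps 3 and 5
pred_bounds = [3, 6]                           # hypothetical model output

score_loose = boundary_f1(pred_bounds, true_bounds, tolerance=1)
score_strict = boundary_f1(pred_bounds, true_bounds, tolerance=0)
```

With a tolerance of one step both predicted boundaries count as hits. With zero tolerance only the first does, so the score drops. The tolerance therefore controls how strictly a decomposition is judged.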

