AI · LG · RO · Jun 18, 2019

Learning to Plan Hierarchically from Curriculum

arXiv:1906.07371v1 · 8 citations
Originality: Incremental advance
AI Analysis

This paper addresses the challenge of efficient planning in complex, stochastic environments for robotics and AI systems. It is an incremental improvement over existing methods.

The paper tackles hierarchical planning in domains with unknown dynamics by learning transition dynamics and abstract skills from a curriculum. In experiments with up to 2^100 states, it plans better than classic non-hierarchical planners and reinforcement learning methods, and a simulation-to-real transfer experiment on a robotic manipulator demonstrates its applicability to real-world problems.

We present a framework for learning to plan hierarchically in domains with unknown dynamics. We enhance planning performance by exploiting problem structure in several ways: (i) We simplify the search over plans by leveraging knowledge of skill objectives, (ii) Shorter plans are generated by enforcing aggressively hierarchical planning, (iii) We learn transition dynamics with sparse local models for better generalisation. Our framework decomposes transition dynamics into skill effects and success conditions, which allows fast planning by reasoning on effects, while learning conditions from interactions with the world. We propose a simple method for learning new abstract skills, using successful trajectories stemming from completing the goals of a curriculum. Learned skills are then refined to leverage other abstract skills and enhance subsequent planning. We show that both conditions and abstract skills can be learned simultaneously while planning, even in stochastic domains. Our method is validated in experiments of increasing complexity, with up to 2^100 states, showing superior planning to classic non-hierarchical planners or reinforcement learning methods. Applicability to real-world problems is demonstrated in a simulation-to-real transfer experiment on a robotic manipulator.
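The split between skill effects and success conditions described in the abstract can be illustrated with a toy sketch. This is not the paper's implementation: the `Skill` class, the single `watched` variable, the count-based success estimate (a crude stand-in for the sparse local models), the 0.5 threshold, and the door/key domain are all assumptions made for illustration. The planner searches only over known effects and consults the learned conditions to decide when a sub-goal is needed first.

```python
from dataclasses import dataclass, field


@dataclass
class Skill:
    name: str
    effect: dict        # variables the skill sets when it succeeds (assumed known)
    watched: str = ""   # hypothetical: the single variable the condition depends on
    counts: dict = field(default_factory=dict)  # watched value -> (successes, trials)

    def _estimate(self, key):
        succ, total = self.counts.get(key, (0, 0))
        return (succ + 1) / (total + 2)  # Laplace-smoothed success rate

    def success_prob(self, state):
        return self._estimate(state.get(self.watched))

    def record(self, state, succeeded):
        # Update the condition model from one interaction with the world.
        key = state.get(self.watched)
        succ, total = self.counts.get(key, (0, 0))
        self.counts[key] = (succ + int(succeeded), total + 1)

    def best_condition(self):
        # Watched value with the highest estimated success rate so far.
        if not self.counts:
            return None
        return max(self.counts, key=self._estimate)


def plan(goal, state, skills, depth=3):
    """Return a list of skill names reaching goal=(variable, value), or None."""
    var, value = goal
    if state.get(var) == value:
        return []
    for skill in skills:
        if skill.effect.get(var) != value:
            continue  # reason on effects: skip skills that cannot set the goal
        if skill.success_prob(state) >= 0.5:
            return [skill.name]
        best = skill.best_condition()
        if (depth > 0 and skill.watched and best is not None
                and state.get(skill.watched) != best):
            # Condition unlikely to hold: first achieve it as a sub-goal.
            sub = plan((skill.watched, best), state, skills, depth - 1)
            if sub is not None:
                return sub + [skill.name]
    return None


# Toy domain (hypothetical): the door only opens when the agent holds the key.
grab_key = Skill("grab_key", {"has_key": True})
open_door = Skill("open_door", {"door_open": True}, watched="has_key")
for _ in range(5):
    open_door.record({"has_key": False}, succeeded=False)
    open_door.record({"has_key": True}, succeeded=True)

skills = [grab_key, open_door]
start = {"has_key": False, "door_open": False}
print(plan(("door_open", True), start, skills))
```

Because the search touches only effects, it never needs a full transition model; the interaction counts enter only when deciding whether a skill is likely to succeed in the current state. A fuller version would also simulate each sub-plan's effects on the state before checking the next condition.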

Code Implementations: 1 repo
