RO LG Sep 29, 2018

Auto-conditioned Recurrent Mixture Density Networks for Learning Generalizable Robot Skills

Hejia Zhang, Eric Heiden, Stefanos Nikolaidis, Joseph J. Lim, Gaurav S. Sukhatme

arXiv:1810.00146v310.910 citations

Originality: Incremental advance

AI Analysis

The proposed state transition model lets personal robots perform complex manipulation tasks from high-level instructions without hand-engineered planning objectives, addressing a bottleneck in learning from demonstration for non-technical users.

The paper tackles the challenge of generalizing robot skills to unseen tasks from few demonstrations by introducing a state transition model that generates joint-space trajectories, showing in real robot experiments that it can synthesize motions with longer time horizons than expert trajectories.

Personal robots assisting humans must perform complex manipulation tasks that are typically difficult to specify in traditional motion planning pipelines, where multiple objectives must be met and the high-level context must be taken into consideration. Learning from demonstration (LfD) provides a promising way to learn this kind of complex manipulation skill, even from non-technical users. However, it is challenging for existing LfD methods to efficiently learn skills that generalize to task specifications not covered by the demonstrations. In this paper, we introduce a state transition model (STM) that generates joint-space trajectories by imitating motions from expert behavior. Given a few demonstrations, we show in real robot experiments that the learned STM can quickly generalize to unseen tasks and synthesize motions with longer time horizons than the expert trajectories. Compared to conventional motion planners, our approach enables the robot to accomplish complex behaviors from high-level instructions without laborious hand-engineering of planning objectives, while adapting to changing goals during skill execution. In conjunction with a trajectory optimizer, our STM can construct a high-quality skeleton of a trajectory that can be further improved in smoothness and precision. In combination with a learned inverse dynamics model, we additionally present results where the STM is used as a high-level planner. A video of our experiments is available at https://youtu.be/85DX9Ojq-90
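To make the idea of a state transition model concrete, the following is a minimal sketch of an autoregressive rollout in joint space, where each predicted state is fed back as the next input and the next state is sampled from a Gaussian mixture. It is not the authors' implementation: `toy_mixture_params` is a hypothetical hand-written stand-in for a learned network, and the 7-joint arm, the step sizes, the mixture weights, and the noise level are all assumptions chosen for illustration.

```python
import numpy as np


def toy_mixture_params(state, goal):
    """Hypothetical stand-in for a learned network (assumption, not from the paper).

    Returns mixture weights, component means, and standard deviations
    describing a distribution over the next joint-space state.
    """
    step = 0.2 * (goal - state)
    # Two components: a full step and a more cautious half step toward the goal.
    means = np.stack([state + step, state + 0.5 * step])  # shape (2, dof)
    weights = np.array([0.7, 0.3])
    stds = np.full_like(means, 1e-3)
    return weights, means, stds


def sample_next_state(weights, means, stds, rng):
    """Pick one mixture component, then sample from its diagonal Gaussian."""
    k = rng.choice(len(weights), p=weights)
    return rng.normal(means[k], stds[k])


def rollout(start, goal, horizon, seed=0):
    """Generate a trajectory by feeding each sampled state back as input."""
    rng = np.random.default_rng(seed)
    states = [np.asarray(start, dtype=float)]
    for _ in range(horizon):
        weights, means, stds = toy_mixture_params(states[-1], goal)
        states.append(sample_next_state(weights, means, stds, rng))
    return np.stack(states)  # shape (horizon + 1, dof)


start = np.zeros(7)                 # assumed 7-joint arm
goal = np.linspace(0.1, 0.7, 7)     # assumed target joint configuration
trajectory = rollout(start, goal, horizon=100)
```

Because the model is queried one step at a time, the rollout length is not tied to the length of any demonstration, and the goal could in principle be changed between steps. In a learned version, the stand-in function would be replaced by a trained network.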
