RO · CL · CV · LG · Nov 26, 2019

Imitation Learning of Robot Policies by Combining Language, Vision and Demonstration

arXiv:1911.11744v1 · 3 citations
Originality: Incremental advance
AI Analysis

This work addresses robot policy learning that end-users can direct through verbal communication, but it is an incremental advance because it builds on existing multimodal imitation learning approaches.

The paper tackles the problem of enabling robots to learn tasks by combining language, vision, and motion data, achieving a high task success rate in simulations across various conditions.

In this work we propose a novel end-to-end imitation learning approach which combines natural language, vision, and motion information to produce an abstract representation of a task, which in turn is used to synthesize specific motion controllers at run-time. This multimodal approach enables generalization to a wide variety of environmental conditions and allows an end-user to direct a robot policy through verbal communication. We empirically validate our approach with an extensive set of simulations and show that it achieves a high task success rate over a variety of conditions while remaining amenable to probabilistic interpretability.

