RO · AI · Feb 28, 2022

Generalizable task representation learning from human demonstration videos: a geometric approach

arXiv:2202.13604v1 · 6 citations
Originality: Incremental advance
AI Analysis

This work addresses efficient robot task learning from human demonstration videos. It appears incremental, since it builds on existing representation learning and visual servoing techniques.

The paper tackles the problem of learning generalizable task representations from human demonstration videos without additional robot training. It introduces a geometric approach that encodes the task specification so that the representation generalizes across categorical objects (different objects or tools used for the same task), and it transfers the learned representation to robot control via uncalibrated visual servoing.

We study the problem of generalizable task learning from human demonstration videos without extra training on the robot or pre-recorded robot motions. Given a set of human demonstration videos showing a task with different objects/tools (categorical objects), we aim to learn a representation of visual observation that generalizes to categorical objects and enables efficient controller design. We propose to introduce a geometric task structure to the representation learning problem that geometrically encodes the task specification from human demonstration videos, and that enables generalization by building task specification correspondence between categorical objects. Specifically, we propose CoVGS-IL, which uses a graph-structured task function to learn task representations under structural constraints. Our method enables task generalization by selecting geometric features from different objects whose inner connection relationships define the same task in geometric constraints. The learned task representation is then transferred to a robot controller using uncalibrated visual servoing (UVS); thus, the need for extra robot training or pre-recorded robot motions is removed.
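To make the last step of the abstract more concrete, here is a minimal sketch of an uncalibrated visual servoing loop driven by a geometric (point-to-point) image error. It is a generic illustration, not the paper's implementation: the Broyden rank-one Jacobian update is a common choice in the UVS literature but is an assumption here, and every name (`point_to_point_error`, `broyden_update`, `uvs_servo`, the toy matrix `A`) is hypothetical.

```python
import numpy as np

def point_to_point_error(p, q):
    """Geometric constraint in image space: zero when point p coincides with point q."""
    return np.asarray(p, dtype=float) - np.asarray(q, dtype=float)

def broyden_update(J, dq, de, alpha=1.0):
    """Rank-one secant update; with alpha=1 the result satisfies J_new @ dq == de."""
    denom = dq @ dq
    if denom < 1e-12:          # motion too small to carry information
        return J
    return J + alpha * np.outer(de - J @ dq, dq) / denom

def uvs_servo(observe_error, q0, J0, gain=0.5, steps=60, tol=1e-6):
    """Drive the observed error to zero without knowing the camera/robot model.

    observe_error(q) stands in for "move the robot to q and measure the image error".
    J0 is a rough initial guess of the image Jacobian (assumption: in practice it
    would come from small exploratory motions).
    """
    q = np.asarray(q0, dtype=float).copy()
    J = np.asarray(J0, dtype=float).copy()
    e = observe_error(q)
    for _ in range(steps):
        if np.linalg.norm(e) < tol:
            break
        dq = -gain * np.linalg.pinv(J) @ e      # damped Gauss-Newton step
        q_new = q + dq
        e_new = observe_error(q_new)
        J = broyden_update(J, dq, e_new - e)    # refine J from observed motion only
        q, e = q_new, e_new
    return q, e

# Toy setup (hypothetical): an unknown linear map A from robot motion to image position.
A = np.array([[1.2, 0.1], [-0.1, 0.9]])
target = np.array([0.5, -0.3])
observe = lambda q: point_to_point_error(A @ q, target)

# The controller never sees A; it starts from a deliberately wrong Jacobian guess.
q_final, e_final = uvs_servo(observe, q0=[0.0, 0.0], J0=np.eye(2))
```

In this toy case the controller only ever uses measured errors and its own Jacobian estimate, which mirrors why UVS can avoid camera calibration. A learned task representation would replace the hand-written `point_to_point_error` with errors computed from the selected geometric features.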

