RO
Nov 8, 2019

Visual Geometric Skill Inference by Watching Human Demonstration

arXiv:1911.04418v2
8 citations
AI Analysis

This work addresses the tedious feature selection and robust feature tracking that traditional robot control requires, replacing them with a human-readable task definition and control error. It is incremental, however, because it builds on existing inverse reinforcement learning methods.

The paper tackles the problem of learning manipulation skills from human demonstration videos by inferring geometric feature associations, achieving correct geometric associations with only one demonstration and good generalization under variance.

We study the problem of learning manipulation skills from human demonstration video by inferring the association relationships between geometric features. Motivation for this work stems from the observation that humans perform eye-hand coordination tasks by using geometric primitives to define a task, while a geometric control error drives the task through execution. We propose a graph-based kernel regression method to directly infer the underlying association constraints from human demonstration video using Incremental Maximum Entropy Inverse Reinforcement Learning (InMaxEnt IRL). The learned skill inference provides a human-readable task definition and outputs control errors that can be directly plugged into traditional controllers. Our method removes the need for the tedious feature selection and robust feature trackers required in traditional approaches (e.g. feature-based visual servoing). Experiments show our method infers correct geometric associations even with only one human demonstration video and can generalize well under variance.
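
To make the notion of a geometric control error concrete, here is a minimal sketch of two standard image-space errors from the visual servoing literature, point-to-point and point-to-line, together with a plain proportional control step. It does not reproduce the paper's learned inference: the function names, the choice of these two primitives, and the gain value are assumptions made for illustration only.

```python
import math

# Illustrative sketch only: these helpers are hypothetical and are not
# taken from the paper. They show what a "geometric control error"
# between image features can look like.

def point_to_point_error(p, q):
    """Offset (dx, dy) that moves image point p onto image point q."""
    return (q[0] - p[0], q[1] - p[1])

def line_through(a, b):
    """Homogeneous line (l1, l2, l3) through image points a and b.

    This is the cross product of (ax, ay, 1) and (bx, by, 1), scaled so
    that (l1, l2) has unit length.
    """
    l1 = a[1] - b[1]
    l2 = b[0] - a[0]
    l3 = a[0] * b[1] - b[0] * a[1]
    norm = math.hypot(l1, l2)
    if norm == 0.0:
        raise ValueError("a and b must be distinct points")
    return (l1 / norm, l2 / norm, l3 / norm)

def point_to_line_error(p, a, b):
    """Signed perpendicular distance from p to the line through a and b."""
    l1, l2, l3 = line_through(a, b)
    return l1 * p[0] + l2 * p[1] + l3

def proportional_step(error, gain=0.5):
    """Proportional control law: command = -gain * error (gain is assumed)."""
    return -gain * error

# Example: a tool tip at (3, 4) and a target line along the image x-axis.
err = point_to_line_error((3.0, 4.0), (0.0, 0.0), (10.0, 0.0))
cmd = proportional_step(err)
```

In this example the error equals the tool tip's signed distance to the line, and the command pushes that distance toward zero. A learned skill model would take the place of the hand-picked features and the association between them.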

