RO · CV · Jul 31, 2020

Estimating Motion Codes from Demonstration Videos

arXiv:2007.15841v1 · 3 citations
AI Analysis

This work addresses the challenge of extracting robotic-relevant motion features from videos, though it appears incremental because it applies existing deep learning methods to a new representation.

The paper tackles the problem of representing manipulation actions from demonstration videos by developing a deep learning pipeline to extract motion codes, which encode mechanical features like contact and trajectory type, and shows that these codes can be extracted from actions in the EPIC-KITCHENS dataset.

A motion taxonomy can encode manipulations as a binary-encoded representation, which we refer to as motion codes. These motion codes innately represent a manipulation action in an embedded space that describes the motion's mechanical features, including contact and trajectory type. The key advantage of using motion codes for embedding is that motions can be more appropriately defined with robotic-relevant features, and their distances can be more reasonably measured using these motion features. In this paper, we develop a deep learning pipeline to extract motion codes from demonstration videos in an unsupervised manner so that knowledge from these videos can be properly represented and used for robots. Our evaluations show that motion codes can be extracted from demonstrations of action in the EPIC-KITCHENS dataset.
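To make the idea of measuring distances between binary motion codes concrete, here is a minimal sketch. It assumes Hamming distance as the metric, which the abstract does not specify, and the 5-bit codes and action names below are hypothetical placeholders rather than codes from the paper's taxonomy.

```python
def hamming_distance(code_a: str, code_b: str) -> int:
    """Count the bit positions at which two equal-length binary codes differ."""
    if len(code_a) != len(code_b):
        raise ValueError("motion codes must have the same length")
    return sum(bit_a != bit_b for bit_a, bit_b in zip(code_a, code_b))


# Hypothetical 5-bit motion codes, invented for illustration only.
# In the paper, bits encode mechanical features such as contact and
# trajectory type; the exact layout is not reproduced here.
MOTION_CODES = {
    "cut": "11101",
    "slice": "11100",
    "pour": "00010",
}

if __name__ == "__main__":
    for a, b in [("cut", "slice"), ("cut", "pour"), ("slice", "pour")]:
        d = hamming_distance(MOTION_CODES[a], MOTION_CODES[b])
        print(f"{a} vs {b}: {d}")
```

Under these made-up codes, "cut" and "slice" differ in a single bit while "cut" and "pour" differ in every bit, which illustrates how mechanically similar motions would sit closer together in such an embedding.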

