RO | AI | Nov 9, 2021

A Differentiable Recipe for Learning Visual Non-Prehensile Planar Manipulation

arXiv:2111.05318v1 | Has Code
Originality: Highly original
AI Analysis

This work addresses the challenge of scaling learning for contact-rich manipulation in robotics. The contribution is incremental in that it integrates existing model-based and deep learning approaches.

The paper tackles the problem of visual non-prehensile planar manipulation by proposing a novel architecture that combines video decoding with contact mechanics priors, achieving better performance than learning-only methods on unseen objects and motions.

Specifying tasks with videos is a powerful technique towards acquiring novel and general robot skills. However, reasoning over mechanics and dexterous interactions can make it challenging to scale the learning of contact-rich manipulation. In this work, we focus on the problem of visual non-prehensile planar manipulation: given a video of an object in planar motion, find contact-aware robot actions that reproduce the same object motion. We propose a novel architecture, Differentiable Learning for Manipulation (DLM), that combines video decoding neural models with priors from contact mechanics by leveraging differentiable optimization and finite-difference-based simulation. Through extensive simulated experiments, we investigate the interplay between traditional model-based techniques and modern deep learning approaches. We find that our modular and fully differentiable architecture performs better than learning-only methods on unseen objects and motions. Code: https://github.com/baceituno/dlm.
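To make the idea of differentiating through a simulator with finite differences more concrete, here is a minimal sketch. It is not the paper's architecture or the code in the linked repository: the quasi-static "pushing" model, the `gain` parameter, and every function name below are hypothetical and chosen only for illustration. The sketch recovers a sequence of planar pushes that reproduces a target object trajectory, using central finite differences as the gradient of the simulation loss.

```python
import numpy as np


def simulate(actions, gain=0.8):
    """Toy planar model (an assumption, not the paper's simulator):
    each push displaces the object by gain * action, starting at the origin."""
    return gain * np.cumsum(actions, axis=0)


def loss(actions, target):
    """Mean squared error between the simulated and target object positions."""
    return float(np.mean((simulate(actions) - target) ** 2))


def finite_difference_grad(f, x, eps=1e-5):
    """Central finite-difference estimate of the gradient of f at x."""
    grad = np.zeros_like(x)
    for idx in np.ndindex(x.shape):
        step = np.zeros_like(x)
        step[idx] = eps
        grad[idx] = (f(x + step) - f(x - step)) / (2 * eps)
    return grad


def fit_actions(target, steps=500, lr=0.5):
    """Gradient descent on the pushes, with gradients taken through the simulator."""
    actions = np.zeros_like(target)
    for _ in range(steps):
        actions -= lr * finite_difference_grad(lambda a: loss(a, target), actions)
    return actions


# Target motion: the object slides along the x-axis in five equal steps.
target = np.stack([np.linspace(0.1, 0.5, 5), np.zeros(5)], axis=1)
actions = fit_actions(target)
final_error = loss(actions, target)
```

In this toy setting the simulator is linear, so an analytic gradient would be easy to write down. Finite differences are used here only to show how a gradient can be obtained when the simulator is treated as a black box.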

Code Implementations: 1 repo
