RO · LG · Mar 19, 2020

Latent Space Roadmap for Visual Action Planning of Deformable and Rigid Object Manipulation

arXiv:2003.08974v1 · 62 citations
AI Analysis

This paper addresses the challenge of planning in high-dimensional visual spaces for robotics, particularly for deformable objects. It is incremental, however, since it builds on existing latent-space and graph-based methods.

The paper tackles the problem of visual action planning for complex manipulation tasks with high-dimensional state spaces, such as deformable object manipulation, by performing planning in a low-dimensional latent space and demonstrating effectiveness on simulated box stacking and real-robot T-shirt folding tasks.

We present a framework for visual action planning of complex manipulation tasks with high-dimensional state spaces such as manipulation of deformable objects. Planning is performed in a low-dimensional latent state space that embeds images. We define and implement a Latent Space Roadmap (LSR) which is a graph-based structure that globally captures the latent system dynamics. Our framework consists of two main components: a Visual Foresight Module (VFM) that generates a visual plan as a sequence of images, and an Action Proposal Network (APN) that predicts the actions between them. We show the effectiveness of the method on a simulated box stacking task as well as a T-shirt folding task performed with a real robot.
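
The roadmap idea in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes images have already been encoded into latent vectors, swaps in a simple greedy distance-threshold grouping to form graph nodes, treats observed transitions as undirected edges, and plans with breadth-first search. The names `assign_node`, `build_roadmap`, `nearest_node`, `plan`, and the threshold `eps` are all hypothetical and chosen for illustration only.

```python
from collections import deque
import math


def assign_node(z, centers, eps):
    """Return the index of the first center within eps of z; add z as a new center if none is."""
    for i, c in enumerate(centers):
        if math.dist(z, c) <= eps:
            return i
    centers.append(tuple(z))
    return len(centers) - 1


def build_roadmap(transitions, eps):
    """Build a graph from (z_from, z_to) latent pairs observed before and after an action.

    Assumption: latent points closer than eps represent the same underlying state,
    and every observed transition can be traversed in both directions.
    """
    centers = []  # one representative latent vector per node
    edges = {}    # node index -> set of neighbouring node indices
    for z_from, z_to in transitions:
        a = assign_node(z_from, centers, eps)
        b = assign_node(z_to, centers, eps)
        edges.setdefault(a, set())
        edges.setdefault(b, set())
        if a != b:
            edges[a].add(b)
            edges[b].add(a)
    return centers, edges


def nearest_node(z, centers):
    """Index of the node whose representative is closest to z."""
    return min(range(len(centers)), key=lambda i: math.dist(z, centers[i]))


def plan(z_start, z_goal, centers, edges):
    """Breadth-first search for a shortest node path; returns latent waypoints or None."""
    start = nearest_node(z_start, centers)
    goal = nearest_node(z_goal, centers)
    parent = {start: None}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        if node == goal:
            path = []
            while node is not None:
                path.append(centers[node])
                node = parent[node]
            return path[::-1]
        for nxt in sorted(edges.get(node, ())):
            if nxt not in parent:
                parent[nxt] = node
                queue.append(nxt)
    return None


# Toy 2-D latent data (made up for illustration).
transitions = [
    ((0.0, 0.0), (1.0, 0.0)),
    ((1.02, 0.01), (2.0, 0.0)),
    ((2.0, 0.05), (3.0, 0.0)),
]
centers, edges = build_roadmap(transitions, eps=0.1)
path = plan((0.01, 0.0), (2.98, 0.0), centers, edges)
```

In the framework described above, the latent waypoints in `path` would be decoded into images to form the visual plan, and a separate network would propose the action between each consecutive pair; neither step is modelled in this sketch.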

Code Implementations: 1 repo
