RO · LG · Mar 3, 2021

Enabling Visual Action Planning for Object Manipulation through Latent Space Roadmap

arXiv:2103.02554v4 · 18 citations
AI Analysis

The paper addresses planning in high-dimensional state spaces for robotic manipulation, especially of deformable objects, but appears incremental because it builds on existing latent-space and graph-based planning methods.

The paper tackles visual action planning for complex manipulation tasks, particularly with deformable objects. It introduces a Latent Space Roadmap (LSR) framework that maps image observations to a low-dimensional latent space for planning, and evaluates it on simulated box stacking and rope/box manipulation tasks as well as a folding task on a real robot.

We present a framework for visual action planning of complex manipulation tasks with high-dimensional state spaces, focusing on manipulation of deformable objects. We propose a Latent Space Roadmap (LSR) for task planning which is a graph-based structure globally capturing the system dynamics in a low-dimensional latent space. Our framework consists of three parts: (1) a Mapping Module (MM) that maps observations given in the form of images into a structured latent space extracting the respective states as well as generates observations from the latent states, (2) the LSR which builds and connects clusters containing similar states in order to find the latent plans between start and goal states extracted by MM, and (3) the Action Proposal Module that complements the latent plan found by the LSR with the corresponding actions. We present a thorough investigation of our framework on simulated box stacking and rope/box manipulation tasks, and a folding task executed on a real robot.
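
As a rough illustration of the roadmap idea described in the abstract (cluster similar latent states, connect clusters, search for a latent plan), here is a minimal sketch. It is not the authors' implementation. The 2-D toy latent points, the distance threshold `eps`, the single-linkage clustering, and the function names `cluster_states`, `build_roadmap`, and `plan` are all assumptions made for this example. The Mapping Module and Action Proposal Module are omitted, so the latent states and observed transitions are given by hand.

```python
from collections import deque
from math import dist


def cluster_states(latents, eps):
    """Group latent points closer than eps (single-linkage via union-find).

    The threshold rule is an assumption for this sketch, not the paper's method.
    """
    parent = list(range(len(latents)))

    def find(i):
        # Follow parent links to the cluster representative, compressing the path.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i in range(len(latents)):
        for j in range(i + 1, len(latents)):
            if dist(latents[i], latents[j]) < eps:
                parent[find(i)] = find(j)
    return [find(i) for i in range(len(latents))]


def build_roadmap(labels, action_pairs):
    """Add a directed edge between two clusters when an observed action links them."""
    graph = {label: set() for label in set(labels)}
    for src, dst in action_pairs:
        if labels[src] != labels[dst]:
            graph[labels[src]].add(labels[dst])
    return graph


def plan(graph, start, goal):
    """Breadth-first search for a shortest sequence of clusters from start to goal."""
    queue = deque([[start]])
    seen = {start}
    while queue:
        path = queue.popleft()
        if path[-1] == goal:
            return path
        for nxt in sorted(graph[path[-1]]):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(path + [nxt])
    return None  # no latent plan exists in this roadmap


# Hypothetical toy data: five latent states and two observed transitions
# (index of the state before an action, index of the state after it).
latents = [(0.0, 0.0), (0.1, 0.0), (1.0, 0.0), (1.1, 0.1), (2.0, 0.0)]
action_pairs = [(0, 2), (3, 4)]

labels = cluster_states(latents, eps=0.3)
roadmap = build_roadmap(labels, action_pairs)
latent_plan = plan(roadmap, labels[0], labels[4])
print(latent_plan)
```

In this toy setup, states 0 and 1 fall into one cluster and states 2 and 3 into another, so the plan passes through three clusters even though no single observed transition connects the start state to the goal state directly.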

Code Implementations: 1 repo
Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.
