RO · CV · LG · Oct 29, 2024

Local Policies Enable Zero-shot Long-horizon Manipulation

arXiv:2410.22332v2 · 33 citations · h-index: 13 · ICRA
Originality: Highly original
AI Analysis

The paper addresses the problem of enabling robots to perform long-horizon manipulation tasks in varied real-world settings without task-specific training. This is a significant advance rather than an incremental improvement.

The paper tackles sim2real transfer for robotic manipulation by introducing ManipGen, which uses local policies to achieve state-of-the-art zero-shot performance: 97% success on Robosuite benchmark tasks in simulation, and margins of 36% to 76% over SayCan, OpenVLA, LLMTrajGen and VoxPoser across 50 real-world tasks.

Sim2real for robotic manipulation is difficult due to the challenges of simulating complex contacts and generating realistic task distributions. To tackle the latter problem, we introduce ManipGen, which leverages a new class of policies for sim2real transfer: local policies. Locality enables a variety of appealing properties including invariances to absolute robot and object pose, skill ordering, and global scene configuration. We combine these policies with foundation models for vision, language and motion planning and demonstrate SOTA zero-shot performance of our method on Robosuite benchmark tasks in simulation (97%). We transfer our local policies from simulation to reality and observe they can solve unseen long-horizon manipulation tasks with up to 8 stages with significant pose, object and scene configuration variation. ManipGen outperforms SOTA approaches such as SayCan, OpenVLA, LLMTrajGen and VoxPoser across 50 real-world manipulation tasks by 36%, 76%, 62% and 60% respectively. Video results at https://mihdalal.github.io/manipgen/
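
The pose-invariance claim in the abstract can be illustrated with a small sketch. This is not the authors' implementation: it is a 2D toy example, and the function names `to_local` and `move_scene` are hypothetical. It assumes that a local policy observes object points expressed in the end-effector frame, and shows that such an observation does not change when the whole scene (object and end-effector together) is moved by the same rigid transform.

```python
import math


def to_local(points, ee_xy, ee_theta):
    """Express world-frame 2D points in the end-effector frame (hypothetical helper)."""
    c, s = math.cos(ee_theta), math.sin(ee_theta)
    local = []
    for x, y in points:
        dx, dy = x - ee_xy[0], y - ee_xy[1]
        # Rotate the offset by -ee_theta, the inverse of the end-effector rotation.
        local.append((c * dx + s * dy, -s * dx + c * dy))
    return local


def move_scene(points, ee_xy, ee_theta, shift, angle):
    """Apply one global rigid transform to both the object points and the end-effector."""
    c, s = math.cos(angle), math.sin(angle)

    def rigid(p):
        return (c * p[0] - s * p[1] + shift[0], s * p[0] + c * p[1] + shift[1])

    return [rigid(p) for p in points], rigid(ee_xy), ee_theta + angle


# Illustrative values only; they are not taken from the paper.
object_points = [(0.50, 0.10), (0.55, 0.12), (0.52, 0.15)]
ee_xy, ee_theta = (0.45, 0.05), 0.3

before = to_local(object_points, ee_xy, ee_theta)
moved_points, moved_ee, moved_theta = move_scene(
    object_points, ee_xy, ee_theta, shift=(1.2, -0.7), angle=1.1
)
after = to_local(moved_points, moved_ee, moved_theta)

# The local observation is unchanged even though every absolute pose changed.
invariant = all(
    math.isclose(a, b, abs_tol=1e-9)
    for pa, pb in zip(before, after)
    for a, b in zip(pa, pb)
)
print(invariant)
```

Under this assumption, a policy that consumes only `to_local(...)` outputs sees identical inputs wherever the object sits in the workspace, which is one way to read the invariance to absolute robot and object pose described above.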

