RO, CV | Oct 12, 2023

Universal Visual Decomposer: Long-Horizon Manipulation Made Easy

AI2
arXiv:2310.08581v1 | 30 citations | h-index: 37
Originality: Incremental advance
AI Analysis

The paper addresses the problem of automating task decomposition for robotic manipulation without task-specific knowledge, which enables better generalization to unseen tasks, though it builds incrementally on existing visual representation methods.

The paper tackles the challenge of learning long-horizon robotic manipulation tasks by proposing Universal Visual Decomposer (UVD), an off-the-shelf method that automatically decomposes tasks into subtasks using pre-trained visual representations, resulting in significantly improved compositional generalization and outperforming baselines in simulation and real-world evaluations.

Real-world robotic tasks stretch over extended horizons and encompass multiple stages. Learning long-horizon manipulation tasks, however, is a long-standing challenge, and demands decomposing the overarching task into several manageable subtasks to facilitate policy learning and generalization to unseen tasks. Prior task decomposition methods require task-specific knowledge, are computationally intensive, and cannot readily be applied to new tasks. To address these shortcomings, we propose Universal Visual Decomposer (UVD), an off-the-shelf task decomposition method for long-horizon visual manipulation using pre-trained visual representations designed for robotic control. At a high level, UVD discovers subgoals by detecting phase shifts in the embedding space of the pre-trained representation. Operating purely on visual demonstrations without auxiliary information, UVD can effectively extract visual subgoals embedded in the videos, while incurring zero additional training cost on top of standard visuomotor policy training. Goal-conditioned policies learned with UVD-discovered subgoals exhibit significantly improved compositional generalization at test time to unseen tasks. Furthermore, UVD-discovered subgoals can be used to construct goal-based reward shaping that jump-starts temporally extended exploration for reinforcement learning. We extensively evaluate UVD on both simulation and real-world tasks, and in all cases, UVD substantially outperforms baselines across imitation and reinforcement learning settings on in-domain and out-of-domain task sequences alike, validating the clear advantage of automated visual task decomposition within the simple, compact UVD framework.
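
The abstract describes subgoal discovery as detecting phase shifts in a pre-trained embedding space, and reward shaping built from the discovered subgoals. The sketch below is one possible reading of that idea, not the authors' implementation: it assumes per-frame embeddings are already available as an array, treats a "phase shift" as the point where the Euclidean distance to the current goal frame stops shrinking monotonically when scanning backward in time, and uses negative distance to the active subgoal as a shaped reward. The names `discover_subgoals` and `shaped_reward` are hypothetical, and real embeddings would likely need smoothing before such a check.

```python
import numpy as np


def discover_subgoals(embeddings):
    """Return subgoal frame indices in temporal order (illustrative only).

    Assumption: within one subtask, the embedding distance to that
    subtask's final frame shrinks monotonically over time. Scanning
    backward from the last frame, the first frame that breaks this
    trend is taken as the end of the previous subtask.
    """
    emb = np.asarray(embeddings, dtype=float)
    subgoals = []
    goal = len(emb) - 1
    while goal > 0:
        subgoals.append(goal)
        # Distance from every earlier frame to the current goal frame.
        dist = np.linalg.norm(emb[: goal + 1] - emb[goal], axis=1)
        # Walk backward while the distance keeps growing, i.e. while the
        # demonstration is still approaching this goal in forward time.
        t = goal
        while t > 0 and dist[t - 1] >= dist[t]:
            t -= 1
        # Frame t starts the monotone approach; treat it as the previous subgoal.
        goal = t
    return sorted(subgoals)


def shaped_reward(obs_embedding, subgoal_embedding):
    """Hypothetical goal-based shaping: negative distance to the active subgoal."""
    diff = np.asarray(obs_embedding, dtype=float) - np.asarray(subgoal_embedding, dtype=float)
    return -float(np.linalg.norm(diff))


# Toy 1-D "embeddings": move away from the start, then return.
toy = np.array([0, 1, 2, 3, 2, 1, 0], dtype=float)[:, None]
print(discover_subgoals(toy))  # the turning point and the final frame
```

On the toy sequence the turning point at frame 3 and the final frame 6 come out as subgoals. No training is involved in this sketch: it only post-processes embeddings of a demonstration, which matches the abstract's claim of zero additional training cost in spirit, though the actual procedure may differ.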

Code Implementations: 1 repo
