RO CV SY · May 13, 2025

Multi-step manipulation task and motion planning guided by video demonstration

arXiv:2505.08949v1 · h-index: 6
Originality: Incremental advance
AI Analysis

The paper addresses the problem of enabling robots to perform intricate manipulation tasks by learning from videos. The advance is incremental because it builds on existing RRT methods.

This work tackles complex multi-step task-and-motion planning in robotics by leveraging instructional videos. It proposes an extension of the RRT planner that combines video-extracted contact states and 3D object poses to solve tasks with sequential dependencies, introduces a new benchmark and a trajectory refinement approach, and demonstrates the method on robots such as the Franka Emika Panda and the KUKA KMR iiwa.

This work aims to leverage instructional video to solve complex multi-step task-and-motion planning tasks in robotics. Towards this goal, we propose an extension of the well-established Rapidly-Exploring Random Tree (RRT) planner, which simultaneously grows multiple trees around grasp and release states extracted from the guiding video. Our key novelty lies in combining contact states and 3D object poses extracted from the guiding video with a traditional planning algorithm, which allows us to solve tasks with sequential dependencies, for example, when an object needs to be placed at a specific location to be grasped later. We also investigate the generalization capabilities of our approach to go beyond the scene depicted in the instructional video. To demonstrate the benefits of the proposed video-guided planning approach, we design a new benchmark with three challenging tasks: (i) 3D re-arrangement of multiple objects between a table and a shelf, (ii) multi-step transfer of an object through a tunnel, and (iii) transferring objects using a tray, similar to how a waiter transfers dishes. We demonstrate the effectiveness of our planning algorithm on several robots, including the Franka Emika Panda and the KUKA KMR iiwa. For a seamless transfer of the obtained plans to the real robot, we develop a trajectory refinement approach formulated as an optimal control problem (OCP).
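The idea of growing several trees at once around demonstrated keyframes can be illustrated with a toy example. The sketch below is not the paper's planner: it assumes a 2D unit-square configuration space, circular obstacles, and a hypothetical list of `seeds` standing in for the grasp and release states that the abstract says are extracted from the video. All function names and parameters are made up for illustration. It only checks sampled nodes for collisions (not the edges between them) and only reports which consecutive trees became linked, without extracting a path.

```python
import math
import random


def steer(src, dst, step):
    """Move from src toward dst by at most `step`."""
    dx, dy = dst[0] - src[0], dst[1] - src[1]
    dist = math.hypot(dx, dy)
    if dist <= step:
        return dst
    return (src[0] + step * dx / dist, src[1] + step * dy / dist)


def collision_free(p, obstacles):
    """True if point p lies outside every circular obstacle (cx, cy, r)."""
    return all(math.hypot(p[0] - cx, p[1] - cy) > r for cx, cy, r in obstacles)


def grow_multi_tree(seeds, obstacles=(), n_iters=2000, step=0.05, seed=0):
    """Grow one tree per seed and record which consecutive trees get linked.

    `seeds` is an ordered list of 2D keyframe configurations (an assumption
    standing in for video-extracted grasp/release states).
    Returns (trees, linked), where `linked` holds every index i such that
    tree i and tree i + 1 have nodes within `step` of each other.
    """
    rng = random.Random(seed)
    trees = [[s] for s in seeds]
    linked = set()
    for _ in range(n_iters):
        sample = (rng.random(), rng.random())
        for i, tree in enumerate(trees):
            # Standard RRT extension: step from the nearest node toward the sample.
            nearest = min(tree, key=lambda q: math.dist(q, sample))
            new = steer(nearest, sample, step)
            if not collision_free(new, obstacles):
                continue
            tree.append(new)
            # Try to link the new node to the neighbouring trees in the sequence.
            for j in (i - 1, i + 1):
                if 0 <= j < len(trees):
                    if min(math.dist(new, q) for q in trees[j]) <= step:
                        linked.add(min(i, j))
        if len(linked) == len(trees) - 1:
            break  # every consecutive pair of keyframes is connected
    return trees, linked


if __name__ == "__main__":
    keyframes = [(0.1, 0.1), (0.5, 0.5), (0.9, 0.1)]
    trees, linked = grow_multi_tree(keyframes, obstacles=[(0.3, 0.3, 0.1)])
    print("linked consecutive pairs:", sorted(linked))
```

Seeding a tree at every keyframe means each sub-problem (reach the grasp, move to the release, and so on) is explored in parallel, and the ordering of the seeds encodes the sequential dependency between steps. A real planner would work in the robot's joint space, handle the change of kinematic constraints at grasp and release, and check edges as well as nodes.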
