RO AI Nov 15, 2024

VeriGraph: Scene Graphs for Execution Verifiable Robot Planning

Daniel Ekpo, Mara Levy, Saksham Suri, Chuong Huynh, Abhinav Shrivastava

arXiv:2411.10446v29.413 citations h-index: 9

Originality: Incremental advance

AI Analysis

The paper addresses the challenge of unreliable robot planning for manipulation tasks, though the contribution is incremental because it builds on existing vision-language models and scene graph methods.

The paper tackles the problem of incorrect action sequences generated by vision-language models for robot task planning. It proposes VeriGraph, a framework that uses scene graphs to verify and refine plans, and reports outperforming baselines by 58% on language-based tasks and 30% on image-based tasks.

Recent advancements in vision-language models (VLMs) offer potential for robot task planning, but challenges remain due to VLMs' tendency to generate incorrect action sequences. To address these limitations, we propose VeriGraph, a novel framework that integrates VLMs for robotic planning while verifying action feasibility. VeriGraph employs scene graphs as an intermediate representation, capturing key objects and spatial relationships to improve plan verification and refinement. The system generates a scene graph from input images and uses it to iteratively check and correct action sequences generated by an LLM-based task planner, ensuring constraints are respected and actions are executable. Our approach significantly enhances task completion rates across diverse manipulation scenarios, outperforming baseline methods by 58% for language-based tasks and 30% for image-based tasks.
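The verify-and-refine loop described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the `SceneGraph` class, the `(object, target)` move format, the single `"on"` relation, the rule that an object or target must have nothing on top of it, and the `propose` callback (a stand-in for the LLM-based planner) are all assumptions made for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class SceneGraph:
    # Hypothetical representation: edges are (subject, relation, object)
    # triples, e.g. ("cup", "on", "plate").
    edges: set = field(default_factory=set)

    def is_clear(self, obj):
        # An object is clear when nothing is recorded as being "on" it.
        return not any(rel == "on" and dst == obj for (_, rel, dst) in self.edges)

    def move(self, obj, target):
        # Drop the object's old support edge, then add the new one.
        self.edges = {e for e in self.edges if not (e[0] == obj and e[1] == "on")}
        self.edges.add((obj, "on", target))


def verify_plan(graph, plan):
    """Simulate the plan on a copy of the graph.

    Returns (ok, index_of_first_failing_step, reason).
    """
    sim = SceneGraph(set(graph.edges))
    for i, (obj, target) in enumerate(plan):
        if not sim.is_clear(obj):
            return False, i, f"{obj} has something on top of it"
        # Assumed constraint: only the table can hold several objects.
        if target != "table" and not sim.is_clear(target):
            return False, i, f"{target} is occupied"
        sim.move(obj, target)
    return True, None, ""


def plan_with_verification(graph, propose, max_rounds=3):
    """Ask the planner for a plan, verify it, and feed failures back."""
    feedback = ""
    for _ in range(max_rounds):
        plan = propose(feedback)  # propose() stands in for the LLM planner
        ok, step, reason = verify_plan(graph, plan)
        if ok:
            return plan
        feedback = f"step {step} failed: {reason}"
    return None  # no executable plan found within the round budget


# Usage: the cup sits on the plate, so the plate cannot be moved first.
scene = SceneGraph({("cup", "on", "plate"), ("plate", "on", "table")})
candidates = iter([
    [("plate", "shelf")],                    # infeasible first attempt
    [("cup", "table"), ("plate", "shelf")],  # corrected attempt
])
final_plan = plan_with_verification(scene, lambda feedback: next(candidates))
```

Simulating each action on a copy of the graph keeps the original scene untouched, so the same starting state can be reused across refinement rounds.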
