RO | AI | May 18, 2025

RoboFAC: A Comprehensive Framework for Robotic Failure Analysis and Correction

arXiv:2505.12224v3 | 24 citations | h-index: 2
Originality: Highly original
AI Analysis

The paper addresses failure recovery for VLA models in open-world scenarios, offering a novel method for a known bottleneck.

The paper tackles the underperformance of Vision-Language-Action models in open-world robotic manipulation, which stems from their limited capacity for failure recovery. It introduces the RoboFAC framework, comprising a dataset of erroneous manipulation trajectories and a model for failure analysis and correction. The model outperforms GPT-4o by 34.1% on the authors' benchmark and yields a 29.1% relative improvement on average across four real-world tasks.

Vision-Language-Action (VLA) models have recently advanced robotic manipulation by translating natural-language instructions and image information into sequential control actions. However, these models often underperform in open-world scenarios, as they are predominantly trained on successful expert demonstrations and exhibit a limited capacity for failure recovery. In this work, we present a Robotic Failure Analysis and Correction (RoboFAC) framework to address this issue. First, we construct the RoboFAC dataset, comprising 9,440 erroneous manipulation trajectories and 78,623 QA pairs across 16 diverse tasks and 53 scenes in both simulation and real-world environments. Leveraging our dataset, we develop the RoboFAC model, which is capable of Task Understanding, Failure Analysis, and Failure Correction. Experimental results demonstrate that the RoboFAC model outperforms GPT-4o by 34.1% on our evaluation benchmark. Furthermore, we integrate the RoboFAC model into a real-world VLA control pipeline as external supervision providing correction instructions, yielding a 29.1% relative improvement on average across four real-world tasks. The results show that our RoboFAC framework effectively handles robotic failures and assists the VLA model in recovering from failures.
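
For readers unsure how a "relative improvement on average" figure is typically computed, the following is a minimal sketch, assuming the metric is a per-task task-success rate and that the average is taken over per-task relative gains. The abstract does not specify either detail. The success rates below are hypothetical placeholders, not values from the paper, and the function names are illustrative.

```python
from statistics import mean


def relative_improvement(baseline: float, improved: float) -> float:
    """Gain of `improved` over `baseline`, expressed as a fraction of the baseline."""
    if baseline <= 0:
        raise ValueError("baseline must be positive")
    return (improved - baseline) / baseline


# Hypothetical (baseline, with-supervision) success rates for four tasks.
# These numbers are made up for illustration and do NOT come from the paper.
task_success = [
    (0.40, 0.50),
    (0.50, 0.60),
    (0.20, 0.30),
    (0.25, 0.30),
]

# Assumption: average the per-task relative gains (one of several possible conventions).
per_task_gains = [relative_improvement(b, i) for b, i in task_success]
average_gain = mean(per_task_gains)

print(f"average relative improvement: {average_gain:.1%}")
```

A relative gain differs from an absolute one: moving from 40% to 50% success is a 10-point absolute gain but a 25% relative gain, so the two should not be compared directly.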

Code Implementations: 1 repo