CL | AI | May 27, 2025

BacktrackAgent: Enhancing GUI Agent with Error Detection and Backtracking Mechanism

arXiv:2505.20660v1 | 15 citations | h-index: 10 | EMNLP
Originality: Incremental advance
AI Analysis

This paper addresses error handling in GUI agents for automation tasks and represents an incremental improvement over existing methods.

The paper tackles the problem of GUI agents lacking error detection and recovery mechanisms by proposing BacktrackAgent, a framework with backtracking that improves task success rate and step accuracy on Mobile3M and Auto-UI benchmarks.

Graphical User Interface (GUI) agents have gained substantial attention due to their impressive capabilities to complete tasks through multiple interactions within GUI environments. However, existing agents primarily focus on enhancing the accuracy of individual actions and often lack effective mechanisms for detecting and recovering from errors. To address these shortcomings, we propose the BacktrackAgent, a robust framework that incorporates a backtracking mechanism to improve task completion efficiency. BacktrackAgent includes verifier, judger, and reflector components as modules for error detection and recovery, while also applying judgment rewards to further enhance the agent's performance. Additionally, we develop a training dataset specifically designed for the backtracking mechanism, which considers the outcome pages after action executions. Experimental results show that BacktrackAgent has achieved performance improvements in both task success rate and step accuracy on Mobile3M and Auto-UI benchmarks. Our data and code will be released upon acceptance.
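The abstract names three modules (verifier, judger, reflector) but does not spell out how they interact, so the following is only a minimal sketch of one way an execute-check-backtrack loop could be wired together. Everything in it is an assumption made for illustration: the function names, the signatures, the retry limit, and the toy integer "pages" are hypothetical and are not taken from the paper or its code.

```python
from typing import Callable, List, Tuple

Page = int      # toy stand-in for a GUI page (assumption)
Action = str    # toy stand-in for a GUI action (assumption)


def run_with_backtracking(
    start: Page,
    is_done: Callable[[Page], bool],
    propose: Callable[[Page], Action],
    execute: Callable[[Page, Action], Page],
    verifier: Callable[[Page, Action, Page], bool],
    judger: Callable[[Page, Action, Page], bool],
    reflector: Callable[[Page, Action], Action],
    max_steps: int = 10,
    max_retries: int = 2,
) -> Tuple[Page, List[Tuple[Page, Action, bool]]]:
    """Hypothetical loop: execute an action, inspect the outcome page,
    and backtrack to the previous page when a check fails."""
    page = start
    trace: List[Tuple[Page, Action, bool]] = []
    for _ in range(max_steps):
        if is_done(page):
            break
        action = propose(page)
        for _attempt in range(max_retries + 1):
            outcome = execute(page, action)
            # Assumed roles: the verifier checks that the action took effect,
            # the judger checks that the outcome page helps the task.
            ok = verifier(page, action, outcome) and judger(page, action, outcome)
            trace.append((page, action, ok))
            if ok:
                page = outcome  # accept the outcome page and move on
                break
            # Backtrack: discard the outcome, stay on `page`, and let the
            # reflector suggest a replacement action.
            action = reflector(page, action)
        else:
            break  # retries exhausted without an accepted action
    return page, trace


# Toy environment: pages are integers and the task is to reach TARGET.
TARGET = 3
EFFECT = {"inc": 1, "dec": -1, "noop": 0}

def toy_execute(page: Page, action: Action) -> Page:
    return page + EFFECT[action]

def toy_propose(page: Page) -> Action:
    # Deliberately wrong on page 1 so the backtracking path is exercised.
    return "dec" if page == 1 else "inc"

def toy_verifier(page: Page, action: Action, outcome: Page) -> bool:
    return outcome != page

def toy_judger(page: Page, action: Action, outcome: Page) -> bool:
    return abs(TARGET - outcome) < abs(TARGET - page)

def toy_reflector(page: Page, failed: Action) -> Action:
    return "inc"

final_page, trace = run_with_backtracking(
    0, lambda p: p == TARGET, toy_propose, toy_execute,
    toy_verifier, toy_judger, toy_reflector,
)
print(final_page, trace)
```

In this toy run the proposer's mistaken "dec" on page 1 is rejected by the judger, the loop stays on page 1, and the reflector's replacement action is accepted. A real agent would replace the integer pages and rule-based checks with screenshots or UI trees and learned models.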
