AI · May 2

Faithful Mobile GUI Agents with Guided Advantage Estimator

Haowen Hu, Pengzhou Cheng, Zheng Wu, Lingzhong Dong, Gongshen Liu, Zhuosheng Zhang

arXiv:2605.01208 · 91.2 · h-index: 12

AI Analysis

For GUI agent developers, this work addresses unfaithful behavior (relying on memorized shortcuts instead of on-screen evidence) with a faithfulness-first framework that raises Trap Success Rate from 13.88% to 80.21%.

Faithful-Agent reformulates GUI agent interaction to prioritize evidence groundedness and internal consistency, using a two-stage pipeline with faithfulness-oriented SFT and RFT with a guided advantage estimator. It improves Trap Success Rate from 13.88% to 80.21% while maintaining general instruction-following performance.

Graphical user interface (GUI) agents based on vision-language models have shown strong interaction capabilities. However, they often behave unfaithfully, relying on memorized shortcuts rather than grounding their actions in the evidence displayed on screen or in the user's instructions. To address this, we propose Faithful-Agent, a faithfulness-first framework that reformulates GUI interaction to prioritize evidence groundedness and internal consistency. Faithful-Agent employs a two-stage pipeline: (i) a faithfulness-oriented SFT stage that instills abstention behaviors under evidence perturbations; (ii) an RFT stage that further amplifies faithfulness by introducing the guided advantage estimator (GuAE), an anchor-based and variance-adaptive advantage tempering mechanism built upon GRPO. GuAE prevents advantage collapse in low-variance rollout groups under sparse GUI rewards. Combined with a thought-action consistency reward, Faithful-Agent (Stage II) raises the Trap SR from the baseline's 13.88% to 80.21%, while preserving robust general instruction-following performance.
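The advantage collapse mentioned in the abstract can be illustrated with a small sketch. The first function is the usual GRPO-style group-relative advantage (reward minus group mean, divided by group standard deviation). The second function, including the names `anchored_advantages`, `anchor`, and `min_scale`, is a hypothetical illustration of what an anchor-based, variance-adaptive variant might look like; it is an assumption for exposition only and is not the GuAE formula, which the abstract does not specify.

```python
from statistics import mean, pstdev


def grpo_advantages(rewards, eps=1e-6):
    """GRPO-style group-relative advantages: (r - group mean) / (group std + eps)."""
    mu = mean(rewards)
    sigma = pstdev(rewards)
    return [(r - mu) / (sigma + eps) for r in rewards]


def anchored_advantages(rewards, anchor=0.5, min_scale=0.25):
    """Hypothetical sketch, NOT the paper's GuAE.

    Assumption: rewards are centered on a fixed anchor instead of the group
    mean, and the divisor is floored at min_scale so that a low-variance
    group neither zeroes out nor blows up the learning signal.
    """
    sigma = pstdev(rewards)
    scale = max(sigma, min_scale)  # variance-adaptive, but bounded below
    return [(r - anchor) / scale for r in rewards]


# A rollout group where every sample earns the same sparse reward.
uniform_group = [1.0, 1.0, 1.0, 1.0]
# A rollout group with mixed outcomes.
mixed_group = [1.0, 0.0, 0.0, 1.0]

print(grpo_advantages(uniform_group))      # every advantage is zero: no gradient signal
print(anchored_advantages(uniform_group))  # still nonzero under the assumed anchor
print(anchored_advantages(mixed_group))
```

With the group-mean baseline, a group whose rollouts all receive the same reward yields zero advantage for every sample, so that group contributes nothing to the policy update. Under sparse GUI rewards such uniform groups are plausible, which is the situation the abstract says GuAE is designed to handle.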
