HC, CL | May 28, 2025

UI-Evol: Automatic Knowledge Evolving for Computer Use Agents

Ziyun Zhang, Xinyi Liu, Xiaoyi Zhang, Jun Wang, Gang Chen, Yan Lu

arXiv:2505.21964v213.14 citations | h-index: 4

Originality: Highly original

AI Analysis

The paper addresses a critical reliability issue that computer use agents face during real-world task execution, and it offers a novel method for a known bottleneck.

The paper tackles the knowledge-execution gap in computer use agents, where even 90% correct knowledge yields only 41% execution success, and proposes UI-Evol, a plug-and-play module that boosts task performance and reduces behavioral standard deviation.

External knowledge has played a crucial role in the recent development of computer use agents. We identify a critical knowledge-execution gap: retrieved knowledge often fails to translate into effective real-world task execution. Our analysis shows even 90% correct knowledge yields only 41% execution success rate. To bridge this gap, we propose UI-Evol, a plug-and-play module for autonomous GUI knowledge evolution. UI-Evol consists of two stages: a Retrace Stage that extracts faithful objective action sequences from actual agent-environment interactions, and a Critique Stage that refines existing knowledge by comparing these sequences against external references. We conduct comprehensive experiments on the OSWorld benchmark with the state-of-the-art Agent S2. Our results demonstrate that UI-Evol not only significantly boosts task performance but also addresses a previously overlooked issue of high behavioral standard deviation in computer use agents, leading to superior performance on computer use tasks and substantially improved agent reliability.
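The two-stage loop described in the abstract can be illustrated with a small sketch. The abstract does not say how either stage is implemented, so everything below is an assumption made for illustration: the `Step` schema, the `retrace` and `critique` function names, and the simple rule-based logic are hypothetical stand-ins, not the authors' method.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Step:
    """One recorded agent-environment interaction (hypothetical schema)."""
    action: str           # what the agent did, e.g. "click(File)"
    changed_screen: bool  # whether the environment visibly responded


def retrace(trajectory: List[Step]) -> List[str]:
    """Retrace Stage (sketch): keep only actions that had an observable effect.

    Assumption: an action with no visible effect is not part of the
    faithful action sequence.
    """
    return [step.action for step in trajectory if step.changed_screen]


def critique(knowledge: List[str], retraced: List[str],
             reference: List[str]) -> List[str]:
    """Critique Stage (sketch): compare retraced actions with a reference.

    Assumption: refinement means appending a note about the first point
    where the observed actions diverge from the external reference.
    """
    notes = []
    for i, expected in enumerate(reference):
        actual = retraced[i] if i < len(retraced) else None
        if actual != expected:
            notes.append(
                f"step {i + 1}: expected {expected!r}, observed {actual!r}"
            )
            break
    return knowledge + notes


# Hypothetical example data, invented for this sketch.
trajectory = [
    Step("click(File)", True),
    Step("click(Export)", False),   # no visible effect, dropped by retrace
    Step("click(Save As)", True),
]
reference = ["click(File)", "click(Export)", "click(PDF)"]
knowledge = ["Use File > Export to save the document as PDF"]

retraced = retrace(trajectory)
refined = critique(knowledge, retraced, reference)
```

Here `retraced` keeps only the two actions that changed the screen, and `refined` is the original knowledge entry plus one note about the divergence at step 2. A real system would presumably replace both rule-based functions with something far richer; the sketch only shows how data flows from interaction logs through Retrace into Critique.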

