CL | Apr 27, 2025

AndroidGen: Building an Android Language Agent under Data Scarcity

Hanyu Lai, Junjie Gao, Xiao Liu, Yifan Xu, Shudan Zhang, Yuxiao Dong, Jie Tang

arXiv:2504.19298v1 | 13.99 citations | h-index: 36 | Has Code | ACL

Originality: Incremental advance

AI Analysis

This work addresses the problem of data scarcity for building LLM-based mobile agents, offering an incremental advancement by automating data collection without manual labeling.

The paper tackles the challenge of deploying large language models as agents on mobile devices when training data is scarce. It develops AndroidGen, a framework that collects trajectories for human-provided tasks and trains open-source LLMs on them without manually labeled trajectories. The authors report improvements on AndroidWorld, AitW, and a set of popular applications.

Large language models have opened up a world of possibilities for various NLP tasks, sparking optimism for the future. Despite their potential, LLMs have yet to be widely used as agents on real mobile devices. The main challenge is the need for high-quality data sources. Time constraints and labor intensity often hinder human annotation. On the other hand, existing LLMs exhibit inadequate completion rates and need a robust data filtration strategy. Given these challenges, we develop a framework called AndroidGen to enhance the capabilities of LLM-based agents under data scarcity. In addition, we leverage AndroidGen to collect trajectories given human tasks and train open-source LLMs on these trajectories to develop an open-source mobile agent without manually labeled trajectories. We extensively evaluate AndroidGen with AndroidWorld, AitW, and various popular applications, demonstrating its improvements and revealing potential areas for future improvement. Code, model, and data are available at https://github.com/THUDM/AndroidGen.
