AI | Nov 30, 2024

DroidCall: A Dataset for LLM-powered Android Intent Invocation

Weikai Xie, Li Zhang, Shihe Wang, Rongjie Yi, Mengwei Xu

arXiv:2412.00402v17.33 citations | h-index: 7 | Has Code | EMNLP

Originality: Incremental advance

AI Analysis

This work addresses the need for on-device mobile agents with better data privacy by providing a dataset and pipeline for Android intent invocation, though it is incremental as it builds on existing agentic systems.

The authors tackled the problem of enabling small language models to accurately invoke Android intents from natural language instructions by introducing DroidCall, a dataset of 10k samples, and showed that fine-tuned models like Qwen2.5-3B and Gemma2-2B can approach or surpass GPT-4o's performance.

The growing capabilities of large language models in natural language understanding significantly strengthen existing agentic systems. To power performant on-device mobile agents for better data privacy, we introduce DroidCall, the first training and testing dataset for accurate Android intent invocation. With a highly flexible and reusable data generation pipeline, we constructed 10k samples in DroidCall. Given a task instruction in natural language, small language models such as Qwen2.5-3B and Gemma2-2B fine-tuned with DroidCall can approach or even surpass the capabilities of GPT-4o for accurate Android intent invocation. We also provide an end-to-end Android app equipped with these fine-tuned models to demonstrate the Android intent invocation process. The code and dataset are available at https://github.com/UbiquitousLearning/DroidCall.
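To make "intent invocation" concrete, the following is a minimal sketch of how a function call emitted by a model might be mapped onto an Android intent. It is not taken from the DroidCall code: the function name `set_alarm`, the JSON call shape, and the `INTENT_REGISTRY` table are assumptions chosen for illustration. Only the action and extra strings are real; they are the values of Android's `AlarmClock` constants.

```python
import json

# Hypothetical registry: function name -> intent action and parameter-to-extra mapping.
# The action and extra strings are the values of Android's AlarmClock constants;
# the function name "set_alarm" and the JSON call shape are illustrative assumptions.
INTENT_REGISTRY = {
    "set_alarm": {
        "action": "android.intent.action.SET_ALARM",
        "extras": {
            "hour": "android.intent.extra.alarm.HOUR",
            "minutes": "android.intent.extra.alarm.MINUTES",
            "message": "android.intent.extra.alarm.MESSAGE",
        },
    },
}


def call_to_intent(model_output: str) -> dict:
    """Turn a JSON function call emitted by a model into an intent description."""
    call = json.loads(model_output)
    spec = INTENT_REGISTRY.get(call["name"])
    if spec is None:
        raise ValueError(f"unknown function: {call['name']}")
    extras = {}
    for param, value in call.get("arguments", {}).items():
        # Reject arguments the registered function does not declare.
        if param not in spec["extras"]:
            raise ValueError(f"unexpected argument: {param}")
        extras[spec["extras"][param]] = value
    return {"action": spec["action"], "extras": extras}


if __name__ == "__main__":
    # Assumed model output for an instruction like "wake me up at 7:30".
    output = '{"name": "set_alarm", "arguments": {"hour": 7, "minutes": 30}}'
    print(call_to_intent(output))
```

On a device, the resulting action and extras would be passed to an `Intent` and started from an activity. Validating the call against a registry before doing so is one way to keep a malformed model output from reaching the system.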

