AI · Nov 30, 2024

DroidCall: A Dataset for LLM-powered Android Intent Invocation

arXiv:2412.00402v1 · 3 citations · h-index: 7 · Has Code · EMNLP
Originality: Incremental advance
AI Analysis

This work addresses the need for on-device mobile agents that keep user data private by providing a dataset and a data generation pipeline for Android intent invocation. The advance is incremental, since it builds on existing agentic systems.

The authors tackled the problem of enabling small language models to accurately invoke Android intents from natural language instructions by introducing DroidCall, a dataset of 10k samples, and showed that fine-tuned models like Qwen2.5-3B and Gemma2-2B can approach or surpass GPT-4o's performance.

The growing capabilities of large language models in natural language understanding significantly strengthen existing agentic systems. To power performant on-device mobile agents for better data privacy, we introduce DroidCall, the first training and testing dataset for accurate Android intent invocation. With a highly flexible and reusable data generation pipeline, we constructed 10k samples in DroidCall. Given a task instruction in natural language, small language models such as Qwen2.5-3B and Gemma2-2B fine-tuned with DroidCall can approach or even surpass the capabilities of GPT-4o for accurate Android intent invocation. We also provide an end-to-end Android app equipped with these fine-tuned models to demonstrate the Android intent invocation process. The code and dataset are available at https://github.com/UbiquitousLearning/DroidCall.
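To make the task concrete, the sketch below shows one way a model's structured output could be mapped to an Android intent description. It is only an illustration under stated assumptions: the `set_alarm` function name, its JSON call format, and the `call_to_intent` helper are hypothetical and are not taken from the DroidCall dataset or code. The action and extra-key strings are the values of Android's `AlarmClock` constants.

```python
import json

# Hypothetical function schema for illustration; not DroidCall's actual format.
SET_ALARM_SCHEMA = {
    "name": "set_alarm",
    "required": {"hour": int, "minutes": int},
    "optional": {"message": str},
}

# String values of Android's AlarmClock intent constants.
ACTION_SET_ALARM = "android.intent.action.SET_ALARM"
EXTRA_KEYS = {
    "hour": "android.intent.extra.alarm.HOUR",
    "minutes": "android.intent.extra.alarm.MINUTES",
    "message": "android.intent.extra.alarm.MESSAGE",
}


def call_to_intent(model_output: str) -> dict:
    """Validate a JSON function call and map it to an intent description."""
    call = json.loads(model_output)
    if call.get("name") != SET_ALARM_SCHEMA["name"]:
        raise ValueError(f"unsupported function: {call.get('name')!r}")

    args = call.get("arguments", {})
    allowed = {**SET_ALARM_SCHEMA["required"], **SET_ALARM_SCHEMA["optional"]}

    # Every required argument must be present.
    for key in SET_ALARM_SCHEMA["required"]:
        if key not in args:
            raise ValueError(f"missing required argument: {key}")

    # Reject unknown arguments and wrong types, then translate to extras.
    extras = {}
    for key, value in args.items():
        if key not in allowed:
            raise ValueError(f"unexpected argument: {key}")
        if not isinstance(value, allowed[key]):
            raise TypeError(f"argument {key} must be {allowed[key].__name__}")
        extras[EXTRA_KEYS[key]] = value

    return {"action": ACTION_SET_ALARM, "extras": extras}
```

For an instruction such as "wake me up at 7:30", a model emitting `{"name": "set_alarm", "arguments": {"hour": 7, "minutes": 30}}` would yield an intent description with the `SET_ALARM` action and two extras. On a device, an app would then build the actual `Intent` from that description.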

Code Implementations: 1 repo
