CL | AI | LG | Apr 4, 2025

APIGen-MT: Agentic Pipeline for Multi-Turn Data Generation via Simulated Agent-Human Interplay

Princeton | Salesforce | Stanford
arXiv:2504.03601v4 | 126 citations | h-index: 64 | Has Code
Originality: Incremental advance
AI Analysis

This work addresses data scarcity in AI agent training, which could enable more reliable and efficient agents. It appears incremental, however, because it builds on existing data generation and agent training methods.

The paper tackles the scarcity of high-quality data for training AI agents in multi-turn interactions. It introduces APIGen-MT, a framework that generates verifiable and diverse multi-turn data through simulated agent-human interplay. Models trained on this data outperform frontier models such as GPT-4o and Claude 3.5 on the τ-bench and BFCL benchmarks, and the smaller models surpass larger ones in multi-turn settings while maintaining superior consistency.

Training effective AI agents for multi-turn interactions requires high-quality data that captures realistic human-agent dynamics, yet such data is scarce and expensive to collect manually. We introduce APIGen-MT, a two-phase framework that generates verifiable and diverse multi-turn agent data. In the first phase, our agentic pipeline produces detailed task blueprints with ground-truth actions, leveraging a committee of LLM reviewers and iterative feedback loops. In the second phase, these blueprints are transformed into complete interaction trajectories through simulated human-agent interplay. We train a family of models, the xLAM-2-fc-r series, with sizes ranging from 1B to 70B parameters. Our models outperform frontier models such as GPT-4o and Claude 3.5 on the τ-bench and BFCL benchmarks, with the smaller models surpassing their larger counterparts, particularly in multi-turn settings, while maintaining superior consistency across multiple trials. Comprehensive experiments demonstrate that our verified blueprint-to-details approach yields high-quality training data, enabling the development of more reliable, efficient, and capable agents. We open-source 5K synthetic data trajectories and the trained xLAM-2-fc-r models to advance research in AI agents. Models at https://huggingface.co/collections/Salesforce/xlam-2-67ef5be12949d8dcdae354c4; dataset at https://huggingface.co/datasets/Salesforce/APIGen-MT-5k; website at https://apigen-mt.github.io
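
The two phases described in the abstract can be illustrated with a minimal Python sketch. This is not the authors' implementation: every name here (`Blueprint`, `generate_blueprint`, `simulate_trajectory`, and the toy stand-ins for the LLM components) is hypothetical, and the final check that executed actions must equal the blueprint's ground-truth actions is an assumption about how a trajectory might be verified.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional, Tuple


@dataclass
class Blueprint:
    instruction: str                 # what the simulated human wants
    ground_truth_actions: List[str]  # actions a correct agent should take


# A reviewer returns None to approve, or a feedback string to reject.
Reviewer = Callable[[Blueprint], Optional[str]]


def generate_blueprint(propose, reviewers, max_rounds=3):
    """Phase 1: propose, review by committee, retry with feedback."""
    feedback: List[str] = []
    for _ in range(max_rounds):
        candidate = propose(feedback)
        feedback = [msg for review in reviewers
                    if (msg := review(candidate)) is not None]
        if not feedback:          # every reviewer approved
            return candidate
    return None                   # give up after max_rounds


def simulate_trajectory(blueprint, user, agent, max_turns=8):
    """Phase 2: roll out a dialogue and keep it only if it verifies."""
    history: List[Tuple[str, str]] = []
    actions: List[str] = []
    for _ in range(max_turns):
        user_msg = user(blueprint, history)
        if user_msg is None:      # simulated human is done
            break
        history.append(("user", user_msg))
        reply, new_actions = agent(history)
        history.append(("agent", reply))
        actions.extend(new_actions)
    # Assumed check: executed actions must equal the blueprint's ground truth.
    return history if actions == blueprint.ground_truth_actions else None


# --- Toy stand-ins for the LLM components (all hypothetical) ---
def toy_propose(feedback):
    # First attempt omits the actions; the retry adds them after feedback.
    actions = ["cancel_order(id=42)"] if feedback else []
    return Blueprint("Cancel order 42", actions)


def needs_actions(bp):
    return None if bp.ground_truth_actions else "add ground-truth actions"


def toy_user(bp, history):
    # States the request once, then ends the conversation.
    return bp.instruction if not history else None


def toy_agent(history):
    return "Order 42 is cancelled.", ["cancel_order(id=42)"]


blueprint = generate_blueprint(toy_propose, [needs_actions])
trajectory = simulate_trajectory(blueprint, toy_user, toy_agent)
```

In this toy run the first proposal is rejected for lacking ground-truth actions, the second is approved, and the simulated dialogue is kept because the agent's actions match the blueprint. In a real pipeline the proposer, reviewers, simulated user, and agent would each be backed by an LLM.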

Code Implementations: 1 repo
