In-N-Out: A Parameter-Level API Graph Dataset for Tool Agents
This work addresses the challenge of compositional API usage for developers building LLM-based tool agents; it is a domain-specific advancement.
The paper addresses the difficulty tool agents have with complex tasks that require multiple API calls by introducing In-N-Out, an expert-annotated dataset of API graphs built from two real-world API benchmarks and their documentation. Using In-N-Out nearly doubles performance on tool retrieval and multi-tool query generation compared with LLMs using documentation alone, and graphs generated by models fine-tuned on In-N-Out close 90% of that gap.
Tool agents -- LLM-based systems that interact with external APIs -- offer a way to execute real-world tasks. However, as tasks become increasingly complex, these agents struggle to identify and call the correct APIs in the proper order. To tackle this problem, we investigate converting API documentation into a structured API graph that captures API dependencies and leveraging it for multi-tool queries that require compositional API calls. To support this, we introduce In-N-Out, the first expert-annotated dataset of API graphs built from two real-world API benchmarks and their documentation. Using In-N-Out significantly improves performance on both tool retrieval and multi-tool query generation, nearly doubling that of LLMs using documentation alone. Moreover, graphs generated by models fine-tuned on In-N-Out close 90% of this gap, showing that our dataset helps models learn to comprehend API documentation and parameter relationships. Our findings highlight the promise of using explicit API graphs for tool agents and the utility of In-N-Out as a valuable resource. We will release the dataset and code publicly.
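To make the idea of a parameter-level API graph concrete, the following is a minimal sketch, not the In-N-Out schema or the authors' code. It assumes an edge means "an output parameter of one API can fill an input parameter of another" and derives a valid call order with a standard topological sort. All names (`ParamEdge`, `call_order`, and the example APIs and parameters) are hypothetical.

```python
from collections import defaultdict, deque
from dataclasses import dataclass


@dataclass(frozen=True)
class ParamEdge:
    """Hypothetical edge: an output of src_api can fill an input of dst_api."""
    src_api: str
    src_output: str
    dst_api: str
    dst_input: str


def call_order(apis, edges):
    """Order APIs so each runs after the APIs that feed it (Kahn's algorithm).

    Assumes every API named in an edge also appears in `apis`.
    """
    successors = defaultdict(set)
    for e in edges:
        if e.src_api != e.dst_api:
            successors[e.src_api].add(e.dst_api)

    # Count, for each API, how many distinct APIs must run before it.
    indegree = {api: 0 for api in apis}
    for dsts in successors.values():
        for dst in dsts:
            indegree[dst] += 1

    ready = deque(sorted(a for a in apis if indegree[a] == 0))
    order = []
    while ready:
        api = ready.popleft()
        order.append(api)
        for nxt in sorted(successors.get(api, ())):
            indegree[nxt] -= 1
            if indegree[nxt] == 0:
                ready.append(nxt)

    if len(order) != len(apis):
        raise ValueError("parameter dependencies contain a cycle")
    return order


# Illustrative (made-up) APIs: a city lookup feeds a hotel search,
# which in turn feeds a booking call.
apis = ["book_hotel", "search_city", "search_hotels"]
edges = [
    ParamEdge("search_city", "city_id", "search_hotels", "city_id"),
    ParamEdge("search_hotels", "hotel_id", "book_hotel", "hotel_id"),
]
print(call_order(apis, edges))
```

In this toy setting the graph alone is enough to recover the order in which the three calls must be issued, which is the kind of information that is hard to read off free-form documentation.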