AI · May 28

Indexing the Unreadable: LLM-Native Recursive Construction and Search of Service Taxonomies

Wei Zheng, Yang Yan, Yiyang Shao, Jinyang Li, Zeze Chang, Yukuang Jia, Qiming Mao, Chihyung Wang, Jingbin Zhou

arXiv:2605.29270 · 21.7 · h-index: 3 · Has Code

Predicted impact: top 38% in AI · last 90 days · Originality: Incremental advance

AI Analysis

For developers of LLM agent systems, this work offers a scalable approach to service discovery that does not require larger context windows.

LLMs struggle with service discovery because of context window limits and the Lost-in-the-Middle phenomenon. A2X, an LLM-driven pipeline that builds and searches a hierarchical taxonomy of services, achieves a 6.2-point Hit Rate gain over full-context dumping at one-ninth the prompt-token cost, and improves Hit Rate by more than 20 points over the state-of-the-art open-source embedding-based baseline.

The era of the Internet of Agents (IoA) is taking shape: LLM agents are expected to fulfill user goals by orchestrating fast-growing populations of Model Context Protocol (MCP) servers, Agent-to-Agent (A2A) endpoints, reusable skills, and other LLM-callable services. Yet LLMs face a structural mismatch with this regime: effective context is a scarce resource that does not scale with the number of services. Concatenating thousands of service descriptions into a prompt overflows the context window, and even when the window is large enough, models systematically under-attend to information in the middle of long inputs, the well-documented Lost-in-the-Middle phenomenon. This is fundamentally a question of context management for service discovery. To address this, we propose an LLM-native progressive-disclosure scheme and its concrete instantiation, A2X (Agent-to-Anything service discovery): an LLM-driven pipeline that automatically organizes the registered services into a hierarchical taxonomy and walks it layer by layer at query time, so that every LLM call sees only a small candidate set highly relevant to the user query. This decouples effective-context scarcity from registry size and significantly reduces token consumption while improving retrieval accuracy. Compared to full-context dumping, A2X achieves a 6.2-point Hit Rate gain at one-ninth the prompt-token cost; compared to the state-of-the-art open-source embedding-based baseline, A2X improves Hit Rate by more than 20 points.
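The layer-by-layer walk described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the names `Node`, `Selector`, `keyword_selector`, and `search` are hypothetical, the taxonomy is built by hand rather than by an LLM, and a simple keyword-overlap function stands in for the LLM call that would pick relevant children at each level. The sketch only shows the progressive-disclosure idea, namely that each selection step sees the children of one node and never the whole registry.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class Node:
    """A taxonomy node; a node without children is treated as a service (leaf)."""
    name: str
    description: str
    children: List["Node"] = field(default_factory=list)

    @property
    def is_service(self) -> bool:
        return not self.children


# Hypothetical selector signature: (query, candidates) -> relevant subset.
# In an LLM-driven system this would be one LLM call per invocation.
Selector = Callable[[str, List[Node]], List[Node]]


def keyword_selector(query: str, candidates: List[Node]) -> List[Node]:
    """Stand-in for an LLM call: keep candidates sharing a word with the query."""
    words = set(query.lower().split())
    return [c for c in candidates if words & set(c.description.lower().split())]


def search(root: Node, query: str, select: Selector) -> List[Node]:
    """Walk the taxonomy one layer at a time, collecting selected services."""
    frontier, hits = [root], []
    while frontier:
        next_frontier = []
        for node in frontier:
            # Each select() call sees only this node's children.
            for picked in select(query, node.children):
                (hits if picked.is_service else next_frontier).append(picked)
        frontier = next_frontier
    return hits


# Hand-built toy taxonomy (assumption: real taxonomies would be LLM-generated).
root = Node("root", "all services", [
    Node("travel", "travel flight hotel booking", [
        Node("flight_search", "search flight tickets"),
        Node("hotel_booking", "book hotel rooms"),
    ]),
    Node("finance", "finance currency stock payments", [
        Node("fx_rate", "currency exchange rate"),
        Node("stock_quote", "stock price quote"),
    ]),
])

print([n.name for n in search(root, "find a flight to Tokyo", keyword_selector)])
# prints ['flight_search']
```

In this toy run the selector is called twice, each time with two candidates, although the registry holds four services. With a deeper taxonomy, the per-call candidate set stays bounded by the branching factor rather than by the registry size.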
