Semantic Tool Discovery for Large Language Models: A Vector-Based Approach to MCP Tool Selection
This work addresses efficiency and cost issues for developers and users of LLMs with tool-calling capabilities, though the contribution is incremental because it builds on the existing MCP framework.
The paper tackles the scalability challenge that arises when every available tool is provided to an LLM in the Model Context Protocol (MCP) framework, which causes high token overhead and reduced accuracy. It introduces a semantic tool discovery architecture that uses vector-based retrieval to select relevant tools dynamically, yielding a 99.6% reduction in tool-related token consumption and a 97.1% hit rate at K=3.
Large Language Models (LLMs) with tool-calling capabilities have demonstrated remarkable potential in executing complex tasks through external tool integration. The Model Context Protocol (MCP) has emerged as a standardized framework for connecting LLMs to diverse toolsets, with individual MCP servers potentially exposing dozens to hundreds of tools. However, current implementations face a critical scalability challenge: providing all available tools to the LLM context results in substantial token overhead, increased costs, reduced accuracy, and context window constraints. We present a semantic tool discovery architecture that addresses these challenges through vector-based retrieval. Our approach indexes MCP tools using dense embeddings that capture semantic relationships between tool capabilities and user intent, dynamically selecting only the most relevant tools (typically 3-5) rather than exposing the entire tool catalog (50-100+). Experimental results demonstrate a 99.6% reduction in tool-related token consumption with a hit rate of 97.1% at K=3 and an MRR of 0.91 on a benchmark of 140 queries across 121 tools from 5 MCP servers, with sub-100ms retrieval latency. Contributions include: (1) a semantic indexing framework for MCP tools, (2) a dynamic tool selection algorithm based on query-tool similarity, (3) comprehensive evaluation demonstrating significant efficiency and accuracy improvements, and (4) extensibility to multi-agent and cross-organizational tool discovery.
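The retrieval step and the two reported metrics can be illustrated with a minimal sketch. This is not the paper's implementation: the tool catalog, the function names, and the benchmark format are hypothetical, and `embed` is a toy bag-of-words stand-in for the dense embedding model the abstract describes. Hit rate at K and MRR are computed with their standard definitions, which are assumed here to match the paper's usage.

```python
import math
import re
from collections import Counter

# Hypothetical tool catalog; names and descriptions are invented for illustration.
TOOLS = {
    "create_issue": "Create a new issue in a GitHub repository",
    "send_message": "Send a chat message to a Slack channel",
    "read_file": "Read the contents of a file from the local filesystem",
    "run_query": "Run a SQL query against a Postgres database",
    "search_web": "Search the web for pages matching a query",
}


def embed(text):
    """Toy stand-in for a dense embedding model: a bag-of-words count vector."""
    return Counter(re.findall(r"[a-z]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(count * b[token] for token, count in a.items() if token in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


# Index each tool once, using its name and description as the text to embed.
INDEX = {
    name: embed(f"{name.replace('_', ' ')} {description}")
    for name, description in TOOLS.items()
}


def select_tools(query, k=3):
    """Return the names of the k tools most similar to the query."""
    query_vector = embed(query)
    ranked = sorted(
        INDEX, key=lambda name: cosine(query_vector, INDEX[name]), reverse=True
    )
    return ranked[:k]


def hit_rate_at_k(benchmark, k):
    """Fraction of (query, expected_tool) pairs whose tool is in the top k."""
    hits = sum(1 for query, expected in benchmark if expected in select_tools(query, k))
    return hits / len(benchmark)


def mean_reciprocal_rank(benchmark):
    """Average of 1 / rank of the expected tool over the full ranking."""
    total = 0.0
    for query, expected in benchmark:
        ranked = select_tools(query, k=len(INDEX))
        total += 1.0 / (ranked.index(expected) + 1)
    return total / len(benchmark)


if __name__ == "__main__":
    # Only the selected tools would be passed to the LLM, not the whole catalog.
    print(select_tools("send a message to the team channel", k=3))
```

In a realistic setting, `embed` would call an actual embedding model and `INDEX` would live in a vector store; the selection and scoring logic would stay the same shape. Only the schemas of the tools returned by `select_tools` would then be placed in the LLM context, which is where the token savings come from.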