AI, Feb 9

SCOUT-RAG: Scalable and Cost-Efficient Unifying Traversal for Agentic Graph-RAG over Distributed Domains

Longkun Li, Yuanben Zou, Jinghan Wu, Yuqing Wen, Jing Li, Hangwei Qian, Ivor Tsang

arXiv:2602.08400v12.41 citations, h-index: 1

Originality: Incremental advance

AI Analysis

This work addresses the challenge of efficient retrieval in distributed, access-restricted settings such as hospitals or multinational organizations, though it appears incremental because it builds on existing Graph-RAG methods.

The paper tackles the problem of retrieving information from distributed knowledge graphs without global visibility, introducing SCOUT-RAG, a framework that uses cooperative agents to perform progressive cross-domain retrieval. It achieves performance similar to centralized baselines while reducing cross-domain calls, tokens processed, and latency.

Graph-RAG improves LLM reasoning using structured knowledge, yet conventional designs rely on a centralized knowledge graph. In distributed and access-restricted settings (e.g., hospitals or multinational organizations), retrieval must select relevant domains and an appropriate traversal depth without global graph visibility or exhaustive querying. To address this challenge, we introduce SCOUT-RAG (Scalable and COst-efficient Unifying Traversal), a distributed agentic Graph-RAG framework that performs progressive cross-domain retrieval guided by incremental utility goals. SCOUT-RAG employs four cooperative agents that: (i) estimate domain relevance, (ii) decide when to expand retrieval to additional domains, (iii) adapt traversal depth to avoid unnecessary graph exploration, and (iv) synthesize high-quality answers. The framework is designed to minimize retrieval regret, defined as missing useful domain information, while controlling latency and API cost. Across multi-domain knowledge settings, SCOUT-RAG achieves performance comparable to centralized baselines, including DRIFT and exhaustive domain traversal, while substantially reducing cross-domain calls, total tokens processed, and latency.
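The four agent roles described in the abstract can be illustrated with a toy control loop. This is only a minimal sketch under stated assumptions, not the authors' implementation: the `Domain` class, the keyword-overlap relevance score, the term-coverage utility, and every function name below are hypothetical stand-ins for what would be LLM-driven agents and real graph traversal in the paper.

```python
from __future__ import annotations
from dataclasses import dataclass


@dataclass
class Domain:
    """Toy stand-in for one access-restricted knowledge graph (hypothetical)."""
    name: str
    keywords: set[str]                    # public summary, the only thing visible up front
    facts_by_depth: dict[int, list[str]]  # facts reachable at each traversal depth


def coverage(terms, evidence):
    """Toy utility: fraction of query terms mentioned in the collected evidence."""
    text = " ".join(evidence).lower()
    return sum(t in text for t in terms) / len(terms) if terms else 0.0


def estimate_relevance(terms, domain):
    """Agent (i): score a domain from its keyword summary alone."""
    return len(set(terms) & domain.keywords) / len(terms) if terms else 0.0


def should_expand(utility, target):
    """Agent (ii): query another domain only while the utility goal is unmet."""
    return utility < target


def traverse(domain, terms, evidence, target, max_depth):
    """Agent (iii): deepen one hop at a time, stopping once the goal is met."""
    for depth in range(1, max_depth + 1):
        for fact in domain.facts_by_depth.get(depth, []):
            if any(t in fact.lower() for t in terms):
                evidence.append(fact)
        if coverage(terms, evidence) >= target:
            break
    return evidence


def synthesize(evidence):
    """Agent (iv): placeholder for LLM answer synthesis (assumption)."""
    return " ".join(evidence)


def scout_retrieve(query, domains, target=1.0, max_depth=3):
    """Visit domains in relevance order; return (answer, cross-domain calls)."""
    terms = query.lower().split()
    ranked = sorted(domains, key=lambda d: estimate_relevance(terms, d), reverse=True)
    evidence, calls = [], 0
    for domain in ranked:
        if not should_expand(coverage(terms, evidence), target):
            break  # utility goal reached, no further domains are called
        if estimate_relevance(terms, domain) == 0:
            break  # remaining domains look irrelevant to the query
        calls += 1
        traverse(domain, terms, evidence, target, max_depth)
    return synthesize(evidence), calls


# Invented example data, for illustration only.
DOMAINS = [
    Domain("hr", {"payroll", "leave"}, {1: ["Payroll runs monthly."]}),
    Domain("cardiology", {"interactions"},
           {1: ["Ward rota for this week."],
            2: ["Known interactions are recorded in the cardiology notes."]}),
    Domain("pharmacy", {"aspirin", "dosage", "drug"},
           {1: ["Aspirin dosage is listed per patient weight band."],
            2: ["Packaging lot numbers are archived."]}),
]

if __name__ == "__main__":
    answer, calls = scout_retrieve("aspirin dosage interactions", DOMAINS)
    print(calls, answer)
```

In this toy setup the loop calls only the two domains whose summaries overlap the query and never touches the third, which is the kind of saving in cross-domain calls the abstract reports. Lowering `target` trades coverage for fewer calls.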
