IR · AI · May 22

FD-RAG: Federated Dual-System Retrieval-Augmented Generation

arXiv:2605.27432 · 60.0 · h-index: 2

AI Analysis

The paper addresses the problem of deploying RAG in decentralized edge settings, where knowledge is fragmented and computation is limited, and offers a practical approach to privacy-preserving, low-latency QA.

FD-RAG is a federated dual-system RAG framework for edge environments that decouples lightweight memory access from LLM reasoning, improving accuracy by up to 7.8% and reducing latency by 8.4× on QA benchmarks.

Retrieval-augmented generation (RAG) has emerged as a paradigm for grounding large language models in external knowledge, yet most existing RAG systems assume centralized knowledge access and ample computation. These assumptions break down in edge environments, where knowledge is fragmented across devices, raw data cannot be shared, and repeated LLM calls are prohibitively expensive. We propose FD-RAG, a federated dual-system RAG framework that decouples lightweight memory access from on-demand LLM reasoning for decentralized deployment. Specifically, FD-RAG learns semantic-aware adaptive hypergraphs over local corpora and distills them into compact QA memories. At inference time, it answers well-covered queries via direct memory matching and invokes LLM-based reasoning only when necessary, while tracing retrieved memories to hypergraph-grounded evidence. To mitigate cross-device knowledge fragmentation, FD-RAG aggregates anonymized memories across devices without exposing raw documents. Experiments on QA benchmarks show that FD-RAG improves accuracy by up to 7.8% while reducing latency by 8.4× compared with strong local and federated baselines. We also provide theoretical analysis establishing an O(1/ε²) convergence rate for the proposed hypergraph learning, supporting its tractable deployment in edge settings.
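
The dual-system routing described in the abstract (answer from memory when a query is well covered, call the LLM otherwise) can be illustrated with a minimal sketch. This is not the authors' implementation: the bag-of-words similarity, the 0.8 threshold, and the names `embed`, `aggregate`, and `answer` are all assumptions made for illustration, and the hypergraph learning, evidence tracing, and anonymization steps are omitted.

```python
import math
import re
from collections import Counter
from typing import Callable, List, Tuple

QAMemory = List[Tuple[str, str]]  # (question, answer) pairs; hypothetical format


def embed(text: str) -> Counter:
    """Toy bag-of-words vector, standing in for a real sentence encoder."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def aggregate(device_memories: List[QAMemory]) -> QAMemory:
    """Merge per-device QA memories, dropping exact duplicates.

    Only distilled QA pairs are exchanged here, never raw documents.
    Anonymization is assumed to have happened upstream.
    """
    seen, merged = set(), []
    for memory in device_memories:
        for pair in memory:
            if pair not in seen:
                seen.add(pair)
                merged.append(pair)
    return merged


def answer(
    query: str,
    memory: QAMemory,
    llm: Callable[[str], str],
    threshold: float = 0.8,  # assumed cutoff for a "well-covered" query
) -> Tuple[str, str]:
    """Return (answer, route), where route is "memory" or "llm"."""
    q_vec = embed(query)
    best_score, best_answer = 0.0, None
    for stored_q, stored_a in memory:
        score = cosine(q_vec, embed(stored_q))
        if score > best_score:
            best_score, best_answer = score, stored_a
    if best_answer is not None and best_score >= threshold:
        return best_answer, "memory"  # fast path: direct memory match
    return llm(query), "llm"  # slow path: on-demand LLM reasoning


if __name__ == "__main__":
    device_a = [("What is the capital of France?", "Paris")]
    device_b = [("Who wrote Hamlet?", "Shakespeare")]
    shared = aggregate([device_a, device_b])
    stub_llm = lambda q: "LLM:" + q  # placeholder for a real model call
    print(answer("what is the capital of France", shared, stub_llm))
    print(answer("How tall is Mount Everest?", shared, stub_llm))
```

In this sketch the latency saving comes from the fast path: a query that matches a stored question above the threshold never reaches the model, so the cost of an LLM call is paid only for poorly covered queries.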

