CL · Feb 5

FedMosaic: Federated Retrieval-Augmented Generation via Parametric Adapters

arXiv:2602.05235v1 · h-index: 4
Originality: Highly original
AI Analysis

This paper addresses the challenge of using RAG in privacy-aware settings where knowledge is distributed across silos. It proposes a federated approach and reports gains in accuracy alongside lower storage and communication costs.

The paper tackles the problem of deploying Retrieval-Augmented Generation (RAG) in privacy-sensitive domains by proposing FedMosaic, a federated RAG framework that avoids sharing raw documents. The authors report an average 10.9% higher accuracy than state-of-the-art methods across four categories, with storage costs reduced by 78.8% to 86.3% and communication costs reduced by 91.4%.

Retrieval-Augmented Generation (RAG) enhances Large Language Models (LLMs) by grounding generation in external knowledge to improve factuality and reduce hallucinations. Yet most deployments assume a centralized corpus, which is infeasible in privacy-aware domains where knowledge remains siloed. This motivates federated RAG (FedRAG), where a central LLM server collaborates with distributed silos without sharing raw documents. In-context RAG violates this requirement by transmitting verbatim documents, whereas parametric RAG encodes documents into lightweight adapters that merge with a frozen LLM at inference, avoiding raw-text exchange. We adopt the parametric approach but face two unique challenges induced by FedRAG: high storage and communication costs from per-document adapters, and destructive aggregation caused by indiscriminately merging multiple adapters. We present FedMosaic, the first federated RAG framework built on parametric adapters. FedMosaic clusters semantically related documents into multi-document adapters with document-specific masks to reduce overhead while preserving specificity, and performs selective adapter aggregation to combine only relevance-aligned, nonconflicting adapters. Experiments show that FedMosaic achieves an average 10.9% higher accuracy than state-of-the-art methods in four categories, while lowering storage costs by 78.8% to 86.3% and communication costs by 91.4%, and never sharing raw documents.
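
To make the two ideas in the abstract more concrete, here is a minimal Python sketch of (a) a shared multi-document adapter specialized by a document-specific mask and (b) selective aggregation that keeps only relevant, non-conflicting adapters. The abstract does not specify how masks, relevance, or conflicts are actually computed, so everything here is an assumption for illustration: adapters are flat lists of weight deltas, masks are binary and applied elementwise, relevance is a given score, conflict is negative cosine similarity, and the names `apply_mask`, `select_and_merge`, `min_relevance`, and `min_cosine` are hypothetical. It is not the authors' implementation.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors (0.0 if either is zero)."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return 0.0 if nu == 0 or nv == 0 else dot / (nu * nv)

def apply_mask(cluster_delta, doc_mask):
    """Assumed masking scheme: an elementwise binary mask picks out the part
    of a shared multi-document adapter that is specific to one document."""
    return [w * m for w, m in zip(cluster_delta, doc_mask)]

def select_and_merge(candidates, min_relevance=0.5, min_cosine=0.0):
    """Greedy selective aggregation (illustrative heuristic, not from the paper).

    candidates: list of (relevance, delta) pairs.
    An adapter is kept if it is relevant enough and does not point against
    any adapter already kept; kept adapters are relevance-weighted averaged.
    """
    chosen = []
    for relevance, delta in sorted(candidates, key=lambda c: c[0], reverse=True):
        if relevance < min_relevance:
            continue  # not relevance-aligned with the query
        if all(cosine(delta, kept) >= min_cosine for _, kept in chosen):
            chosen.append((relevance, delta))  # no conflict with kept adapters
    if not chosen:
        return None, chosen
    total = sum(r for r, _ in chosen)
    dim = len(chosen[0][1])
    merged = [sum(r * d[i] for r, d in chosen) / total for i in range(dim)]
    return merged, chosen

# One shared adapter for a cluster of documents, specialized by two masks.
cluster_delta = [1.0, 2.0, -1.0, 0.5]
doc_a = apply_mask(cluster_delta, [1, 1, 0, 0])
doc_b = apply_mask(cluster_delta, [0, 1, 0, 1])

# Adapters from other silos: one opposes doc_a, one is irrelevant to the query.
conflicting = [-1.0, -2.0, 0.0, 0.0]
irrelevant = [5.0, 5.0, 5.0, 5.0]

merged, chosen = select_and_merge(
    [(0.9, doc_a), (0.6, doc_b), (0.7, conflicting), (0.1, irrelevant)]
)
```

In this toy run the conflicting adapter is dropped despite its fairly high relevance score, and the irrelevant one never reaches the conflict check, so only the two masked views of the shared adapter are merged. In a real system the deltas would be low-rank adapter weights and the merged result would be applied to the frozen LLM at inference.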

