CR, CL, IR · Nov 10, 2025

A Decentralized Retrieval Augmented Generation System with Source Reliabilities Secured on Blockchain

arXiv:2511.07577v1 · 1 citation · h-index: 5 · Has Code
Originality: Incremental advance
AI Analysis

This paper addresses the need for decentralized RAG systems that reduce costs and privacy concerns while managing source reliability. The contribution is incremental, since it builds on existing RAG and blockchain concepts.

The paper tackles the cost and privacy drawbacks of centralized retrieval-augmented generation (RAG) systems by proposing a decentralized system that dynamically scores source reliability and records the scores on a blockchain. It reports a +10.7% performance improvement over its centralized counterpart in unreliable data environments and approximately 56% marginal cost savings from batched score updates.

Existing retrieval-augmented generation (RAG) systems typically use a centralized architecture, which incurs high costs for data collection, integration, and management, and raises privacy concerns. There is a great need for a decentralized RAG system that enables foundation models to utilize information directly from data owners who maintain full control over their sources. However, decentralization brings a challenge: the numerous independent data sources vary significantly in reliability, which can diminish retrieval accuracy and response quality. To address this, our decentralized RAG system has a novel reliability scoring mechanism that dynamically evaluates each source based on the quality of the responses it helps generate and prioritizes high-quality sources during retrieval. To ensure transparency and trust, the scoring process is securely managed through blockchain-based smart contracts, creating verifiable and tamper-proof reliability records without relying on a central authority. We evaluate our decentralized system with two Llama models (3B and 8B) in two simulated environments where six data sources have different levels of reliability. Our system achieves a +10.7% performance improvement over its centralized counterpart in real-world-like unreliable data environments. Notably, it approaches the upper-bound performance of centralized systems under ideally reliable data environments. The decentralized infrastructure enables secure and trustworthy scoring management, achieving approximately 56% marginal cost savings through batched update operations. Our code and system are open-sourced at github.com/yining610/Reliable-dRAG.
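
The abstract describes three ideas: scoring each source by the quality of the responses it helps generate, prioritizing reliable sources at retrieval time, and batching score updates to reduce cost. The sketch below illustrates them in plain Python. It is not the authors' implementation: the `ReliabilityLedger` class, the exponential-moving-average update, the neutral starting score of 0.5, the `alpha` and `batch_size` values, and the similarity-times-reliability ranking rule are all assumptions made for illustration, and the "ledger" is an in-memory dictionary standing in for a smart contract.

```python
from dataclasses import dataclass, field


@dataclass
class ReliabilityLedger:
    """In-memory stand-in for an on-chain reliability record (hypothetical)."""
    alpha: float = 0.2      # assumed smoothing factor for score updates
    batch_size: int = 4     # assumed number of updates per batched write
    scores: dict = field(default_factory=dict)
    pending: list = field(default_factory=list)
    writes: int = 0         # counts simulated ledger transactions

    def score(self, source: str) -> float:
        # Unseen sources start at a neutral 0.5 (an assumption).
        return self.scores.get(source, 0.5)

    def report(self, source: str, quality: float) -> None:
        """Queue a response-quality observation in [0, 1] for a source."""
        self.pending.append((source, quality))
        if len(self.pending) >= self.batch_size:
            self.flush()

    def flush(self) -> None:
        """Apply all queued updates in one simulated transaction."""
        if not self.pending:
            return
        for source, quality in self.pending:
            old = self.score(source)
            self.scores[source] = (1 - self.alpha) * old + self.alpha * quality
        self.pending.clear()
        self.writes += 1


def rank(candidates, ledger):
    """Order (source, passage, similarity) tuples by similarity * reliability."""
    return sorted(candidates, key=lambda c: c[2] * ledger.score(c[0]), reverse=True)


ledger = ReliabilityLedger()
for _ in range(4):
    ledger.report("source_a", 1.0)   # consistently helpful source
for _ in range(4):
    ledger.report("source_b", 0.0)   # consistently unhelpful source

candidates = [
    ("source_b", "passage from b", 0.90),   # more similar, less reliable
    ("source_a", "passage from a", 0.80),   # less similar, more reliable
]
ranked = rank(candidates, ledger)
print([c[0] for c in ranked], ledger.writes)
```

In this toy run, eight quality reports are written in two batched flushes rather than eight separate writes, and the passage from the more reliable source is ranked first even though its raw similarity is lower. A real deployment would replace `flush` with a smart-contract call and would need its own measured cost figures; the savings reported in the abstract come from the authors' system, not from this sketch.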

