CL IR Mar 12, 2025

ClaimTrust: Propagation Trust Scoring for RAG Systems

arXiv:2503.10702v1 2 citations h-index: 1

Originality Incremental advance

AI Analysis

This paper addresses trustworthiness in RAG systems for content generation, but the advance is incremental: it builds on existing methods such as PageRank and claim extraction.

The paper tackles the problem of ensuring that the information retrieved by RAG systems is trustworthy. It introduces ClaimTrust, a propagation-based trust scoring framework that runs a modified PageRank-inspired algorithm on document graphs. It reports an 11.2% rate of significant connections and assigns higher trust scores to verified documents while penalizing false ones.

The rapid adoption of retrieval-augmented generation (RAG) systems has revolutionized large-scale content generation but has also highlighted the challenge of ensuring trustworthiness in retrieved information. This paper introduces ClaimTrust, a propagation-based trust scoring framework that dynamically evaluates the reliability of documents in a RAG system. Using a modified PageRank-inspired algorithm, ClaimTrust propagates trust scores across documents based on relationships derived from extracted factual claims. We preprocess and analyze 814 political news articles from Kaggle's Fake News Detection Dataset to extract 2,173 unique claims and classify 965 meaningful relationships (supporting or contradicting). By representing the dataset as a document graph, ClaimTrust iteratively updates trust scores until convergence, effectively differentiating trustworthy articles from unreliable ones. Our methodology, which leverages embedding-based filtering for efficient claim comparison and relationship classification, achieves an 11.2% rate of significant connections while maintaining computational scalability. Experimental results demonstrate that ClaimTrust successfully assigns higher trust scores to verified documents while penalizing those containing false information. Future directions include fine-tuned claim extraction and comparison (Li et al., 2022), parameter optimization, enhanced language model utilization, and robust evaluation metrics to generalize the framework across diverse datasets and domains.
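
The iterative trust update described in the abstract can be illustrated with a minimal sketch. This is not the paper's exact update rule, which the abstract does not spell out. The function name `propagate_trust`, the signed edge weights (+1 for a supporting relationship, -1 for a contradicting one), the damping factor of 0.85, the normalization by total incoming weight, and the clamping of scores to [0, 1] are all assumptions chosen for illustration.

```python
from typing import List, Tuple

# Hypothetical edge format: (source_doc, target_doc, weight), where a
# positive weight means "source supports target" and a negative weight
# means "source contradicts target". This encoding is an assumption.
Edge = Tuple[int, int, float]


def propagate_trust(
    initial: List[float],
    edges: List[Edge],
    damping: float = 0.85,   # assumed value, borrowed from classic PageRank
    tol: float = 1e-6,
    max_iter: int = 100,
) -> List[float]:
    """Iteratively update per-document trust scores until they stabilize."""
    n = len(initial)

    # Group incoming edges by target document once, up front.
    incoming: List[List[Tuple[int, float]]] = [[] for _ in range(n)]
    for src, dst, weight in edges:
        incoming[dst].append((src, weight))

    scores = list(initial)
    for _ in range(max_iter):
        new_scores = []
        for doc in range(n):
            links = incoming[doc]
            if not links:
                # No evidence from other documents: keep the prior score.
                new_scores.append(initial[doc])
                continue
            total_weight = sum(abs(w) for _, w in links)
            # Signed, normalized trust flowing in from neighbours.
            flow = sum(w * scores[src] for src, w in links) / total_weight
            value = (1.0 - damping) * initial[doc] + damping * flow
            # Clamp so that contradictions cannot push trust below zero.
            new_scores.append(min(1.0, max(0.0, value)))

        # Stop once the largest change between iterations is below tol.
        delta = max(abs(a - b) for a, b in zip(new_scores, scores))
        scores = new_scores
        if delta < tol:
            break
    return scores


# Toy graph: document 0 is verified, documents 1 and 2 start neutral.
# Document 0 supports document 1 and contradicts document 2.
example_scores = propagate_trust(
    initial=[1.0, 0.5, 0.5],
    edges=[(0, 1, 1.0), (0, 2, -1.0)],
)
print(example_scores)
```

In this toy graph the supported document ends with a higher score than its neutral prior and the contradicted document ends with a lower one, which mirrors the qualitative behavior the abstract reports. Because the damping factor is below 1 and incoming weights are normalized, each update is a contraction, so the loop settles rather than oscillating.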
