Factuality and Transparency Are All RAG Needs! Self-Explaining Contrastive Evidence Re-ranking
This work addresses the need for more reliable and transparent RAG systems, particularly in safety-critical domains such as healthcare, though it appears incremental because it builds on existing RAG and contrastive learning methods.
The paper aims to improve retrieval-augmented generation (RAG) systems by introducing Self-Explaining Contrastive Evidence Re-Ranking (CER), which restructures retrieval around factual evidence using contrastive learning and token-level attribution rationales. In initial experiments on clinical trial reports, the authors report improved retrieval accuracy and a reduced potential for hallucinations.
This extended abstract introduces Self-Explaining Contrastive Evidence Re-Ranking (CER), a novel method that restructures retrieval around factual evidence by fine-tuning embeddings with contrastive learning and generating token-level attribution rationales for each retrieved passage. Hard negatives are automatically selected using a subjectivity-based criterion, forcing the model to pull factual rationales closer while pushing subjective or misleading explanations apart. As a result, the method creates an embedding space explicitly aligned with evidential reasoning. We evaluated our method on clinical trial reports, and initial experimental results show that CER improves retrieval accuracy, mitigates the potential for hallucinations in RAG systems, and provides transparent, evidence-based retrieval that enhances reliability, especially in safety-critical domains.
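The contrastive objective described above can be illustrated with a minimal sketch. The abstract does not state the exact loss, so this example assumes a margin-based triplet loss over cosine similarity. It also assumes that each candidate negative already carries a precomputed subjectivity score in [0, 1]. The function names, the 0.5 threshold, and the 0.2 margin are hypothetical choices for illustration, not values from the paper.

```python
import math


def cosine(u, v):
    """Cosine similarity between two non-zero vectors given as lists of floats."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def select_hard_negatives(candidates, threshold=0.5):
    """Keep embeddings whose subjectivity score meets the threshold.

    candidates: list of (embedding, subjectivity_score) pairs.
    The threshold is an assumed, illustrative value.
    """
    return [emb for emb, score in candidates if score >= threshold]


def triplet_loss(anchor, positive, negatives, margin=0.2):
    """Mean margin loss: penalize negatives that are not at least
    `margin` less similar to the anchor than the factual positive."""
    if not negatives:
        return 0.0
    pos_sim = cosine(anchor, positive)
    losses = [max(0.0, margin - pos_sim + cosine(anchor, neg)) for neg in negatives]
    return sum(losses) / len(losses)


# Toy usage: a query, one factual rationale, and two candidate negatives.
query = [1.0, 0.0]
factual = [0.9, 0.1]
candidates = [([0.8, 0.6], 0.9), ([0.0, 1.0], 0.1)]  # (embedding, subjectivity)
hard_negatives = select_hard_negatives(candidates)
loss = triplet_loss(query, factual, hard_negatives)
```

Minimizing such a loss while fine-tuning the encoder would pull factual rationales toward the query and push subjective ones away, which is the behavior the abstract describes. The actual CER training setup may differ.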