CL · Apr 1, 2025

Self-Routing RAG: Binding Selective Retrieval with Knowledge Verbalization

arXiv:2504.01018v2 · 5 citations · h-index: 20
Originality: Incremental advance
AI Analysis

The paper addresses efficiency and accuracy issues in RAG systems and represents an incremental advance in selective retrieval methods.

The paper tackles suboptimal retrieval decisions in retrieval-augmented generation by proposing Self-Routing RAG, which lets a large language model choose dynamically between external retrieval and verbalizing its own internal knowledge. Compared to the strongest selective retrieval baseline, this yields 29% fewer retrievals and a 5.1% performance improvement.

Selective retrieval improves the accuracy and efficiency of retrieval-augmented generation (RAG) by reducing distractions from low-quality retrievals. However, existing approaches underutilize the inherent knowledge of large language models (LLMs), leading to suboptimal retrieval decisions and degraded generation performance. To bridge this gap, we propose Self-Routing RAG (SR-RAG), a novel framework that binds selective retrieval with knowledge verbalization. SR-RAG enables an LLM to dynamically decide whether to retrieve external knowledge or verbalize its own parametric knowledge. To this end, we design a multi-task objective that jointly optimizes an LLM for knowledge source selection, knowledge verbalization, and response generation. SR-RAG further incorporates a nearest neighbor search mechanism at inference time to improve the accuracy of knowledge source decisions under domain shifts. Fine-tuning three LLMs with SR-RAG significantly improves their response accuracy and reduces their inference latency. Compared to the strongest selective retrieval baseline, SR-RAG reduces the number of retrievals by 29% while improving performance by 5.1%.
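
To make the inference-time routing idea concrete, here is a minimal sketch of how a nearest neighbor vote could adjust a model's own source decision. It is an illustration under stated assumptions, not the paper's implementation: the abstract does not specify the datastore contents, the similarity measure, or how the neighbor signal is combined with the model's preference. The names `route`, `knn_vote`, `p_retrieve_model`, `weight`, and `threshold`, the cosine similarity, and the linear blend are all hypothetical choices made for this example.

```python
import math

RETRIEVE = "retrieve"    # hypothetical label: call the external retriever
VERBALIZE = "verbalize"  # hypothetical label: let the LLM write out its own knowledge


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def knn_vote(query_vec, datastore, k=3):
    """Fraction of the k most similar stored queries that are labelled RETRIEVE."""
    ranked = sorted(datastore, key=lambda item: cosine(query_vec, item[0]), reverse=True)
    top = ranked[:k]
    if not top:
        return 0.5  # no evidence either way
    return sum(1 for _, label in top if label == RETRIEVE) / len(top)


def route(query_vec, p_retrieve_model, datastore, k=3, weight=0.5, threshold=0.5):
    """Blend the model's own retrieve probability with the neighbor vote (assumed linear mix)."""
    p_knn = knn_vote(query_vec, datastore, k)
    score = (1 - weight) * p_retrieve_model + weight * p_knn
    return RETRIEVE if score > threshold else VERBALIZE


# Toy datastore: (query representation, source that worked better for that query).
datastore = [
    ([1.0, 0.0], RETRIEVE),
    ([0.9, 0.1], RETRIEVE),
    ([0.0, 1.0], VERBALIZE),
    ([0.1, 0.9], VERBALIZE),
    ([0.2, 0.8], VERBALIZE),
]

# The model alone leans toward retrieval (0.6), but the query's neighbors all
# favored verbalization, so the blended score falls below the threshold.
decision = route([0.05, 0.95], p_retrieve_model=0.6, datastore=datastore)
```

In this toy setup the neighbor vote overrides a mildly confident model preference, which is the kind of correction one would want when the model's own source preference is miscalibrated on an unfamiliar domain. In practice the vectors would come from the LLM's hidden states or an encoder, and the blend weight would need tuning.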
