Probing-RAG: Self-Probing to Guide Language Models in Selective Document Retrieval
The work addresses retrieval inefficiencies in RAG systems for real-world applications such as question answering, though the contribution is incremental because it builds on existing RAG methods.
The paper tackles the problem of inefficient retrieval in Retrieval-Augmented Generation (RAG) by proposing Probing-RAG, which uses hidden state representations to adaptively decide when to retrieve documents, resulting in improved performance on open-domain QA datasets and reduced redundant retrievals.
Retrieval-Augmented Generation (RAG) enhances language models by retrieving and incorporating relevant external knowledge. However, traditional retrieve-and-generate processes may not be optimized for real-world scenarios, where queries might require multiple retrieval steps or none at all. In this paper, we propose Probing-RAG, which utilizes the hidden state representations from the intermediate layers of language models to adaptively determine the necessity of additional retrievals for a given query. By employing a pre-trained prober, Probing-RAG effectively captures the model's internal cognition, enabling reliable decision-making about retrieving external documents. Experimental results across five open-domain QA datasets demonstrate that Probing-RAG outperforms previous methods while reducing the number of redundant retrieval steps.
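The adaptive retrieval loop described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not specify the prober's architecture, so a single logistic probe over one hidden-state vector is assumed here as a stand-in. The names `prober_score`, `adaptive_answer`, `toy_generate`, `toy_retrieve`, the `threshold` of 0.5, and the `max_retrievals` cap are all hypothetical, and the toy generator fabricates a one-dimensional "hidden state" purely so the example runs without a model.

```python
import math
from typing import Callable, List, Sequence, Tuple

# (answer text, hidden-state vector) returned by one generation call.
GenerateFn = Callable[[str, List[str]], Tuple[str, List[float]]]
RetrieveFn = Callable[[str, int], List[str]]


def prober_score(hidden: Sequence[float], weights: Sequence[float], bias: float) -> float:
    """Assumed logistic probe: estimated probability that no further retrieval is needed."""
    z = sum(h * w for h, w in zip(hidden, weights)) + bias
    return 1.0 / (1.0 + math.exp(-z))


def adaptive_answer(
    query: str,
    generate: GenerateFn,
    retrieve: RetrieveFn,
    weights: Sequence[float],
    bias: float,
    threshold: float = 0.5,   # assumption: decision cut-off on the probe score
    max_retrievals: int = 3,  # assumption: hard cap on retrieval steps
) -> Tuple[str, int]:
    """Generate first, then retrieve only while the probe signals that more evidence is needed."""
    docs: List[str] = []
    answer, hidden = generate(query, docs)
    retrievals = 0
    while prober_score(hidden, weights, bias) < threshold and retrievals < max_retrievals:
        docs.extend(retrieve(query, retrievals))
        retrievals += 1
        answer, hidden = generate(query, docs)
    return answer, retrievals


def toy_generate(query: str, docs: List[str]) -> Tuple[str, List[float]]:
    # Hypothetical stand-in for a language model call that also exposes an
    # intermediate-layer hidden state. "Easy" queries look confident at once;
    # other queries become confident only after one document is available.
    confidence = 1.0 if query.startswith("easy") else len(docs) - 0.5
    return f"answer to {query!r} using {len(docs)} docs", [confidence]


def toy_retrieve(query: str, step: int) -> List[str]:
    # Hypothetical retriever returning one placeholder document per step.
    return [f"doc-{step} for {query}"]
```

Under these toy assumptions, calling `adaptive_answer("easy question", toy_generate, toy_retrieve, [4.0], 0.0)` skips retrieval entirely, while the same call with `"hard question"` performs one retrieval before stopping. In a real system the hidden state would come from the language model's intermediate layers and the probe weights from a separately trained prober.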