CL AI Dec 11, 2025

Cooperative Retrieval-Augmented Generation for Question Answering: Mutual Information Exchange and Ranking by Contrasting Layers

arXiv:2512.10422v34.91 citations Has Code

Originality: Incremental advance

AI Analysis

The paper addresses factual inaccuracies in LLM outputs for question answering, in both multi-hop and simple QA tasks, and represents an incremental improvement over existing RAG methods.

The paper targets incorrect retrievals and hallucinations in retrieval-augmented generation for question answering. It proposes CoopRAG, a framework in which a retriever and an LLM cooperate by exchanging knowledge and in which layers of the retriever are contrasted to rank the retrieved documents. The authors report that CoopRAG consistently outperforms state-of-the-art methods on three multi-hop QA datasets and one simple QA dataset.

Since large language models (LLMs) have a tendency to generate factually inaccurate output, retrieval-augmented generation (RAG) has gained significant attention as a key means to mitigate this downside of harnessing only LLMs. However, existing RAG methods for simple and multi-hop question answering (QA) are still prone to incorrect retrievals and hallucinations. To address these limitations, we propose CoopRAG, a novel RAG framework for the question answering task in which a retriever and an LLM work cooperatively with each other by exchanging informative knowledge, and the earlier and later layers of the retriever model work cooperatively with each other to accurately rank the retrieved documents relevant to a given query. In this framework, we (i) unroll a question into sub-questions and a reasoning chain in which uncertain positions are masked, (ii) retrieve the documents relevant to the question augmented with the sub-questions and the reasoning chain, (iii) rerank the documents by contrasting layers of the retriever, and (iv) reconstruct the reasoning chain by filling the masked positions via the LLM. Our experiments demonstrate that CoopRAG consistently outperforms state-of-the-art QA methods on three multi-hop QA datasets as well as a simple QA dataset in terms of both the retrieval and QA performances. Our code is available.
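Steps (ii) to (iv) of the abstract can be illustrated with a small, self-contained sketch. It is not the authors' implementation: the abstract does not give the scoring rule, so the contrast used here (late-layer similarity minus a weighted early-layer similarity) is an assumption, as are the names `Doc`, `contrast_score`, `rerank`, `augment_query`, `fill_masks`, the `alpha` weight, and the `[MASK]` token. The embeddings are hand-written toy vectors rather than outputs of a real retriever, and step (i), which would call an LLM, is represented only by its already unrolled output.

```python
from dataclasses import dataclass
from math import sqrt
from typing import List, Sequence

Vector = Sequence[float]


def cosine(a: Vector, b: Vector) -> float:
    """Cosine similarity; returns 0.0 if either vector has zero norm."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = sqrt(sum(x * x for x in a))
    norm_b = sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


@dataclass
class Doc:
    text: str
    early: Vector  # embedding from an earlier retriever layer (assumed available)
    late: Vector   # embedding from a later retriever layer (assumed available)


def augment_query(question: str, sub_questions: List[str], masked_chain: str) -> str:
    """Step (ii): build the retrieval query from the question, its
    sub-questions, and the masked reasoning chain (simple concatenation,
    chosen here for illustration)."""
    return " ".join([question, *sub_questions, masked_chain])


def contrast_score(q_early: Vector, q_late: Vector, doc: Doc, alpha: float = 0.5) -> float:
    """Step (iii), assumed form: reward late-layer agreement and penalize
    agreement that is already present in the early layer."""
    return cosine(q_late, doc.late) - alpha * cosine(q_early, doc.early)


def rerank(q_early: Vector, q_late: Vector, docs: List[Doc], alpha: float = 0.5) -> List[Doc]:
    """Sort documents by descending contrast score."""
    return sorted(
        docs,
        key=lambda d: contrast_score(q_early, q_late, d, alpha),
        reverse=True,
    )


def fill_masks(masked_chain: str, fillers: List[str], mask: str = "[MASK]") -> str:
    """Step (iv): replace masked positions left to right. In the paper an LLM
    produces the fillers; here they are passed in directly."""
    for filler in fillers:
        masked_chain = masked_chain.replace(mask, filler, 1)
    return masked_chain


# Toy data: two-dimensional "embeddings" written by hand.
docs = [
    Doc("A", early=(1.0, 0.0), late=(0.0, 1.0)),
    Doc("B", early=(0.0, 1.0), late=(0.0, 1.0)),
    Doc("C", early=(0.0, 1.0), late=(1.0, 0.0)),
]
ranked = rerank(q_early=(1.0, 0.0), q_late=(0.0, 1.0), docs=docs)
query = augment_query("Q", ["s1", "s2"], "c [MASK]")
chain = fill_masks("X was born in [MASK], capital of [MASK].", ["Paris", "France"])
```

With a real retriever, the `early` and `late` vectors would come from two different hidden layers of the same encoder, and `alpha` would need tuning on held-out data; both choices are left open by the abstract.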
