CL | Oct 12, 2025

RECON: Reasoning with Condensation for Efficient Retrieval-Augmented Generation

arXiv:2510.10448v1 | 2 citations | h-index: 2 | Has Code
Originality: Incremental advance
AI Analysis

This paper addresses the high cost and degraded performance that long, noisy retrieved context causes in RAG systems, a problem relevant to AI practitioners. It is an incremental improvement that combines retrieval-augmented reasoning with learned summarization.

The paper tackles inefficient context management in retrieval-augmented generation systems by introducing RECON, a framework that integrates a summarization module to compress evidence, reducing total context length by 35% and improving EM scores by up to 14.5% on QA benchmarks.
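For readers unfamiliar with the EM (exact match) metric quoted above, the following is a minimal sketch of how it is commonly computed on QA benchmarks. It assumes SQuAD-style answer normalization (lowercasing, stripping punctuation and English articles); the function names are illustrative and are not taken from the RECON code base, whose exact scoring script may differ.

```python
import re
import string
from typing import List, Sequence


def normalize_answer(text: str) -> str:
    """Lowercase, drop punctuation and English articles, collapse whitespace.

    Assumption: SQuAD-style normalization; the paper's script may differ.
    """
    text = text.lower()
    text = "".join(ch for ch in text if ch not in string.punctuation)
    text = re.sub(r"\b(a|an|the)\b", " ", text)
    return " ".join(text.split())


def exact_match(prediction: str, gold_answers: Sequence[str]) -> int:
    """Return 1 if the normalized prediction equals any normalized gold answer."""
    pred = normalize_answer(prediction)
    return int(any(pred == normalize_answer(gold) for gold in gold_answers))


def average_em(predictions: Sequence[str], references: Sequence[List[str]]) -> float:
    """Mean exact-match score over a dataset (0.0 for an empty dataset)."""
    scores = [exact_match(p, refs) for p, refs in zip(predictions, references)]
    return sum(scores) / len(scores) if scores else 0.0
```

Under this definition a prediction scores 1 only if it matches a gold answer exactly after normalization, so partially correct answers receive no credit.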

Retrieval-augmented generation (RAG) systems trained using reinforcement learning (RL) with reasoning are hampered by inefficient context management, where long, noisy retrieved documents increase costs and degrade performance. We introduce RECON (REasoning with CONdensation), a framework that integrates an explicit summarization module to compress evidence within the reasoning loop. Our summarizer is trained via a two-stage process: relevance pretraining on QA datasets, followed by multi-aspect distillation from proprietary LLMs to ensure factuality and clarity. Integrated into the Search-R1 pipeline, RECON reduces total context length by 35%, leading to improved training speed and inference latency, while simultaneously improving RAG performance on downstream QA benchmarks. Notably, it boosts the average EM score of the 3B model by 14.5% and the 7B model by 3.0%, showing particular strength in multi-hop QA. RECON demonstrates that learned context compression is essential for building practical, scalable, and performant RAG systems. Our code implementation is made available at https://github.com/allfornancy/RECON.
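The idea of compressing evidence inside the reasoning loop can be sketched as below. This is a toy illustration under stated assumptions, not the RECON or Search-R1 implementation: the `SEARCH:`, `EVIDENCE:`, and `ANSWER:` prefixes, the function `reason_with_condensation`, and the three `toy_*` stubs are all hypothetical stand-ins for the policy model, the retriever, and the trained summarizer.

```python
from typing import Callable, List


def reason_with_condensation(
    question: str,
    generate: Callable[[str], str],              # hypothetical policy LLM
    retrieve: Callable[[str], List[str]],        # hypothetical retriever
    summarize: Callable[[str, List[str]], str],  # hypothetical summarizer
    max_turns: int = 4,
) -> str:
    """Alternate reasoning and search, appending only condensed evidence."""
    context = f"Question: {question}\n"
    for _ in range(max_turns):
        step = generate(context)
        context += step + "\n"
        if step.startswith("ANSWER:"):
            return step[len("ANSWER:"):].strip()
        if step.startswith("SEARCH:"):
            query = step[len("SEARCH:"):].strip()
            docs = retrieve(query)
            # Condensation step: the raw documents never enter the context;
            # only their query-focused summary does.
            evidence = summarize(query, docs)
            context += f"EVIDENCE: {evidence}\n"
    return ""  # no answer produced within the turn budget


# Toy stand-ins so the sketch runs end to end (all assumptions).
def toy_retrieve(query: str) -> List[str]:
    return [
        "Paris is the capital of France. It hosts many museums and cafes.",
        "France is a country in Western Europe with several overseas regions.",
    ]


def toy_summarize(query: str, docs: List[str]) -> str:
    # Keep only the first sentence of the top document.
    return docs[0].split(".")[0] + "."


def toy_generate(context: str) -> str:
    if "EVIDENCE:" in context:
        return "ANSWER: Paris"
    return "SEARCH: capital of France"
```

With these stubs, `reason_with_condensation("What is the capital of France?", toy_generate, toy_retrieve, toy_summarize)` issues one search, adds a one-sentence evidence line in place of the two retrieved documents, and then answers. Shortening what each search contributes to the context is the mechanism the abstract credits for the lower context length.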

