Grounding Long-Context Reasoning with Contextual Normalization for Retrieval-Augmented Generation
This work addresses reliability issues in RAG systems that depend on long-context reasoning, though its contribution is incremental because it builds on existing RAG methods.
The paper examines how the formatting of retrieved documents in Retrieval-Augmented Generation (RAG) affects accuracy and stability. It shows that choices such as delimiters can cause substantial performance shifts, and it introduces Contextual Normalization to improve robustness and long-context utilization.
Retrieval-Augmented Generation (RAG) has become an essential approach for extending the reasoning and knowledge capacity of large language models (LLMs). While prior research has primarily focused on retrieval quality and prompting strategies, the influence of how the retrieved documents are framed, i.e., context format, remains underexplored. We show that seemingly superficial choices, such as delimiters or structural markers in key-value extraction, can induce substantial shifts in accuracy and stability, even when semantic content is identical. To systematically investigate this effect, we design controlled experiments that vary context density, delimiter styles, and positional placement, revealing the underlying factors that govern performance differences. Building on these insights, we introduce Contextual Normalization, a lightweight strategy that adaptively standardizes context representations before generation. Extensive experiments on both controlled and real-world RAG benchmarks across diverse settings demonstrate that the proposed strategy consistently improves robustness to order variation and strengthens long-context utilization. These findings underscore that reliable RAG depends not only on retrieving the right content, but also on how that content is presented, offering both new empirical evidence and a practical technique for better long-context reasoning.
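To make the notion of "context format" concrete, the sketch below renders the same key-value content under several delimiter styles and then maps any of them back to a single canonical form before it would be passed to a model. This is only an illustration of the general idea of standardizing context representations. It is not the paper's Contextual Normalization procedure, whose details are not given in the abstract. The delimiter set, the `render` and `normalize` helpers, and the choice of a colon-separated canonical style are all assumptions made for this example.

```python
import re

# Hypothetical delimiter styles (an assumption for illustration; the
# abstract does not list the styles used in the experiments).
STYLES = {
    "colon": "{k}: {v}",
    "equals": "{k}={v}",
    "arrow": "{k} -> {v}",
    "pipe": "{k} | {v}",
}

# Matches any delimiter above, including optional surrounding whitespace.
_DELIM = re.compile(r"\s*(?:->|:|=|\|)\s*")


def render(pairs, style):
    """Render (key, value) pairs, one per line, in the given delimiter style."""
    template = STYLES[style]
    return "\n".join(template.format(k=k, v=v) for k, v in pairs)


def normalize(context, style="colon"):
    """Parse lines written in any known style and re-render them in one style.

    Assumes keys contain none of the delimiter characters.
    """
    pairs = []
    for line in context.splitlines():
        line = line.strip()
        if not line:
            continue  # ignore blank lines
        parts = _DELIM.split(line, maxsplit=1)
        if len(parts) != 2:
            raise ValueError(f"no known delimiter in line: {line!r}")
        pairs.append((parts[0], parts[1]))
    return render(pairs, style)


if __name__ == "__main__":
    pairs = [("a1f3", "9c2e"), ("77b0", "e41d")]
    for name in STYLES:
        raw = render(pairs, name)
        # Semantically identical contexts collapse to one representation.
        print(name, "->", normalize(raw).splitlines())
```

Under these assumptions, every variant of the same content reaches the model in one fixed surface form, so any format sensitivity of the kind the abstract describes is confined to a single, known format. An adaptive strategy, as the abstract describes, would presumably choose the target representation rather than hard-code it.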