CL · Feb 10

Knowledge Integration Decay in Search-Augmented Reasoning of Large Language Models

Sangwon Yu, Ik-hwan Kim, Donghun Kang, Bongkyu Hwang, Junhwa Choi, Suk-hoon Jung, Seungki Hong, Taehee Lee, Sungroh Yoon

arXiv:2602.09517v1 · 0.6 · h-index: 7

Originality: Incremental advance

AI Analysis

The paper addresses a bottleneck in knowledge integration for agentic LLMs and offers a lightweight solution, though the advance is incremental because it builds on existing search-augmented reasoning paradigms.

The paper identifies Knowledge Integration Decay (KID) in the search-augmented reasoning of Large Language Models: the longer the reasoning generated before a search, the more often the model fails to integrate the retrieved evidence. It proposes Self-Anchored Knowledge Encoding (SAKE) to mitigate this and reports improved performance on multi-hop QA and complex reasoning benchmarks.

Modern Large Language Models (LLMs) have demonstrated remarkable capabilities in complex tasks by employing search-augmented reasoning to incorporate external knowledge into long chains of thought. However, we identify a critical yet underexplored bottleneck in this paradigm, termed Knowledge Integration Decay (KID). Specifically, we observe that as the length of reasoning generated before search grows, models increasingly fail to integrate retrieved evidence into subsequent reasoning steps, limiting performance even when relevant information is available. To address this, we propose Self-Anchored Knowledge Encoding (SAKE), a training-free inference-time strategy designed to stabilize knowledge utilization. By anchoring retrieved knowledge at both the beginning and end of the reasoning process, SAKE prevents it from being overshadowed by prior context, thereby preserving its semantic integrity. Extensive experiments on multi-hop QA and complex reasoning benchmarks demonstrate that SAKE significantly mitigates KID and improves performance, offering a lightweight yet effective solution for knowledge integration in agentic LLMs.
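
The anchoring idea in the abstract can be pictured as a prompt-assembly step. The abstract does not specify the mechanism, so what follows is only a minimal sketch under stated assumptions, not the authors' implementation: the function name `build_anchored_context`, the section labels, and the choice to restate the evidence text verbatim are all hypothetical.

```python
from typing import List


def build_anchored_context(question: str,
                           prior_reasoning: str,
                           retrieved_docs: List[str]) -> str:
    """Assemble a context with retrieved evidence placed both before
    and after the reasoning produced prior to the search.

    Hypothetical illustration only: the layout and labels are assumptions,
    not the procedure described in the paper.
    """
    # Number the documents so both anchors refer to the same evidence.
    evidence = "\n".join(
        f"[doc {i + 1}] {doc}" for i, doc in enumerate(retrieved_docs)
    )
    parts = [
        f"Question: {question}",
        f"Retrieved evidence:\n{evidence}",             # anchor at the beginning
        f"Reasoning so far:\n{prior_reasoning}",
        f"Retrieved evidence (restated):\n{evidence}",  # anchor at the end
        "Continue the reasoning using the evidence above.",
    ]
    return "\n\n".join(parts)


# Example with made-up inputs.
context = build_anchored_context(
    question="Which river flows through the capital of Austria?",
    prior_reasoning="The capital of Austria is Vienna. I need the river.",
    retrieved_docs=["Vienna lies on the Danube."],
)
```

In this sketch the evidence appears once ahead of the long reasoning prefix and once immediately before generation resumes, so it is not separated from the continuation by the prior context.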
