AI CL IR | Mar 7, 2025

R1-Searcher: Incentivizing the Search Capability in LLMs via Reinforcement Learning

Huatong Song, Jinhao Jiang, Yingqian Min, Jie Chen, Zhipeng Chen, Wayne Xin Zhao, Lei Fang, Ji-Rong Wen

arXiv:2503.05592v2 | 234 citations | h-index: 25

Originality: Highly original

AI Analysis

R1-Searcher addresses the inaccuracies and hallucinations that arise when users need up-to-date or extensive external knowledge from LLMs. It is a novel method rather than an incremental improvement.

The paper tackles LLMs' reliance on internal knowledge for time-sensitive or knowledge-intensive questions. It proposes R1-Searcher, a two-stage RL approach that enhances search capabilities and yields significant performance improvements over previous RAG methods, even when compared with the closed-source GPT-4o-mini.

Existing Large Reasoning Models (LRMs) have shown the potential of reinforcement learning (RL) to enhance the complex reasoning capabilities of Large Language Models (LLMs). While they achieve remarkable performance on challenging tasks such as mathematics and coding, they often rely on their internal knowledge to solve problems, which can be inadequate for time-sensitive or knowledge-intensive questions, leading to inaccuracies and hallucinations. To address this, we propose R1-Searcher, a novel two-stage outcome-based RL approach designed to enhance the search capabilities of LLMs. This method allows LLMs to autonomously invoke external search systems to access additional knowledge during the reasoning process. Our framework relies exclusively on RL, without requiring process rewards or distillation for a cold start. Our experiments demonstrate that our method significantly outperforms previous strong RAG methods, even when compared to the closed-source GPT-4o-mini.
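
The abstract describes a model that calls an external search system mid-reasoning and is trained only on outcome rewards. The sketch below is one way such a loop could look, not the authors' implementation. The marker strings, the `generate` and `search` callables, the `max_searches` budget, and the exact-match reward are all hypothetical stand-ins; the abstract specifies none of them.

```python
from typing import Callable, List

# Hypothetical markers; the abstract does not specify the actual format.
QUERY_OPEN, QUERY_CLOSE = "<query>", "</query>"
DOCS_OPEN, DOCS_CLOSE = "<docs>", "</docs>"


def reason_with_search(
    question: str,
    generate: Callable[[str], str],      # assumed: returns the next chunk of model text
    search: Callable[[str], List[str]],  # assumed: returns retrieved passages
    max_searches: int = 3,               # assumed budget, not from the paper
) -> str:
    """Alternate between generation and retrieval until no query is issued."""
    context = question
    for _ in range(max_searches):
        step = generate(context)
        context += step
        start = step.rfind(QUERY_OPEN)
        end = step.rfind(QUERY_CLOSE)
        if start == -1 or end < start:
            # The model did not request a search: treat this as the final output.
            return context
        query = step[start + len(QUERY_OPEN):end].strip()
        # Append retrieved passages so the next generation step can read them.
        context += DOCS_OPEN + "\n".join(search(query)) + DOCS_CLOSE
    # Search budget exhausted: let the model finish with what it has.
    context += generate(context)
    return context


def outcome_reward(predicted: str, gold: str) -> float:
    """Score only the final answer (exact match used here as a stand-in)."""
    return 1.0 if predicted.strip().lower() == gold.strip().lower() else 0.0
```

Because the reward looks only at the final answer, nothing in this sketch scores individual reasoning or search steps, which is the sense in which the approach is outcome-based rather than process-based.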
