LG, CL, IR | Mar 26, 2025

Open Deep Search: Democratizing Search with Open-source Reasoning Agents

arXiv:2503.20201v1 | 45 citations | h-index: 55 | Has Code
Originality: Incremental advance
AI Analysis

The paper democratizes search AI by providing an open-source alternative to proprietary solutions, though its contribution is incremental: it combines existing components such as reasoning agents and search tools.

The paper tackles the gap between proprietary and open-source search AI by introducing Open Deep Search (ODS), a framework that augments open-source LLMs with reasoning agents and a novel web search tool. The results show that ODS nearly matches or surpasses state-of-the-art baselines on SimpleQA and FRAMES; on FRAMES it improves accuracy by 9.7% over GPT-4o Search Preview.

We introduce Open Deep Search (ODS) to close the growing gap between proprietary search AI solutions, such as Perplexity's Sonar Reasoning Pro and OpenAI's GPT-4o Search Preview, and their open-source counterparts. The main innovation introduced in ODS is to augment the reasoning capabilities of the latest open-source LLMs with reasoning agents that can judiciously use web search tools to answer queries. Concretely, ODS consists of two components that work with a base LLM chosen by the user: Open Search Tool and Open Reasoning Agent. Open Reasoning Agent interprets the given task and completes it by orchestrating a sequence of actions that includes calling tools, one of which is the Open Search Tool. Open Search Tool is a novel web search tool that outperforms proprietary counterparts. Together with powerful open-source reasoning LLMs, such as DeepSeek-R1, ODS nearly matches and sometimes surpasses the existing state-of-the-art baselines on two benchmarks: SimpleQA and FRAMES. For example, on the FRAMES evaluation benchmark, ODS improves on the best existing baseline, the recently released GPT-4o Search Preview, by 9.7% in accuracy. ODS is a general framework for seamlessly augmenting any LLM -- for example, DeepSeek-R1, which achieves 82.4% on SimpleQA and 30.1% on FRAMES -- with search and reasoning capabilities to achieve state-of-the-art performance: 88.3% on SimpleQA and 75.3% on FRAMES.
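The two-component design described in the abstract (a reasoning agent that orchestrates tool calls, one of which is a search tool, on top of a user-chosen base LLM) can be illustrated with a minimal sketch. This is not the ODS implementation: the `ReasoningAgent` class, the `CALL name: argument` / `FINAL: answer` text protocol, and the stubbed `stub_llm` and `stub_search` functions are all hypothetical names introduced here only to show the general shape of an agent loop.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List

# A tool takes a text argument and returns a text observation.
Tool = Callable[[str], str]


@dataclass
class ReasoningAgent:
    """Hypothetical agent loop; not the ODS Open Reasoning Agent itself."""
    llm: Callable[[str], str]          # assumed interface: prompt in, text out
    tools: Dict[str, Tool]             # tool name -> tool function
    max_steps: int = 5                 # step budget to avoid endless loops
    trace: List[str] = field(default_factory=list)

    def run(self, query: str) -> str:
        context = f"Question: {query}"
        for _ in range(self.max_steps):
            reply = self.llm(context).strip()
            self.trace.append(reply)
            if reply.startswith("FINAL:"):
                # The model has decided it can answer.
                return reply[len("FINAL:"):].strip()
            if reply.startswith("CALL "):
                # Assumed protocol: "CALL <tool name>: <argument>"
                name, _, arg = reply[len("CALL "):].partition(":")
                tool = self.tools.get(name.strip())
                observation = (
                    tool(arg.strip()) if tool
                    else f"unknown tool {name.strip()!r}"
                )
                context += f"\n{reply}\nObservation: {observation}"
            else:
                # Plain reasoning text: keep it in the context and continue.
                context += f"\n{reply}"
        return "no answer within step budget"


def stub_search(query: str) -> str:
    """Stand-in for a web search tool; returns a canned snippet."""
    return "Paris is the capital of France."


def stub_llm(prompt: str) -> str:
    """Stand-in for the base LLM: search first, then answer."""
    if "Observation:" not in prompt:
        return "CALL search: capital of France"
    return "FINAL: Paris"


agent = ReasoningAgent(llm=stub_llm, tools={"search": stub_search})
answer = agent.run("What is the capital of France?")
print(answer)
```

In this sketch the base LLM and the search tool are both injected as plain callables, which mirrors the abstract's point that the framework works with a base LLM chosen by the user; a real system would replace the stubs with an actual model call and a real search backend.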

Code Implementations: 1 repo
