Tongyi DeepResearch Technical Report
This work addresses the need for autonomous deep research capabilities in AI systems, though its contribution appears incremental because it builds on existing agentic and training frameworks.
The paper tackles long-horizon, deep information-seeking research tasks by developing Tongyi DeepResearch, an agentic large language model with 30.5 billion total parameters, of which only 3.3 billion are activated per token. The model achieves state-of-the-art performance across multiple benchmarks, such as Humanity's Last Exam and BrowseComp.
We present Tongyi DeepResearch, an agentic large language model specifically designed for long-horizon, deep information-seeking research tasks. To incentivize autonomous deep research agency, Tongyi DeepResearch is developed through an end-to-end training framework that combines agentic mid-training and agentic post-training, enabling scalable reasoning and information seeking across complex tasks. We design a highly scalable data synthesis pipeline that is fully automatic, does not rely on costly human annotation, and empowers all training stages. By constructing customized environments for each stage, our system enables stable and consistent interactions throughout. Tongyi DeepResearch, featuring 30.5 billion total parameters with only 3.3 billion activated per token, achieves state-of-the-art performance across a range of agentic deep research benchmarks, including Humanity's Last Exam, BrowseComp, BrowseComp-ZH, WebWalkerQA, xbench-DeepSearch, FRAMES, and xbench-DeepSearch-2510. We open-source the model, framework, and complete solutions to empower the community.
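To put the parameter figures above in perspective, the short sketch below computes the share of parameters activated per token from the two counts stated in the abstract. The constant and function names are illustrative assumptions and are not part of the released framework.

```python
# Parameter counts as stated in the abstract, in billions.
TOTAL_PARAMS_B = 30.5
ACTIVE_PARAMS_B = 3.3


def activated_fraction(active_b: float, total_b: float) -> float:
    """Return the share of parameters activated per token (hypothetical helper)."""
    if total_b <= 0:
        raise ValueError("total parameter count must be positive")
    return active_b / total_b


fraction = activated_fraction(ACTIVE_PARAMS_B, TOTAL_PARAMS_B)
print(f"Activated per token: {fraction:.1%} of all parameters")
```

Under these figures, roughly 10.8% of the model's parameters are active for any given token.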