CL | AI | Oct 25, 2025

Every Activation Boosted: Scaling General Reasoner to 1 Trillion Open Language Foundation

arXiv:2510.22115v2 | 26 citations | h-index: 10
Originality: Incremental advance
AI Analysis

This work targets computational inefficiency in large-scale reasoning models, a problem relevant to AI researchers and practitioners, and represents a significant but incremental advance in scaling techniques.

The paper tackles the challenge of scaling reasoning-oriented language models efficiently by introducing Ling 2.0, a series of models built on a Mixture-of-Experts paradigm. The authors report up to 7-fold active-compute efficiency and, at the trillion-parameter scale, a new Pareto frontier of reasoning accuracy versus computational efficiency.

We introduce Ling 2.0, a series of reasoning-oriented language foundation models built upon the principle that every activation boosts reasoning capability. Designed to scale from tens of billions to one trillion parameters under a unified Mixture-of-Experts (MoE) paradigm, Ling 2.0 emphasizes high sparsity, cross-scale consistency, and efficiency guided by empirical scaling laws. The series includes three non-thinking (instruct) models - Ling-mini-2.0, Ling-flash-2.0, and Ling-1T - ranging from 16B to 1T total parameters and achieving up to 7-fold active-compute efficiency compared with dense counterparts. Ling 2.0 integrates coordinated innovations across model architecture, pre-training, post-training, and infrastructure: a high-sparsity MoE with MTP for efficient reasoning, reasoning-oriented data and mid-training CoT activation, reinforcement-based fine-tuning (DFT, Evo-CoT), and full-scale FP8 training with fine-grained heterogeneous pipelines. At the trillion scale, Ling-1T establishes a new Pareto frontier of reasoning accuracy versus computational efficiency, demonstrating that sparse activation, when properly aligned with reasoning objectives, enables scalable and efficient intelligence. Collectively, Ling 2.0 provides a coherent, open, and efficient foundation for advancing future reasoning and thinking models, including the Ring series built upon the same base.
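The abstract's central idea, sparse activation in a Mixture-of-Experts layer, can be illustrated with a minimal sketch of generic top-k routing. This is not the Ling 2.0 architecture: the expert count, the value of k, the toy "experts", and the function names `top_k_route`, `moe_forward`, and `make_expert` are all hypothetical, chosen only to show that each token touches a small fraction of the expert parameters. The sketch does not reproduce the reported 7-fold efficiency figure.

```python
import math
import random


def top_k_route(router_logits, k):
    """Return (expert_index, weight) pairs for the k highest-scoring experts."""
    # Pick the k experts with the largest router logits.
    top = sorted(range(len(router_logits)),
                 key=lambda i: router_logits[i], reverse=True)[:k]
    # Softmax over the selected logits only, so the k weights sum to 1.
    m = max(router_logits[i] for i in top)
    exps = [math.exp(router_logits[i] - m) for i in top]
    total = sum(exps)
    return [(i, e / total) for i, e in zip(top, exps)]


def moe_forward(x, experts, router_logits, k):
    """Combine the outputs of only the k routed experts for one token."""
    routed = top_k_route(router_logits, k)
    out = [0.0] * len(x)
    for idx, weight in routed:
        y = experts[idx](x)  # only routed experts are ever evaluated
        out = [o + weight * yi for o, yi in zip(out, y)]
    return out, [idx for idx, _ in routed]


def make_expert(scale):
    # Hypothetical stand-in for an expert feed-forward block: scales the input.
    return lambda x: [scale * xi for xi in x]


NUM_EXPERTS = 8  # hypothetical, not a Ling 2.0 setting
TOP_K = 2        # hypothetical, not a Ling 2.0 setting

experts = [make_expert(float(i + 1)) for i in range(NUM_EXPERTS)]
rng = random.Random(0)
# In a real model the router logits come from a learned projection of the token.
router_logits = [rng.gauss(0.0, 1.0) for _ in range(NUM_EXPERTS)]
token = [1.0, -2.0, 0.5]

output, used = moe_forward(token, experts, router_logits, TOP_K)
active_fraction = TOP_K / NUM_EXPERTS

print("experts used:", used)
print("fraction of expert parameters active per token:", active_fraction)
```

With these toy settings, two of eight experts run per token, so a quarter of the expert parameters are active. Real systems add shared experts, load balancing, and batched dispatch, none of which is modeled here.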

