CLMay 13, 2025

AM-Thinking-v1: Advancing the Frontier of Reasoning at 32B Scale

arXiv:2505.08311v229 citationsh-index: 11Has Code
Originality Incremental advance
AI Analysis

This work demonstrates that the open-source community can achieve high performance at a practical scale for deployment, potentially benefiting developers and researchers seeking accessible, mid-scale reasoning models.

The authors tackled the challenge of advancing reasoning capabilities in language models at a 32B scale, achieving state-of-the-art scores such as 85.3 on AIME 2024 and 70.3 on LiveCodeBench, outperforming or rivaling other models like DeepSeek-R1 and Qwen3-235B-A22B.

We present AM-Thinking-v1, a 32B dense language model that advances the frontier of reasoning, embodying the collaborative spirit of open-source innovation. Outperforming DeepSeek-R1 and rivaling leading Mixture-of-Experts (MoE) models like Qwen3-235B-A22B and Seed1.5-Thinking, AM-Thinking-v1 achieves impressive scores of 85.3 on AIME 2024, 74.4 on AIME 2025, and 70.3 on LiveCodeBench, showcasing state-of-the-art mathematical and coding capabilities among open-source models of similar scale. Built entirely from the open-source Qwen2.5-32B base model and publicly available queries, AM-Thinking-v1 leverages a meticulously crafted post-training pipeline - combining supervised fine-tuning and reinforcement learning - to deliver exceptional reasoning capabilities. This work demonstrates that the open-source community can achieve high performance at the 32B scale, a practical sweet spot for deployment and fine-tuning. By striking a balance between top-tier performance and real-world usability, we hope AM-Thinking-v1 inspires further collaborative efforts to harness mid-scale models, pushing reasoning boundaries while keeping accessibility at the core of innovation. We have open-sourced our model on \href{https://huggingface.co/a-m-team/AM-Thinking-v1}{Hugging Face}.

Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.

Your Notes