CL · Aug 2, 2025

WarriorMath: Enhancing the Mathematical Ability of Large Language Models with a Defect-aware Framework

arXiv:2508.01245v1 · 5 citations · h-index: 28
Originality: Incremental advance
AI Analysis

This work aims to improve the mathematical problem-solving ability of large language models and represents an incremental advance over existing data augmentation methods.

The paper attributes the limited mathematical performance of large language models to a shortage of high-quality, diverse training data. It proposes WarriorMath, a defect-aware framework that combines targeted data synthesis with progressive training. The authors report that it outperforms strong baselines by 12.57% on average across six benchmarks, setting a new state of the art.

Large Language Models (LLMs) excel in solving mathematical problems, yet their performance is often limited by the availability of high-quality, diverse training data. Existing methods focus on augmenting datasets through rephrasing or difficulty progression but overlook the specific failure modes of LLMs. This results in synthetic questions that the model can already solve, providing minimal performance gains. To address this, we propose WarriorMath, a defect-aware framework for mathematical problem solving that integrates both targeted data synthesis and progressive training. In the synthesis stage, we employ multiple expert LLMs in a collaborative process to generate, critique, and refine problems. Questions that base LLMs fail to solve are identified and iteratively improved through expert-level feedback, producing high-quality, defect-aware training data. In the training stage, we introduce a progressive learning framework that iteratively fine-tunes the model using increasingly challenging data tailored to its weaknesses. Experiments on six mathematical benchmarks show that WarriorMath outperforms strong baselines by 12.57% on average, setting a new state-of-the-art. Our results demonstrate the effectiveness of a defect-aware, multi-expert framework for improving mathematical ability.
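
As a rough illustration of the defect-aware idea described in the abstract, the Python sketch below keeps only the problems a base model fails and groups them into batches of increasing difficulty. It is not the authors' implementation: the `Problem` class, the `difficulty` score, the `solve` callable, and both function names are hypothetical, and the expert generation, critique, and refinement steps, as well as the fine-tuning itself, are omitted.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass(frozen=True)
class Problem:
    """Hypothetical container for one candidate training problem."""
    question: str
    answer: str       # reference answer used to check the base model
    difficulty: int   # assumed numeric difficulty score (higher = harder)


def find_defects(problems: List[Problem],
                 solve: Callable[[str], str]) -> List[Problem]:
    """Keep only the problems whose reference answer the base model misses."""
    return [p for p in problems if solve(p.question) != p.answer]


def progressive_batches(defects: List[Problem],
                        rounds: int) -> List[List[Problem]]:
    """Split failed problems into at most `rounds` batches, easiest first."""
    if rounds < 1:
        raise ValueError("rounds must be at least 1")
    ordered = sorted(defects, key=lambda p: p.difficulty)
    if not ordered:
        return []
    size = -(-len(ordered) // rounds)  # ceiling division
    return [ordered[i:i + size] for i in range(0, len(ordered), size)]


if __name__ == "__main__":
    # Toy stand-in for a base model: it only knows one answer.
    def toy_solver(question: str) -> str:
        return "2" if question == "1+1" else "unknown"

    pool = [
        Problem("1+1", "2", 1),
        Problem("12*12", "144", 3),
        Problem("7*8", "56", 2),
    ]
    for round_index, batch in enumerate(
            progressive_batches(find_defects(pool, toy_solver), rounds=2)):
        print(round_index, [p.question for p in batch])
```

In a real pipeline, each batch would feed one fine-tuning round, and the failure check would presumably be rerun against the updated model before the next round so that the data keeps targeting its current weaknesses.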

