CL · Aug 26, 2025

Arrows of Math Reasoning Data Synthesis for Large Language Models: Diversity, Complexity and Correctness

arXiv:2508.18824v1 · 3 citations · h-index: 9 · CIKM
Originality: Highly original
AI Analysis

This paper addresses the problem of scalable and reliable data synthesis for improving mathematical reasoning in LLMs, representing a strong domain-specific advancement.

The paper tackles the challenge of generating high-quality training data for mathematical reasoning in large language models. It proposes a program-assisted synthesis framework that produced 12.3 million problem-solving triples, and models fine-tuned on this data achieve state-of-the-art performance on several benchmark datasets.

Enhancing the mathematical reasoning of large language models (LLMs) demands high-quality training data, yet conventional methods face critical challenges in scalability, cost, and data reliability. To address these limitations, we propose a novel program-assisted synthesis framework that systematically generates a high-quality mathematical corpus with guaranteed diversity, complexity, and correctness. This framework integrates mathematical knowledge systems and domain-specific tools to create executable programs. These programs are then translated into natural language problem-solution pairs and vetted by a bilateral validation mechanism that verifies solution correctness against program outputs and ensures program-problem consistency. We have generated 12.3 million such problem-solving triples. Experiments demonstrate that models fine-tuned on our data significantly improve their inference capabilities, achieving state-of-the-art performance on several benchmark datasets and showcasing the effectiveness of our synthesis approach.
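
To make the pipeline in the abstract concrete, the following is a minimal sketch of one program, problem, solution triple and a two-sided check on it. It is an illustration under stated assumptions, not the authors' implementation: the function names (`make_triple`, `run_program`, `validate`), the linear-equation template, the "Final answer:" convention, and the number-matching consistency check are all hypothetical stand-ins for the paper's knowledge systems, domain tools, and validation mechanism.

```python
import random
import re
from fractions import Fraction


def make_triple(rng):
    """Build one toy (program, problem, solution) triple for a*x + b = c.

    Hypothetical template; the paper draws on mathematical knowledge
    systems and domain-specific tools rather than a fixed equation form.
    """
    a, b, c = rng.randint(2, 9), rng.randint(1, 20), rng.randint(21, 60)
    # The executable program is the ground truth for the answer.
    program = f"from fractions import Fraction\nanswer = Fraction({c} - {b}, {a})\n"
    problem = f"Solve for x: {a}x + {b} = {c}."
    x = Fraction(c - b, a)
    solution = f"Subtract {b} from both sides, then divide by {a}. Final answer: {x}"
    return program, problem, solution


def run_program(program):
    """Execute the program text and return its `answer` variable."""
    scope = {}
    exec(program, scope)  # acceptable here only because we generated the text ourselves
    return scope["answer"]


def validate(program, problem, solution):
    """Toy two-sided check (an assumed stand-in for bilateral validation)."""
    # Side 1: the solution's final answer must equal the program output.
    expected = run_program(program)
    match = re.search(r"Final answer: (-?\d+(?:/\d+)?)", solution)
    solution_ok = match is not None and Fraction(match.group(1)) == expected
    # Side 2: every integer stated in the problem must appear in the program.
    program_numbers = set(re.findall(r"\d+", program))
    problem_ok = set(re.findall(r"\d+", problem)) <= program_numbers
    return solution_ok and problem_ok
```

A triple produced by `make_triple(random.Random(0))` passes `validate`, while the same triple with an altered final answer or an altered problem statement is rejected. The design point is that the program, not the generated text, is treated as the source of truth, so both the solution and the problem can be checked against something executable.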

