LGAIAug 9, 2024

InfinityMATH: A Scalable Instruction Tuning Dataset in Programmatic Mathematical Reasoning

arXiv:2408.07089v116 citationsh-index: 10Has Code
Originality Incremental advance
AI Analysis

This addresses the problem of high computational costs and data dependency in creating large-scale mathematical reasoning datasets for language and code models, though it appears incremental as it builds on existing CoT and PoT methods.

The paper tackles the challenge of scalable dataset creation for mathematical reasoning by introducing InfinityMATH, a programmatic instruction tuning dataset that decouples numbers from problems to enable efficient scaling, resulting in fine-tuned models showing relative improvements of 184.7% to 514.3% on benchmarks.

Recent advancements in Chain-of-Thoughts (CoT) and Program-of-Thoughts (PoT) methods have greatly enhanced language models' mathematical reasoning capabilities, facilitating their integration into instruction tuning datasets with LLMs. However, existing methods for large-scale dataset creation require substantial seed data and high computational costs for data synthesis, posing significant challenges for scalability. We introduce InfinityMATH, a scalable instruction tuning dataset for programmatic mathematical reasoning. The construction pipeline emphasizes decoupling numbers from mathematical problems to synthesize number-independent programs, enabling efficient and flexible scaling while minimizing dependency on specific numerical values. Fine-tuning experiments with open-source language and code models, such as Llama2 and CodeLlama, demonstrate the practical benefits of InfinityMATH. These fine-tuned models, showed significant relative improvements on both in-domain and out-of-domain benchmarks, ranging from 184.7% to 514.3% on average. Additionally, these models exhibited high robustness on the GSM8K+ and MATH+ benchmarks, which are enhanced version of test sets with simply the number variations. InfinityMATH ensures that models are more versatile and effective across a broader range of mathematical problems. The data is available at https://huggingface.co/datasets/flagopen/InfinityMATH.

Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.

Your Notes