LG AI · Aug 9, 2024

InfinityMATH: A Scalable Instruction Tuning Dataset in Programmatic Mathematical Reasoning

Bo-Wen Zhang, Yan Yan, Lin Li, Guang Liu

arXiv:2408.07089v1 · 18.216 citations · h-index: 10 · Has Code

Originality: Incremental advance

AI Analysis

The paper addresses the high computational cost and seed-data dependency of creating large-scale mathematical reasoning datasets for language and code models, though it appears incremental because it builds on existing CoT and PoT methods.

The paper tackles the challenge of scalable dataset creation for mathematical reasoning by introducing InfinityMATH, a programmatic instruction tuning dataset that decouples numbers from problems to enable efficient scaling. Models fine-tuned on it show average relative improvements of 184.7% to 514.3% across in-domain and out-of-domain benchmarks.
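For reference, a relative improvement is conventionally measured against the baseline score. The formula below assumes that standard definition, which the summary itself does not spell out; the symbols s_base and s_ft (baseline and fine-tuned benchmark scores) are notation introduced here for illustration.

```latex
\[
\text{relative improvement}
  = \frac{s_{\text{ft}} - s_{\text{base}}}{s_{\text{base}}} \times 100\%
\]
```

Under this definition, a relative improvement of 184.7% means the fine-tuned score is about 2.85 times the baseline, and 514.3% means about 6.14 times.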

Recent advancements in Chain-of-Thought (CoT) and Program-of-Thoughts (PoT) methods have greatly enhanced language models' mathematical reasoning capabilities, facilitating their integration into instruction tuning datasets with LLMs. However, existing methods for large-scale dataset creation require substantial seed data and high computational costs for data synthesis, posing significant challenges for scalability. We introduce InfinityMATH, a scalable instruction tuning dataset for programmatic mathematical reasoning. The construction pipeline emphasizes decoupling numbers from mathematical problems to synthesize number-independent programs, enabling efficient and flexible scaling while minimizing dependency on specific numerical values. Fine-tuning experiments with open-source language and code models, such as Llama2 and CodeLlama, demonstrate the practical benefits of InfinityMATH. These fine-tuned models showed significant relative improvements on both in-domain and out-of-domain benchmarks, ranging from 184.7% to 514.3% on average. Additionally, these models exhibited high robustness on the GSM8K+ and MATH+ benchmarks, which are enhanced versions of the test sets that vary only the numbers. InfinityMATH ensures that models are more versatile and effective across a broader range of mathematical problems. The data is available at https://huggingface.co/datasets/flagopen/InfinityMATH.
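The idea of decoupling numbers from a problem can be illustrated with a minimal sketch. It is not the paper's pipeline: the word problem, the names TEMPLATE, solve, and make_variants, and the sampling ranges are all hypothetical, chosen only to show how one number-independent program can label many numeric variants of the same question.

```python
import random

# Hypothetical problem template: concrete numbers are replaced by placeholders.
TEMPLATE = (
    "A shop sells pencils at {price} dollars each. "
    "Sam buys {count} pencils and pays with {paid} dollars. "
    "How much change does Sam get?"
)


def solve(price, count, paid):
    """Number-independent program: valid for any numeric inputs."""
    total_cost = price * count
    return paid - total_cost


def make_variants(n, seed=0):
    """Sample n numeric instantiations and label each with the same program."""
    rng = random.Random(seed)  # seeded so the output is reproducible
    variants = []
    for _ in range(n):
        price = rng.randint(1, 5)
        count = rng.randint(2, 9)
        # Pay at least the total cost so the change is never negative.
        paid = price * count + rng.randint(0, 20)
        values = {"price": price, "count": count, "paid": paid}
        variants.append(
            {
                "question": TEMPLATE.format(**values),
                "answer": solve(**values),
            }
        )
    return variants


if __name__ == "__main__":
    for item in make_variants(3):
        print(item["question"], "->", item["answer"])
```

Because the program takes the numbers as parameters, new training examples come from sampling new values and need no additional program synthesis. The same mechanism could produce number-varied test items in the spirit of GSM8K+ and MATH+.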
