CL, AI, LG | Feb 18, 2025

Integrating Arithmetic Learning Improves Mathematical Reasoning in Smaller Models

arXiv:2502.12855v1 | 1 citation | h-index: 8
Originality: Incremental advance
AI Analysis

This paper addresses the challenge of specializing smaller models for mathematical reasoning. It is an incremental improvement over existing methods such as knowledge distillation and data augmentation.

The paper improves mathematical reasoning in smaller models by integrating a programmatically generated arithmetic dataset into training, which strengthens arithmetic capabilities and raises performance on reasoning benchmarks.

While large models pre-trained on high-quality data exhibit excellent performance across various reasoning tasks, including mathematical reasoning (e.g. GSM8k, MultiArith), specializing smaller models to excel at mathematical reasoning remains a challenging problem. Common approaches to address this challenge include knowledge distillation, where smaller student models learn from large pre-trained teacher models, and data augmentation, such as rephrasing questions. Despite these efforts, smaller models struggle with arithmetic computations, leading to errors in mathematical reasoning. In this work, we focus on leveraging a programmatically generated arithmetic dataset to enhance the reasoning capabilities of smaller models. We investigate two key approaches to incorporate this dataset -- (1) intermediate fine-tuning, where a model is fine-tuned on the arithmetic dataset before being trained on a reasoning dataset, and (2) integrating the arithmetic dataset into the instruction-tuning mixture, allowing the model to learn arithmetic skills alongside general instruction-following abilities. Our experiments on multiple reasoning benchmarks demonstrate that incorporating an arithmetic dataset, whether through targeted fine-tuning or within the instruction-tuning mixture, enhances the models' arithmetic capabilities, which in turn improves their mathematical reasoning performance.
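To make the two approaches concrete, the following is a minimal sketch of how an arithmetic dataset could be generated programmatically and then arranged into either training schedule. It is not the authors' pipeline: the operators, operand range, prompt format, and the function names `make_arithmetic_examples`, `intermediate_schedule`, and `mixture_schedule` are all assumptions made for illustration.

```python
import random

# Hypothetical operator set; the abstract does not say which operations are used.
OPS = {
    "+": lambda a, b: a + b,
    "-": lambda a, b: a - b,
    "*": lambda a, b: a * b,
}


def make_arithmetic_examples(n, max_operand=999, seed=0):
    """Generate n prompt/answer pairs, e.g. prompt '12 + 7 =' with answer '19'.

    The prompt format and operand range are assumptions, not taken from the paper.
    """
    rng = random.Random(seed)
    symbols = sorted(OPS)  # fixed order keeps generation reproducible
    examples = []
    for _ in range(n):
        a = rng.randint(0, max_operand)
        b = rng.randint(0, max_operand)
        op = rng.choice(symbols)
        examples.append({"prompt": f"{a} {op} {b} =", "answer": str(OPS[op](a, b))})
    return examples


def intermediate_schedule(arithmetic, reasoning):
    """Approach (1): fine-tune on arithmetic first, then on the reasoning data."""
    return [("arithmetic", list(arithmetic)), ("reasoning", list(reasoning))]


def mixture_schedule(arithmetic, instructions, seed=0):
    """Approach (2): a single stage over a shuffled union of both datasets."""
    mixed = list(arithmetic) + list(instructions)
    random.Random(seed).shuffle(mixed)
    return [("mixture", mixed)]
```

Each schedule is a list of `(stage_name, examples)` pairs that a training loop would consume in order. The intermediate schedule has two stages, while the mixture schedule has one, which is the only structural difference between the two approaches as described above.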

