AI CL Jul 1, 2024

From Next-Token to Mathematics: The Learning Dynamics of Mathematical Reasoning in Language Models

Shubhra Mishra, Gabriel Poesia, Noah D. Goodman

arXiv:2407.00900v39.66 citations h-index: 12 Has Code

Originality: Incremental advance

AI Analysis

This work provides empirical insights into the training dynamics of reasoning in LLMs. The advance is incremental, but it addresses a foundational problem for AI researchers.

The authors analyzed how mathematical reasoning abilities evolve in large language models during pre-training and post-training, finding that skills are learned in an order correlating with a human curriculum and identifying which abilities benefit or suffer from instruction tuning.

Large Language Models (LLMs) solely trained on next-token prediction learn to solve a wide range of problems involving mathematical reasoning. But how does this ability evolve during training? We show the first analysis of how mathematical reasoning abilities of several open-weight LLMs develop during pre-training and post-training. To this end, we construct MathCAMPS, a synthetic dataset of novel mathematical reasoning problems grounded in 44 fine-grained skills taken from the Common Core curriculum from K to 8th grades. In one experiment, we show that mathematical skills are learned during pre-training in an order that measurably correlates with the human-designed curriculum, even though training data are randomly ordered. We also show a detailed analysis of which mathematical abilities benefit from instruction tuning, a widely used post-training method and, in contrast, which skills suffer. Our work paves the way for an empirical understanding of LLM training dynamics in relation to reasoning.
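The abstract's claim that skills are learned "in an order that measurably correlates with the human-designed curriculum" can be illustrated with a rank correlation. The sketch below is only one possible way to quantify such a correlation; the abstract does not say which statistic the authors use. The function `kendall_tau`, the variable names, and all of the numbers are hypothetical toy values chosen for illustration, not data or results from the paper.

```python
from itertools import combinations


def kendall_tau(xs, ys):
    """Kendall's tau-a: (concordant - discordant) / number of pairs.

    Tied pairs count as neither concordant nor discordant.
    """
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length sequences with at least 2 items")
    concordant = discordant = 0
    for (x1, y1), (x2, y2) in combinations(zip(xs, ys), 2):
        sign = (x1 - x2) * (y1 - y2)
        if sign > 0:
            concordant += 1
        elif sign < 0:
            discordant += 1
    n_pairs = len(xs) * (len(xs) - 1) // 2
    return (concordant - discordant) / n_pairs


# Hypothetical toy data (NOT from the paper): one entry per skill.
# grade_level: curriculum grade of the skill, with kindergarten coded as 0.
# first_solved_checkpoint: index of the first training checkpoint at which
# an imagined model passes an accuracy threshold on that skill.
grade_level = [0, 1, 2, 3, 4, 5]
first_solved_checkpoint = [1, 2, 2, 4, 3, 6]

tau = kendall_tau(grade_level, first_solved_checkpoint)
print(f"Kendall tau between curriculum order and learning order: {tau:.2f}")
```

A value near 1 would mean skills from earlier grades tend to be acquired at earlier checkpoints, a value near 0 would mean no ordering relationship, and a value near -1 would mean the reverse order.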
