CL, Jun 2, 2023

Learning Multi-Step Reasoning by Solving Arithmetic Tasks

arXiv:2306.01707v3229 citations
h-index: 54
Originality: Incremental advance
AI Analysis

This work aims to make multi-step reasoning attainable for smaller language models, which could benefit applications that need mathematical problem-solving without extensive computational resources. The contribution is incremental, since it builds on existing chain-of-thought methods.

The paper tackles the problem of enabling smaller language models to perform multi-step reasoning, an ability that typically requires large models. It continually pre-trains the smaller models on a synthetic dataset of multi-step arithmetic tasks, which improves their performance on four math word problem datasets.

Mathematical reasoning is regarded as a necessary ability for Language Models (LMs). Recent works demonstrate large LMs' impressive performance in solving math problems. The success is attributed to their Chain-of-Thought (CoT) reasoning abilities, i.e., the ability to decompose complex questions into step-by-step reasoning chains, but such ability seems to emerge only in models with abundant parameters. This work investigates how to equip relatively small LMs with the capabilities of multi-step reasoning. We propose to inject such abilities by continually pre-training LMs on a synthetic dataset, MsAT, which is composed of Multi-step Arithmetic Tasks. Our experiments on four math word problem datasets show the effectiveness of the proposed method in enhancing LMs' math reasoning abilities.
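To make the idea of a synthetic multi-step arithmetic task concrete, here is a minimal Python sketch that generates toy examples, each with a question, a chain of intermediate steps, and a final answer. It is an illustration only: the abstract does not describe MsAT's actual format, so the function name `make_task`, the operator set, the single-letter variable names, and the text layout are all assumptions, not the authors' construction.

```python
import operator
import random

# Hypothetical operator set; the source does not say which operators MsAT uses.
OPS = {"+": operator.add, "-": operator.sub, "*": operator.mul}

# Single-letter variable names, an assumption made for readability.
NAMES = "ABCDEFGHIJKLMNOPQRSTUVWXYZ"


def make_task(num_steps, rng):
    """Return (question, steps, answer) for one toy multi-step arithmetic task.

    Each step defines a new variable from the previous one, so answering the
    question requires evaluating the whole chain in order.
    """
    if not 1 <= num_steps < len(NAMES):
        raise ValueError("num_steps must be between 1 and 25")

    value = rng.randint(1, 9)
    definitions = [f"{NAMES[0]}={value}"]
    steps = []
    for i in range(1, num_steps + 1):
        symbol = rng.choice(sorted(OPS))
        operand = rng.randint(1, 9)
        result = OPS[symbol](value, operand)
        # The question only states how each variable is defined.
        definitions.append(f"{NAMES[i]}={NAMES[i - 1]}{symbol}{operand}")
        # The reasoning chain substitutes the known value and evaluates it.
        steps.append(f"{NAMES[i]}={value}{symbol}{operand}={result}")
        value = result

    question = ". ".join(definitions) + f". {NAMES[num_steps]}?"
    return question, steps, value


if __name__ == "__main__":
    rng = random.Random(0)  # fixed seed so runs are repeatable
    question, steps, answer = make_task(3, rng)
    print("Question:", question)
    for step in steps:
        print("Step:", step)
    print("Answer:", answer)
```

In a setup like this, the question would serve as the model input and the step chain plus answer as the target sequence, so that the model is trained to write out intermediate results rather than jump to the final number.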

Code Implementations: 1 repo
