CL · Jun 4, 2025

ClozeMath: Improving Mathematical Reasoning in Language Models by Learning to Fill Equations

arXiv:2506.03763v1 · 2 citations · h-index: 32 · ACL
Originality: Incremental advance
AI Analysis

This paper addresses the challenge of improving mathematical reasoning in language models, which matters for applications in education and automated problem-solving. The contribution appears incremental, since it builds on existing cloze-exercise concepts.

The authors propose ClozeMath, a fine-tuning approach for improving mathematical reasoning in large language models. It is based on a text-infilling task that predicts masked equations from a given solution, and it outperformed the Masked Thought baseline on GSM8K, MATH, and GSM-Symbolic.

The capabilities of large language models (LLMs) have been enhanced by training on data that reflects human thought processes, such as the Chain-of-Thought format. However, evidence suggests that the conventional scheme of next-word prediction may not fully capture how humans learn to think. Inspired by how humans generalize mathematical reasoning, we propose a new approach named ClozeMath to fine-tune LLMs for mathematical reasoning. Our ClozeMath involves a text-infilling task that predicts masked equations from a given solution, analogous to cloze exercises used in human learning. Experiments on GSM8K, MATH, and GSM-Symbolic show that ClozeMath surpasses the strong baseline Masked Thought in performance and robustness, with two test-time scaling decoding algorithms, Beam Search and Chain-of-Thought decoding. Additionally, we conduct an ablation study to analyze the effects of various architectural and implementation choices on our approach.
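
The text-infilling idea described above can be illustrated with a small sketch that masks equations in a worked solution and keeps them as infilling targets. This is not the authors' implementation: the regular expression used to detect equations, the `<mask_i>` sentinel format, and the helper names `make_cloze_example` and `fill` are all assumptions made for illustration, and the summary does not specify how ClozeMath actually selects or encodes masked spans.

```python
import re

# Assumption: an "equation" is a simple arithmetic span such as "48 / 2 = 24".
# The paper's real equation-detection rule is not given in this summary.
NUMBER = r"\d+(?:\.\d+)?"
EQUATION = re.compile(
    rf"{NUMBER}(?:\s*[-+*/]\s*{NUMBER})+\s*=\s*{NUMBER}"
)


def make_cloze_example(solution: str):
    """Replace each detected equation with a hypothetical <mask_i> sentinel.

    Returns the masked solution and the list of masked equations (the targets
    a model would be trained to fill in).
    """
    targets = []

    def _mask(match: re.Match) -> str:
        targets.append(match.group(0))
        return f"<mask_{len(targets) - 1}>"

    masked = EQUATION.sub(_mask, solution)
    return masked, targets


def fill(masked: str, targets: list) -> str:
    """Inverse of make_cloze_example: put the equations back in place."""
    for i, equation in enumerate(targets):
        masked = masked.replace(f"<mask_{i}>", equation)
    return masked


solution = (
    "Natalia sold 48 / 2 = 24 clips in May. "
    "In total she sold 48 + 24 = 72 clips."
)
masked, targets = make_cloze_example(solution)
print(masked)
print(targets)
```

In a fine-tuning setup along these lines, the masked solution (together with the question) would serve as the input and the equation list as the prediction target; how this objective is combined with ordinary next-token training is a design choice the summary leaves open.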

