CL · Jun 4, 2025

ClozeMath: Improving Mathematical Reasoning in Language Models by Learning to Fill Equations

Quang Hieu Pham, Thuy Duong Nguyen, Tung Pham, Anh Tuan Luu, Dat Quoc Nguyen

arXiv:2506.03763v1 · 6.72 citations · h-index: 32 · ACL

Originality: Incremental advance

AI Analysis

This paper addresses the challenge of improving mathematical reasoning in language models, which matters for applications in education and AI problem-solving. The contribution appears incremental, since it builds on existing cloze-exercise concepts.

The authors propose ClozeMath, a fine-tuning approach for improving mathematical reasoning in large language models. It is based on a text-infilling task that predicts masked equations from a given solution, and it outperformed the Masked Thought baseline on GSM8K, MATH, and GSM-Symbolic.

The capabilities of large language models (LLMs) have been enhanced by training on data that reflects human thought processes, such as the Chain-of-Thought format. However, evidence suggests that the conventional scheme of next-word prediction may not fully capture how humans learn to think. Inspired by how humans generalize mathematical reasoning, we propose a new approach named ClozeMath to fine-tune LLMs for mathematical reasoning. Our ClozeMath involves a text-infilling task that predicts masked equations from a given solution, analogous to cloze exercises used in human learning. Experiments on GSM8K, MATH, and GSM-Symbolic show that ClozeMath surpasses the strong baseline Masked Thought in performance and robustness, with two test-time scaling decoding algorithms, Beam Search and Chain-of-Thought decoding. Additionally, we conduct an ablation study to analyze the effects of various architectural and implementation choices on our approach.
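To make the equation-infilling idea concrete, here is a minimal sketch of how a cloze-style training example might be built from a worked solution. It is not the authors' implementation: the regular expression for spotting equations, the `<mask_i>` sentinel format, and the function name `make_cloze_example` are all assumptions chosen for illustration, and the paper's actual masking scheme may differ.

```python
import re

# Hypothetical pattern: a simple arithmetic equation such as "48 / 2 = 24".
# Real solutions may contain equations this pattern does not cover.
NUM = r"\d+(?:\.\d+)?"
EQUATION = re.compile(rf"{NUM}(?:\s*[-+*/]\s*{NUM})+\s*=\s*{NUM}")


def make_cloze_example(solution: str):
    """Replace each matched equation with a sentinel and collect the targets.

    Returns the masked solution text and the list of masked-out equations,
    in order of appearance. The "<mask_i>" sentinel is an assumed format.
    """
    targets = []

    def _mask(match):
        targets.append(match.group(0))
        return f"<mask_{len(targets) - 1}>"

    masked = EQUATION.sub(_mask, solution)
    return masked, targets


solution = (
    "She sold 48 / 2 = 24 clips in May. "
    "In total she sold 48 + 24 = 72 clips."
)
masked, targets = make_cloze_example(solution)
print(masked)
print(targets)
```

In this sketch the model would be given `masked` as input and trained to produce the equations in `targets`, mirroring a fill-in-the-blank exercise where the surrounding reasoning text is visible but the equations are not.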
