CL | Dec 16, 2024

CoinMath: Harnessing the Power of Coding Instruction for Math LLMs

arXiv:2412.11699v1 | 1 citation | h-index: 21 | ACL
Originality: Incremental advance
AI Analysis

The paper addresses a challenge relevant to researchers and developers, improving math problem-solving in LLMs, but the advance is incremental because it builds on existing code-based methods.

This study tackled the problem of optimizing coding instruction data to enhance mathematical reasoning in Large Language Models, finding that code-based rationales with concise comments, descriptive naming, and hardcoded solutions are beneficial, and proposing CoinMath, which significantly outperforms the baseline MAmmoTH model.

Large Language Models (LLMs) have shown strong performance in solving mathematical problems, with code-based solutions proving particularly effective. However, the best practice to leverage coding instruction data to enhance mathematical reasoning remains underexplored. This study investigates three key questions: (1) How do different coding styles of mathematical code-based rationales impact LLMs' learning performance? (2) Can general-domain coding instructions improve performance? (3) How does integrating textual rationales with code-based ones during training enhance mathematical reasoning abilities? Our findings reveal that code-based rationales with concise comments, descriptive naming, and hardcoded solutions are beneficial, while improvements from general-domain coding instructions and textual rationales are relatively minor. Based on these insights, we propose CoinMath, a learning strategy designed to enhance mathematical reasoning by diversifying the coding styles of code-based rationales. CoinMath generates a variety of code-based rationales incorporating concise comments, descriptive naming conventions, and hardcoded solutions. Experimental results demonstrate that CoinMath significantly outperforms its baseline model, MAmmoTH, one of the SOTA math LLMs.
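To make the three coding-style attributes concrete, the sketch below contrasts two code-based rationales for the same word problem. The problem, the function names, and the reading of "hardcoded solution" as "problem values written directly into the code rather than passed as parameters" are assumptions made for illustration; none of this is taken from the paper's dataset or code.

```python
# Hypothetical word problem (not from the paper):
# A shop sells pencils at $3 each. Mia buys 4 pencils and pays with a $20 bill.
# How much change does she receive?


def rationale_preferred_style():
    """Concise comments, descriptive names, values hardcoded from the problem."""
    # Quantities stated in the problem.
    pencil_price = 3
    pencils_bought = 4
    amount_paid = 20

    # Cost of the purchase, then the change returned.
    total_cost = pencil_price * pencils_bought
    change = amount_paid - total_cost
    return change


def rationale_contrast_style(a, b, c):
    """Same arithmetic, but generalized, uncommented, and with obscure names."""
    return c - a * b


if __name__ == "__main__":
    print(rationale_preferred_style())
    print(rationale_contrast_style(3, 4, 20))
```

Both functions compute the same answer; they differ only in style, which is the variable the abstract's first research question isolates.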

Code Implementations: 1 repo