AI · Dec 5, 2024

Enhancing Mathematical Reasoning in LLMs with Background Operators

arXiv:2412.04110v1 · 1 citation · h-index: 15
Originality: Incremental advance
AI Analysis

This work addresses mathematical reasoning for LLMs, presenting an incremental method that combines existing techniques like Prolog and self-training.

The authors tackled the problem of improving mathematical reasoning in large language models by using background operators and a Prolog-based approach, achieving accuracies of 84.6% on a cross-validated set and 84.8% on a test set with the Meta-Llama-3.1-8B-Instruct model.

We propose utilizing background operators for mathematical reasoning in large language models (LLMs). To achieve this, we define a set of fundamental mathematical predicates as the basic building blocks. For each mathematical problem, we develop a Prolog solution that includes problem-specific predicates and intermediate predicates derived from these background operators, ensuring that each solution adheres to the defined operator set. We introduce the MATH-Prolog corpus, which is derived from the counting and probability categories of the MATH corpus. For efficient data augmentation, we apply K-fold cross-validated self-training. This method incrementally generates new Prolog solutions for each fold, incorporating those verified as correct into the training set throughout the model training process. Our experimental results demonstrate that 5-fold cross-validated self-training effectively identifies new, accurate Prolog solutions, achieving an accuracy of 84.6% on the cross-validated set and 84.8% on the test set when fine-tuning the Meta-Llama-3.1-8B-Instruct model. This approach successfully uncovers new solutions with fully computable inference steps for previously unseen problems. Additionally, incorporating the background mathematical predicates into the prompt enhances solution coverage.
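The K-fold cross-validated self-training loop described above can be sketched roughly as follows. This is a minimal illustration based only on the abstract, not the authors' implementation. The callables `train`, `generate`, and `run` are hypothetical stand-ins for fine-tuning the model, sampling a Prolog solution, and executing that solution. Verifying a candidate by comparing its executed result with the gold answer is also an assumption.

```python
from typing import Callable, Dict, List, Tuple

Problem = Tuple[str, str]  # (problem id, gold answer)


def k_fold_self_training(
    problems: List[Problem],
    seed_solutions: Dict[str, str],
    k: int,
    train: Callable[[Dict[str, str]], object],   # hypothetical: fine-tune on verified solutions
    generate: Callable[[object, str], str],      # hypothetical: sample a solution for a problem id
    run: Callable[[str], str],                   # hypothetical: execute a solution, return its answer
    rounds: int = 1,
) -> Dict[str, str]:
    """Return a map from problem id to verified solution, grown fold by fold."""
    verified = dict(seed_solutions)
    # Split the problems into k disjoint folds (round-robin assignment).
    folds = [problems[i::k] for i in range(k)]
    for _ in range(rounds):
        for held_out in folds:
            held_ids = {pid for pid, _ in held_out}
            # Train only on verified solutions from the other folds,
            # so the held-out problems are unseen by this model.
            train_set = {pid: s for pid, s in verified.items() if pid not in held_ids}
            model = train(train_set)
            for pid, gold in held_out:
                candidate = generate(model, pid)
                # Assumed check: keep a candidate only if running it
                # reproduces the gold answer.
                if run(candidate) == gold:
                    verified[pid] = candidate
    return verified
```

Because every accepted solution is added to `verified` immediately, later folds train on a larger set than earlier ones, which is one way to read "incrementally" in the abstract.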

