LG AI | Oct 12, 2024

Boosting Deductive Reasoning with Step Signals In RLHF

arXiv:2410.09528v2 | 4 citations | h-index: 19
AI Analysis

This work addresses the problem of enhancing logical reasoning in LLMs for complex tasks, representing an incremental advancement in dataset generation and training methods.

The paper tackles multi-step deductive reasoning in Large Language Models by developing MuseD, an automated method for generating training and testing datasets with controlled complexity. RLHF training on the generated data led to significant improvements in logical capabilities on both in-domain and out-of-domain tasks.

Logical reasoning is a crucial task for Large Language Models (LLMs), enabling them to tackle complex problems. Among reasoning tasks, multi-step reasoning poses a particular challenge. Grounded in the theory of formal logic, we have developed an automated method, Multi-step Deduction (MuseD), for generating deductive reasoning data. MuseD has allowed us to create training and testing datasets for multi-step reasoning. Our generation method enables control over the complexity of the generated instructions, facilitating training and evaluation of models across different difficulty levels. Through RLHF training, our training data has demonstrated significant improvements in logical capabilities for both in-domain and out-of-domain reasoning tasks. Additionally, we have conducted tests to assess the multi-step reasoning abilities of various models.
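
To make the idea of complexity-controlled deduction data concrete, here is a minimal sketch in Python. It is not the MuseD generator: the function name `make_chain_problem`, the "All X are Y" premise format, the placeholder term names, and the distractor scheme are all assumptions chosen for illustration. It only shows how the number of required inference steps can be set explicitly when a problem is generated.

```python
import random


def make_chain_problem(num_steps, num_distractors=0, seed=None):
    """Build a toy multi-step deduction problem (hypothetical format).

    The goal "All A are Z" requires exactly `num_steps` applications of
    the rule: from "All X are Y" and "All Y are Z", infer "All X are Z".
    """
    if num_steps < 1:
        raise ValueError("num_steps must be at least 1")
    rng = random.Random(seed)

    # Placeholder term names; a real dataset would use a richer vocabulary.
    n_chain_terms = num_steps + 2
    names = [f"T{i}" for i in range(n_chain_terms + 2 * num_distractors)]
    rng.shuffle(names)
    chain = names[:n_chain_terms]
    spare = names[n_chain_terms:]

    # num_steps + 1 linked premises: chain[0] -> chain[1] -> ... -> chain[-1]
    premises = [(chain[i], chain[i + 1]) for i in range(num_steps + 1)]
    # Distractors use terms that never touch the chain, so they cannot
    # shorten or lengthen the derivation.
    distractors = [
        (spare[2 * j], spare[2 * j + 1]) for j in range(num_distractors)
    ]

    # Intermediate conclusions, one per inference step.
    proof = [(chain[0], chain[i + 1]) for i in range(1, num_steps + 1)]

    shown = premises + distractors
    rng.shuffle(shown)  # hide the chain order from the reader
    return {
        "premises": [f"All {a} are {b}." for a, b in shown],
        "question": f"Are all {chain[0]} {chain[-1]}?",
        "proof": [f"All {a} are {b}." for a, b in proof],
        "answer": "yes",
    }


if __name__ == "__main__":
    problem = make_chain_problem(num_steps=3, num_distractors=2, seed=0)
    for line in problem["premises"]:
        print(line)
    print(problem["question"])
```

In this sketch, raising `num_steps` lengthens the required derivation and raising `num_distractors` adds irrelevant premises, which gives two independent difficulty knobs. The `proof` list holds one intermediate conclusion per step, so a model's step-by-step output could in principle be compared against it.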

