CL | Dec 16, 2022

Evaluating Step-by-Step Reasoning through Symbolic Verification

Yi-Fan Zhang, Hanlin Zhang, Li Erran Li, Eric Xing

arXiv:2212.08686v27.936 citations | h-index: 25 | Has Code

Originality Incremental advance

AI Analysis

This work addresses the challenge of verifying and improving step-by-step reasoning in language models. The contribution appears incremental, since it combines existing neuro-symbolic and verification techniques.

The paper studies how language models reason by building synthetic datasets of paired natural-language and symbolic examples, which allows intermediate reasoning steps to be verified automatically. Its neuro-symbolic approach, LMLP, achieves more than 25% higher accuracy than chain-of-thought (CoT) prompting on length generalization benchmarks in deductive reasoning settings, even with smaller models.

Pre-trained language models (LMs) have shown remarkable reasoning performance using explanations or chain-of-thought (CoT) prompts for in-context learning. On the other hand, these reasoning tasks are usually presumed to be more approachable for symbolic programming. To understand the mechanism of reasoning of LMs, we curate synthetic datasets containing equivalent (natural, symbolic) data pairs, where symbolic examples contain first-order logic rules and predicates from non-parametric knowledge bases (KBs), supporting automated verification of intermediate reasoning results. Then we revisit neuro-symbolic approaches and propose to learn from demonstrations containing logic rules and corresponding examples to iteratively reason over KBs, recovering Prolog's backward chaining algorithm and supporting automated verification of LMs' outputs. Comprehensive experiments are included to systematically compare LMLP with CoT in deductive reasoning settings, showing that LMLP enjoys more than 25% higher accuracy than CoT on length generalization benchmarks even with smaller model sizes.
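To make the reference to backward chaining concrete, here is a minimal, purely symbolic sketch in Python. It is not the authors' LMLP implementation and involves no language model: the knowledge base, the rule, and all function names (`unify`, `prove`, `verify`) are hypothetical and chosen for illustration. It shows how a goal is reduced to subgoals through Horn rules, and how each intermediate step can be checked against the KB.

```python
from itertools import count

# Hypothetical toy knowledge base (an assumption, not taken from the paper's datasets).
FACTS = [
    ("parent", "alice", "bob"),
    ("parent", "bob", "carol"),
]
# Horn rules as (head, body). Terms starting with an uppercase letter are variables.
RULES = [
    (("grandparent", "X", "Z"), [("parent", "X", "Y"), ("parent", "Y", "Z")]),
]

def is_var(term):
    return term[0].isupper()

def walk(term, subst):
    """Follow variable bindings until reaching a constant or an unbound variable."""
    while is_var(term) and term in subst:
        term = subst[term]
    return term

def unify(a, b, subst):
    """Unify two atoms; return an extended substitution, or None on failure."""
    if a[0] != b[0] or len(a) != len(b):
        return None
    subst = dict(subst)
    for x, y in zip(a[1:], b[1:]):
        x, y = walk(x, subst), walk(y, subst)
        if x == y:
            continue
        if is_var(x):
            subst[x] = y
        elif is_var(y):
            subst[y] = x
        else:
            return None
    return subst

_fresh = count()

def rename(rule):
    """Give the rule's variables fresh names so separate uses do not clash."""
    n = next(_fresh)
    head, body = rule

    def ren(atom):
        return (atom[0],) + tuple(f"{t}_{n}" if is_var(t) else t for t in atom[1:])

    return ren(head), [ren(atom) for atom in body]

def prove(goals, subst, depth=5):
    """Backward chaining: yield (substitution, proof steps) for each proof found."""
    if not goals:
        yield subst, []
        return
    if depth == 0:
        return
    goal, rest = goals[0], goals[1:]
    for fact in FACTS:  # try to close the goal directly with a KB fact
        s = unify(goal, fact, subst)
        if s is not None:
            for s2, steps in prove(rest, s, depth):
                yield s2, [("fact", fact)] + steps
    for rule in RULES:  # otherwise replace the goal with a rule's body
        head, body = rename(rule)
        s = unify(goal, head, subst)
        if s is not None:
            for s2, steps in prove(body + rest, s, depth - 1):
                yield s2, [("rule", rule)] + steps

def verify(steps):
    """Check that every intermediate step cites a fact or rule present in the KB."""
    return all(
        (item in FACTS) if kind == "fact" else (item in RULES)
        for kind, item in steps
    )

subst, steps = next(prove([("grandparent", "alice", "Who")], {}))
print(walk("Who", subst), verify(steps))
```

In this sketch the proof trace is a list of cited facts and rules, so checking it is a set-membership test. That is only a simplified stand-in for the kind of automated verification of intermediate results that the abstract describes.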
