CL · Feb 18, 2025

Improving Chain-of-Thought Reasoning via Quasi-Symbolic Abstractions

Leonardo Ranaldi, Marco Valentino, Andrè Freitas

arXiv:2502.12616v222.634 citations · h-index: 14 · ACL

Originality: Incremental advance

AI Analysis

The work addresses robustness and faithfulness issues in LLM reasoning, but the advance is incremental because it builds on existing CoT and symbolic approaches.

The paper tackles the problem of content biases in Chain-of-Thought reasoning for Large Language Models by introducing QuaSAR, a quasi-symbolic abstraction method that disentangles content from logical reasoning without full formalization, resulting in up to 8% accuracy improvements on tasks like MMLU-Redux and GSM-Symbolic.

Chain-of-Thought (CoT) represents a common strategy for reasoning in Large Language Models (LLMs) by decomposing complex tasks into intermediate inference steps. However, explanations generated via CoT are susceptible to content biases that negatively affect their robustness and faithfulness. To mitigate existing limitations, recent work has proposed using logical formalisms coupled with external symbolic solvers. However, fully symbolic approaches possess the bottleneck of requiring a complete translation from natural language to formal languages, a process that affects efficiency and flexibility. To achieve a trade-off, this paper investigates methods to disentangle content from logical reasoning without a complete formalisation. In particular, we present QuaSAR (for Quasi-Symbolic Abstract Reasoning), a variation of CoT that guides LLMs to operate at a higher level of abstraction via quasi-symbolic explanations. Our framework leverages the capability of LLMs to formalise only relevant variables and predicates, enabling the coexistence of symbolic elements with natural language. We show the impact of QuaSAR for in-context learning and for constructing demonstrations to improve the reasoning capabilities of smaller models. Our experiments show that quasi-symbolic abstractions can improve CoT-based methods by up to 8% accuracy, enhancing robustness and consistency on challenging adversarial variations on both natural language (i.e., MMLU-Redux) and symbolic reasoning tasks (i.e., GSM-Symbolic).
