CL · AI · LG · Nov 10, 2025

Think Consistently, Reason Efficiently: Energy-Based Calibration for Implicit Chain-of-Thought

arXiv:2511.07124v1 · 1 citation · h-index: 2
AI Analysis

This paper addresses inconsistent reasoning in LLMs on multi-step reasoning tasks and represents an incremental improvement over existing implicit CoT methods.

The paper tackles the problem of inconsistent reasoning in implicit Chain-of-Thought methods for Large Language Models by proposing EBM-CoT, an Energy-Based Calibration framework that refines latent thought representations. The authors report significant improvements in reasoning accuracy and consistency across mathematical, commonsense, and symbolic reasoning benchmarks.

Large Language Models (LLMs) have demonstrated strong reasoning capabilities through Chain-of-Thought (CoT) prompting, which enables step-by-step intermediate reasoning. However, explicit CoT methods rely on discrete token-level reasoning processes that are prone to error propagation and limited by vocabulary expressiveness, often resulting in rigid and inconsistent reasoning trajectories. Recent research has explored implicit or continuous reasoning in latent spaces, allowing models to perform internal reasoning before generating explicit output. Although such approaches alleviate some limitations of discrete CoT, they generally lack explicit mechanisms to enforce consistency among reasoning steps, leading to divergent reasoning paths and unstable outcomes. To address this issue, we propose EBM-CoT, an Energy-Based Chain-of-Thought Calibration framework that refines latent thought representations through an energy-based model (EBM). Our method dynamically adjusts latent reasoning trajectories toward lower-energy, high-consistency regions in the embedding space, improving both reasoning accuracy and consistency without modifying the base language model. Extensive experiments across mathematical, commonsense, and symbolic reasoning benchmarks demonstrate that the proposed framework significantly enhances the consistency and efficiency of multi-step reasoning in LLMs.
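The abstract describes moving latent thoughts toward lower-energy regions without changing the base model. The following is a minimal sketch of that general idea, not the paper's method: the quadratic energy, the names `energy`, `energy_grad`, and `calibrate`, and every hyperparameter are illustrative assumptions. The abstract does not specify the actual energy parameterization or update rule. Here the energy penalizes jumps between consecutive latent thought vectors and drift from the vectors originally produced by the (frozen) model, and calibration is plain gradient descent with optional Gaussian noise in the style of Langevin dynamics.

```python
import numpy as np

def energy(z, z_init, lam=0.1):
    """Toy energy (assumed, not from the paper): low when consecutive latent
    thoughts agree and the sequence stays near the model's original thoughts."""
    smooth = 0.5 * np.sum((z[1:] - z[:-1]) ** 2)    # step-to-step consistency
    anchor = 0.5 * lam * np.sum((z - z_init) ** 2)  # stay close to the originals
    return float(smooth + anchor)

def energy_grad(z, z_init, lam=0.1):
    """Analytic gradient of the toy energy with respect to z."""
    grad = lam * (z - z_init)
    diff = z[1:] - z[:-1]
    grad[1:] += diff    # d/dz_{t+1} of 0.5 * ||z_{t+1} - z_t||^2
    grad[:-1] -= diff   # d/dz_t     of 0.5 * ||z_{t+1} - z_t||^2
    return grad

def calibrate(z_init, steps=50, step_size=0.1, noise_scale=0.0, lam=0.1, seed=0):
    """Refine latent thoughts by descending the energy. Only z changes;
    nothing here touches the weights of the model that produced z_init."""
    rng = np.random.default_rng(seed)
    z = z_init.copy()
    for _ in range(steps):
        z = z - step_size * energy_grad(z, z_init, lam)
        if noise_scale > 0:
            # Optional Langevin-style exploration noise (hypothetical setting).
            z = z + noise_scale * rng.standard_normal(z.shape)
    return z

if __name__ == "__main__":
    # Four hypothetical latent thought vectors of dimension 8.
    z0 = np.random.default_rng(1).standard_normal((4, 8))
    z = calibrate(z0)
    print("energy before:", energy(z0, z0))
    print("energy after: ", energy(z, z0))
```

In a real system the hand-written energy would presumably be replaced by a learned network and the gradient obtained by automatic differentiation; the sketch only illustrates why refinement can leave the base language model untouched, since the update acts on the latent vectors alone.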

