CL · Nov 4, 2025

CGES: Confidence-Guided Early Stopping for Efficient and Accurate Self-Consistency

Ehsan Aghazadeh, Ahmad Ghasemi, Hedyeh Beyhaghi, Hossein Pishro-Nik

arXiv:2511.02603v1 · 3 citations · h-index: 10

Originality: Incremental advance

AI Analysis

CGES targets the computational cost of repeated sampling that LLM users face on reasoning tasks, offering an incremental improvement over existing self-consistency methods.

The paper tackles the inefficiency of self-consistency in large language models by introducing Confidence-Guided Early Stopping (CGES), which reduces model calls by about 69% while staying within 0.06 percentage points of self-consistency accuracy across five reasoning benchmarks.

Large language models (LLMs) are often queried multiple times at test time, with predictions aggregated by majority vote. While effective, this self-consistency strategy (arXiv:2203.11171) requires a fixed number of calls and can fail when the correct answer is rare. We introduce Confidence-Guided Early Stopping (CGES), a Bayesian framework that forms posteriors over candidate answers using scalar confidence signals derived from token probabilities or reward models. CGES adaptively halts sampling once the posterior mass of a candidate exceeds a threshold. We provide theoretical guarantees for both perfectly calibrated confidences and realistic noisy confidence signals. Across five reasoning benchmarks, CGES reduces the average number of model calls by about 69 percent (for example, from 16.0 to 4.9) while matching the accuracy of self-consistency within 0.06 percentage points.
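The stopping rule described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the likelihood model is an assumption made for illustration (a sample with confidence c is treated as correct with probability c, and otherwise as uniformly spread over the other answers), and the names `cges_sketch`, `sample_fn`, and `num_candidates` are hypothetical.

```python
import math
from typing import Callable, Dict, Hashable, Tuple


def cges_sketch(
    sample_fn: Callable[[], Tuple[Hashable, float]],
    threshold: float = 0.95,
    max_calls: int = 16,
    num_candidates: int = 10,
) -> Tuple[Hashable, float, int]:
    """Sample until one answer's posterior mass reaches `threshold`.

    ASSUMPTIONS (not from the paper): a uniform prior over `num_candidates`
    possible answers, and a likelihood where a sample (answer, conf) is
    correct with probability conf and otherwise uniform over the rest.
    `sample_fn` is a hypothetical callable returning (answer, confidence).
    """
    if num_candidates < 2 or max_calls < 1:
        raise ValueError("need num_candidates >= 2 and max_calls >= 1")
    eps = 1e-6
    log_like: Dict[Hashable, float] = {}  # log-likelihood per seen answer
    log_unseen = 0.0  # log-likelihood of any single not-yet-seen answer
    best, posterior = None, 0.0

    for calls in range(1, max_calls + 1):
        answer, conf = sample_fn()
        conf = min(max(conf, eps), 1.0 - eps)  # keep the logs finite
        log_hit = math.log(conf)
        log_miss = math.log((1.0 - conf) / (num_candidates - 1))

        # A newly seen answer inherits the history of an unseen hypothesis.
        if answer not in log_like:
            log_like[answer] = log_unseen
        for hyp in log_like:
            log_like[hyp] += log_hit if hyp == answer else log_miss
        log_unseen += log_miss

        # Normalize over seen answers plus the pooled unseen answers.
        shift = max(max(log_like.values()), log_unseen)
        weights = {h: math.exp(v - shift) for h, v in log_like.items()}
        n_unseen = max(num_candidates - len(log_like), 0)
        total = sum(weights.values()) + n_unseen * math.exp(log_unseen - shift)

        best = max(weights, key=weights.get)
        posterior = weights[best] / total
        if posterior >= threshold:
            return best, posterior, calls  # early stop

    return best, posterior, max_calls  # budget exhausted
```

Under these assumptions, consistent high-confidence samples end sampling after a few calls, while conflicting low-confidence samples use the full budget, which mirrors the adaptive behavior the abstract reports. The confidence values would come from token probabilities or a reward model, as the abstract states.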
