CL · Oct 15, 2025

ERGO: Entropy-guided Resetting for Generation Optimization in Multi-turn Language Models

arXiv:2510.14077v1 · 4 citations · h-index: 3 · Proceedings of the 2nd Workshop on Uncertainty-Aware NLP (UncertaiNLP 2025)
Originality: Highly original
AI Analysis

The paper addresses a severe challenge to the real-world usability of conversational AI: it improves both accuracy and reliability in multi-turn interactions.

The paper tackles performance degradation in multi-turn conversations with large language models when information is revealed incrementally, and introduces ERGO, an entropy-guided method that yields a 56.6% average performance gain, increases aptitude by 24.7%, and decreases unreliability by 35.3%.

Large Language Models (LLMs) suffer significant performance degradation in multi-turn conversations when information is presented incrementally. Given that multi-turn conversations characterize everyday interactions with LLMs, this degradation poses a severe challenge to real-world usability. We hypothesize that abrupt increases in model uncertainty signal misalignment in multi-turn LLM interactions, and we exploit this insight to dynamically realign conversational context. We introduce ERGO (Entropy-guided Resetting for Generation Optimization), which continuously quantifies internal uncertainty via Shannon entropy over next-token distributions and triggers adaptive prompt consolidation when a sharp spike in entropy is detected. By treating uncertainty as a first-class signal rather than a nuisance to eliminate, ERGO embraces variability in language and modeling by representing and responding to uncertainty. In multi-turn tasks with incrementally revealed instructions, ERGO yields a 56.6% average performance gain over standard baselines, increases aptitude (peak performance capability) by 24.7%, and decreases unreliability (variability in performance) by 35.3%, demonstrating that uncertainty-aware interventions can improve both accuracy and reliability in conversational AI.
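The entropy-spike trigger described in the abstract can be illustrated with a minimal sketch. Everything beyond "Shannon entropy over next-token distributions" and "consolidate when entropy spikes" is an assumption made for illustration: averaging token entropies per turn, comparing against the previous turn with a fixed `threshold`, and the hypothetical `consolidate` helper (which here only concatenates the user turns) are not taken from the paper.

```python
import math
from typing import List, Sequence


def shannon_entropy(probs: Sequence[float]) -> float:
    """Shannon entropy (in nats) of a single next-token distribution."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)


def mean_token_entropy(distributions: Sequence[Sequence[float]]) -> float:
    """Average entropy over the next-token distributions of one generated turn.

    Assumption: per-turn uncertainty is the mean of per-token entropies.
    """
    if not distributions:
        return 0.0
    return sum(shannon_entropy(d) for d in distributions) / len(distributions)


def should_reset(prev_entropy: float, curr_entropy: float, threshold: float) -> bool:
    """Flag a spike when entropy rises by more than `threshold` between turns.

    Assumption: `threshold` is a tunable hyperparameter; its value is illustrative.
    """
    return (curr_entropy - prev_entropy) > threshold


def consolidate(user_turns: List[str]) -> str:
    """Hypothetical stand-in for prompt consolidation: merge all user turns
    into one prompt so the model can restart from a clean context."""
    return " ".join(user_turns)


# Toy example: a confident turn followed by a much less certain one.
turn_1 = [[0.9, 0.1], [0.95, 0.05]]   # peaked distributions, low entropy
turn_2 = [[0.5, 0.5], [0.4, 0.6]]     # flat distributions, high entropy
h1 = mean_token_entropy(turn_1)
h2 = mean_token_entropy(turn_2)

user_turns = ["Write a function.", "It should sort a list.", "Descending order."]
if should_reset(h1, h2, threshold=0.3):
    new_prompt = consolidate(user_turns)  # restart the conversation from this prompt
```

In a real system the distributions would come from the model's softmax outputs at each decoding step, and the consolidation step would likely rewrite the accumulated instructions rather than simply join them.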

