AI | Jan 30

Decoding in Geometry: Alleviating Embedding-Space Crowding for Complex Reasoning

Peking U
arXiv:2601.22536v1
h-index: 18
Originality: Incremental advance
AI Analysis

This paper addresses a specific bottleneck in LLM decoding for complex reasoning tasks. It offers a plug-and-play solution that is an incremental advance over existing decoding methods.

The paper tackles embedding-space crowding in large language models during sampling-based decoding, a phenomenon the authors find to be statistically associated with reasoning success. It proposes CraEG, a geometry-guided reweighting method that improves generation performance, with gains in robustness and diversity metrics.

Sampling-based decoding underlies complex reasoning in large language models (LLMs), where decoding strategies critically shape model behavior. Temperature- and truncation-based methods reshape the next-token distribution through global probability reweighting or thresholding to balance the quality-diversity tradeoff. However, they operate solely on token probabilities, ignoring fine-grained relationships among tokens in the embedding space. We uncover a novel phenomenon, embedding-space crowding, where the next-token distribution concentrates its probability mass on geometrically close tokens in the embedding space. We quantify crowding at multiple granularities and find a statistical association with reasoning success in mathematical problem solving. Motivated by this finding, we propose CraEG, a plug-and-play sampling method that mitigates crowding through geometry-guided reweighting. CraEG is training-free, single-pass, and compatible with standard sampling strategies. Experiments on multiple models and benchmarks demonstrate improved generation performance, with gains in robustness and diversity metrics.
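The abstract does not give CraEG's formulas, so the following is only a minimal sketch of the general idea under stated assumptions, not the authors' method. It assumes crowding can be approximated by the probability-weighted mean pairwise cosine similarity among candidate tokens, and that reweighting can be done with an exponential penalty on each token's probability-weighted similarity to its neighbours. The names `crowding_score`, `geometry_reweight`, and the strength parameter `alpha` are hypothetical.

```python
import math


def cosine(u, v):
    """Cosine similarity between two vectors given as lists of floats."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u > 0 and norm_v > 0 else 0.0


def crowding_score(probs, embs):
    """ASSUMED crowding measure: probability-weighted mean pairwise cosine
    similarity over distinct candidate tokens (higher means more crowded)."""
    num = den = 0.0
    n = len(probs)
    for i in range(n):
        for j in range(n):
            if i != j:
                weight = probs[i] * probs[j]
                num += weight * cosine(embs[i], embs[j])
                den += weight
    return num / den if den > 0 else 0.0


def geometry_reweight(probs, embs, alpha=1.0):
    """ASSUMED reweighting rule (not taken from the paper): down-weight each
    token by how much probability mass sits on tokens similar to it, then
    renormalise. Runs in a single pass over the candidate set."""
    n = len(probs)
    scores = []
    for i in range(n):
        penalty = sum(
            probs[j] * max(0.0, cosine(embs[i], embs[j]))
            for j in range(n)
            if j != i
        )
        scores.append(probs[i] * math.exp(-alpha * penalty))
    total = sum(scores)
    return [s / total for s in scores]


# Toy candidate set: tokens 0 and 1 are near-duplicates in embedding space,
# token 2 points in a different direction.
probs = [0.40, 0.35, 0.25]
embs = [[1.0, 0.0], [0.99, 0.1], [0.0, 1.0]]
new_probs = geometry_reweight(probs, embs, alpha=1.0)
print(crowding_score(probs, embs), crowding_score(new_probs, embs))
print(new_probs)
```

In this toy case the reweighted distribution shifts mass toward the geometrically isolated token and the crowding score drops. Because the function takes and returns a probability vector, a rule of this shape could in principle be applied after temperature scaling or top-k/top-p truncation, which is one way to read the abstract's claim of compatibility with standard sampling strategies.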

