CL · Nov 18, 2025

Entropy-Guided Reasoning Compression

Hourun Zhu, Yang Gao, Wenlong Fei, Jiawei Li, Huashan Sun

arXiv:2511.14258v1 · 2 citations

Originality: Incremental advance

AI Analysis

This work addresses a practical bottleneck in deploying reasoning models: long chain-of-thought outputs raise computation cost and hurt deployability. It is an incremental improvement over existing compression methods.

The paper tackles excessive reasoning length in large reasoning models by addressing the entropy conflict that arises during compression training. It compresses reasoning to 20% of the original length while maintaining or improving accuracy on six mathematical benchmarks.

Large reasoning models have demonstrated remarkable performance on complex reasoning tasks, yet the excessive length of their chain-of-thought outputs remains a major practical bottleneck due to high computation cost and poor deployability. Existing compression methods have achieved partial success but overlook a crucial phenomenon in the training process -- the entropy conflict. During compression training, entropy decreases, leading to shorter reasoning but limited exploration, while accuracy-oriented objectives increase entropy, lengthening reasoning chains. This can cause the model to get stuck in a local dilemma. Our analysis further reveals the origin of the entropy conflict: many high-entropy tokens are logical connectors that receive larger gradients and are encouraged under the performance objective, while the compression objective simultaneously penalizes these potentially redundant connectors. This opposing pressure creates a direct source of entropy conflict. To address these issues, we adopt an entropy-guided training framework. As entropy descends, the model is guided toward efficient reasoning by encouraging concise thought steps; as entropy rises, exploration is reinforced under the compact reasoning mode to improve robustness. Experiments on six mathematical benchmarks show that our method compresses reasoning length to 20% of the original while maintaining or even surpassing baseline accuracy. Code and models will be released publicly.
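To make the entropy signal in the abstract concrete, here is a minimal sketch of how per-token entropy can be computed from next-token logits, together with a toy rule that picks a training phase from the entropy trend. The entropy formula is standard Shannon entropy over the softmax distribution. The `choose_phase` rule and its phase names are hypothetical illustrations of "as entropy descends ... as entropy rises"; they are assumptions, not the authors' released implementation.

```python
import math


def token_entropy(logits):
    """Shannon entropy (in nats) of the softmax distribution for one token."""
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in logits]
    z = sum(exps)
    probs = [e / z for e in exps]
    return -sum(p * math.log(p) for p in probs if p > 0.0)


def mean_entropy(step_logits):
    """Average token entropy over a sequence of per-token logit vectors."""
    return sum(token_entropy(l) for l in step_logits) / len(step_logits)


def choose_phase(prev_entropy, curr_entropy):
    """Hypothetical rule (an assumption, not from the paper):
    favor conciseness while entropy falls, exploration while it rises."""
    return "compress" if curr_entropy < prev_entropy else "explore"


# A confident token has low entropy; a uniform one has the maximum, ln(4).
confident = [10.0, 0.0, 0.0, 0.0]
uniform = [0.0, 0.0, 0.0, 0.0]
print(token_entropy(confident) < token_entropy(uniform))
print(choose_phase(prev_entropy=1.2, curr_entropy=0.8))
```

In this picture, the logical connectors the abstract describes would be tokens whose logits look more like `uniform` than `confident`, which is why a length penalty and an accuracy reward can pull on them in opposite directions.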
