TRiMS: Real-Time Tracking of Minimal Sufficient Length for Efficient Reasoning via RL
This work targets the computational cost of long chain-of-thought reasoning in large language models, proposing a new method for a known efficiency bottleneck.
The paper tackles computational redundancy in the reasoning of large language models. It introduces the Minimal Sufficient Length (MSL) metric, which defines the shortest reasoning length that preserves correctness, and proposes TRiMS, which reduces chain-of-thought tokens by over 80% while slightly improving accuracy across benchmarks.
Large language models achieve breakthroughs in complex reasoning via long chain-of-thought (CoT) sequences. However, this often leads to severe reasoning inflation, causing substantial computational redundancy. To maximize Intelligence per Token, we introduce a theoretical metric, Minimal Sufficient Length (MSL). MSL rigorously characterizes the shortest reasoning length that preserves answer correctness. We provide a recursive definition based on independently sampled sequences and prove the existence of its limit, establishing the first measurable lower bound for reasoning-chain compression. Building on an analysis of mainstream CoT compression strategies, we identify key structural factors that enable a model to approach MSL. Based on these insights, we propose TRiMS, which employs the GRPO algorithm in conjunction with MSL-based estimation during training, while mitigating training instabilities through dynamic batch aggregation and advantage computation using batch-level standard deviation. TRiMS achieves over 80% CoT token reduction with a minor accuracy boost across all benchmarks.
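As a rough illustration of the advantage computation mentioned in the abstract, the sketch below contrasts per-group normalization (as commonly used in GRPO) with normalization by a batch-level standard deviation. It is not the paper's implementation: the function names, the reward values, and the exact formula (group mean subtracted, then divided by the standard deviation of all rewards in the batch) are assumptions made for illustration only.

```python
from statistics import mean, pstdev


def group_std_advantages(groups, eps=1e-6):
    """Per-group normalization: each prompt's rewards are centered on the
    group mean and scaled by that group's own standard deviation."""
    out = []
    for rewards in groups:
        mu, sd = mean(rewards), pstdev(rewards)
        out.append([(r - mu) / (sd + eps) for r in rewards])
    return out


def batch_std_advantages(groups, eps=1e-6):
    """Assumed batch-level variant: rewards are still centered on the group
    mean, but scaled by one standard deviation over the whole batch."""
    all_rewards = [r for rewards in groups for r in rewards]
    batch_sd = pstdev(all_rewards)
    out = []
    for rewards in groups:
        mu = mean(rewards)
        out.append([(r - mu) / (batch_sd + eps) for r in rewards])
    return out


# Hypothetical rewards: group 0 is nearly uniform, group 1 is mixed.
groups = [[1.0, 1.0, 1.0, 0.99], [1.0, 0.0, 1.0, 0.0]]
per_group = group_std_advantages(groups)
per_batch = batch_std_advantages(groups)

# The 0.01 reward gap in group 0 becomes a large advantage under per-group
# scaling, but stays small when scaled by the batch-level deviation.
print(round(per_group[0][3], 2), round(per_batch[0][3], 3))
```

Under these assumptions, a group whose rewards barely differ no longer produces outsized advantages, which is one plausible reading of why a batch-level standard deviation could stabilize training.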