CL AI May 27, 2025

Self-Route: Automatic Mode Switching via Capability Estimation for Efficient Reasoning

Yang He, Xiao Ding, Bibo Cai, Yufei Zhang, Kai Xiong, Zhouhao Sun, Bing Qin, Ting Liu

arXiv:2505.20664v1 13.98 citations h-index: 15

Originality: Incremental advance

AI Analysis

The work addresses efficiency issues for users of large language models in resource-constrained environments, though the advance is incremental because it builds on existing reasoning paradigms.

The paper tackles inefficient token consumption in reasoning-augmented large language models (RLLMs) on simpler tasks. It proposes Self-Route, a dynamic reasoning framework that automatically switches between general and reasoning modes based on capability estimation, and reports accuracy comparable to reasoning models with 30-55% lower token consumption across benchmarks.

While reasoning-augmented large language models (RLLMs) significantly enhance complex task performance through extended reasoning chains, they inevitably introduce substantial unnecessary token consumption, particularly for simpler problems where Short Chain-of-Thought (Short CoT) suffices. This overthinking phenomenon leads to inefficient resource usage without proportional accuracy gains. To address this issue, we propose Self-Route, a dynamic reasoning framework that automatically selects between general and reasoning modes based on model capability estimation. Our approach introduces a lightweight pre-inference stage to extract capability-aware embeddings from hidden layer representations, enabling real-time evaluation of the model's ability to solve problems. We further construct Gradient-10K, a model difficulty estimation-based dataset with dense complexity sampling, to train the router for precise capability boundary detection. Extensive experiments demonstrate that Self-Route achieves comparable accuracy to reasoning models while reducing token consumption by 30-55% across diverse benchmarks. The proposed framework demonstrates consistent effectiveness across models with different parameter scales and reasoning paradigms, highlighting its general applicability and practical value.
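The routing step described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes the capability-aware embedding is already available as a plain list of floats, and it swaps in a simple logistic scorer for the trained router. The class name `LinearRouter`, the weights, and the 0.5 threshold are all hypothetical values chosen for illustration.

```python
import math
from dataclasses import dataclass
from typing import Sequence


@dataclass
class LinearRouter:
    """Hypothetical logistic router over a capability embedding (illustrative only)."""

    weights: Sequence[float]
    bias: float = 0.0
    threshold: float = 0.5  # assumed cut-off; a real router would tune this

    def solve_probability(self, embedding: Sequence[float]) -> float:
        """Estimate the probability that general (Short CoT) mode is sufficient."""
        if len(embedding) != len(self.weights):
            raise ValueError("embedding and weights must have the same length")
        logit = self.bias + sum(w * x for w, x in zip(self.weights, embedding))
        # Numerically stable sigmoid: avoids overflow for large |logit|.
        if logit >= 0:
            return 1.0 / (1.0 + math.exp(-logit))
        e = math.exp(logit)
        return e / (1.0 + e)

    def route(self, embedding: Sequence[float]) -> str:
        """Pick 'general' when the model is judged capable, otherwise 'reasoning'."""
        if self.solve_probability(embedding) >= self.threshold:
            return "general"
        return "reasoning"


# Made-up weights and embeddings, purely to exercise the routing rule.
router = LinearRouter(weights=[2.0, -1.0], bias=0.0)
easy = router.route([1.5, 0.2])   # logit = 2.8, high estimated capability
hard = router.route([-0.5, 1.0])  # logit = -2.0, low estimated capability
print(easy, hard)
```

In this sketch a query is sent to the cheaper general mode only when the estimated solve probability reaches the threshold; raising the threshold would trade token savings for a lower risk of routing a hard problem to the short mode.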
