LG AI Nov 5, 2025

Optimizing Reasoning Efficiency through Prompt Difficulty Prediction

Bo Zhao, Berkcan Kapusuzoglu, Kartik Balasubramaniam, Sambit Sahu, Supriyo Chakraborty, Genta Indra Winata

arXiv:2511.03808v1 4.12 citations h-index: 42

Originality: Incremental advance

AI Analysis

This work addresses cost-efficient deployment of reasoning models, but the advance is incremental: it builds on existing routing and prediction techniques.

The paper tackles the high computational cost of deploying large reasoning language models by proposing a routing method that assigns problems to the smallest model likely to solve them, reducing compute while maintaining accuracy on math benchmarks.

Reasoning language models perform well on complex tasks but are costly to deploy due to their size and long reasoning traces. We propose a routing approach that assigns each problem to the smallest model likely to solve it, reducing compute without sacrificing accuracy. Using intermediate representations from s1.1-32B, we train lightweight predictors of problem difficulty or model correctness to guide routing across a pool of reasoning models. On diverse math benchmarks, routing improves efficiency over random assignment and matches s1.1-32B's performance while using significantly less compute. Our results demonstrate that difficulty-aware routing is effective for cost-efficient deployment of reasoning models.
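The routing idea in the abstract (send each problem to the smallest model likely to solve it) can be illustrated with a minimal sketch. This is not the authors' implementation: the `Candidate` type, the `route` function, the pool names and costs, and the fixed probability threshold are all hypothetical, and the per-model correctness probabilities are taken as given inputs rather than produced by a predictor trained on s1.1-32B representations.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass(frozen=True)
class Candidate:
    """A model in the routing pool (hypothetical structure)."""
    name: str
    cost: float  # relative compute cost, arbitrary units (assumption)


def route(candidates: Sequence[Candidate],
          p_correct: Sequence[float],
          threshold: float = 0.5) -> Candidate:
    """Pick the cheapest model whose predicted success probability
    meets the threshold; fall back to the most expensive model.

    p_correct[i] is the predicted probability that candidates[i]
    solves the problem. How these predictions are obtained is
    outside the scope of this sketch.
    """
    if not candidates:
        raise ValueError("candidate pool is empty")
    if len(candidates) != len(p_correct):
        raise ValueError("need one probability per candidate")

    # Consider models from cheapest to most expensive.
    ranked = sorted(zip(candidates, p_correct), key=lambda pair: pair[0].cost)
    for candidate, prob in ranked:
        if prob >= threshold:
            return candidate
    # No model looks likely to succeed: use the largest one.
    return ranked[-1][0]


# Hypothetical pool; names and costs are placeholders.
POOL = [
    Candidate("small", 1.0),
    Candidate("medium", 4.0),
    Candidate("large", 16.0),
]

easy_choice = route(POOL, [0.90, 0.95, 0.99])
hard_choice = route(POOL, [0.10, 0.20, 0.30])
```

Under these assumptions, a problem that the small model is predicted to solve never reaches the larger models, which is where the compute saving would come from. The threshold trades accuracy against cost, and falling back to the largest model is one of several reasonable defaults.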
