AI | Jan 7

ROI-Reasoning: Rational Optimization for Inference via Pre-Computation Meta-Cognition

arXiv:2601.03822v1 | 3 citations | h-index: 1

Originality: Incremental advance

AI Analysis

This work addresses the efficient allocation of inference-time computation for LLMs. Its contribution is an incremental improvement in optimizing resource usage for specific tasks.

The paper addresses the problem that LLMs do not know how much computation a task requires. It studies this problem under a strict global token constraint, formalizes it as an Ordered Stochastic Multiple-Choice Knapsack Problem (OS-MCKP), and proposes ROI-Reasoning, a two-stage framework that improves overall score and reduces regret on budgeted mathematical reasoning benchmarks.

Large language models (LLMs) can achieve strong reasoning performance with sufficient computation, but they do not inherently know how much computation a task requires. We study budgeted inference-time reasoning for multiple tasks under a strict global token constraint and formalize it as an Ordered Stochastic Multiple-Choice Knapsack Problem (OS-MCKP). This perspective highlights a meta-cognitive requirement: anticipating task difficulty, estimating return over investment (ROI), and allocating computation strategically. We propose ROI-Reasoning, a two-stage framework that endows LLMs with intrinsic, budget-aware rationality. In the first stage, Meta-Cognitive Fine-Tuning teaches models to predict reasoning cost and expected utility before generation, enabling explicit solve-or-skip decisions. Next, Rationality-Aware Reinforcement Learning optimizes sequential decision making under a hard token budget, allowing models to learn long-horizon allocation strategies. Across budgeted mathematical reasoning benchmarks, ROI-Reasoning consistently improves overall score while substantially reducing regret under tight computation budgets.
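To make the knapsack framing concrete, the following is a minimal sketch of a simplified, deterministic version of the allocation problem. It is not the paper's method. It assumes each task offers a few options, each a hypothetical (token cost, expected utility) pair, with a zero-cost "skip" option always available, and that these numbers are known in advance. The names `best_allocation` and `greedy_in_order`, the example numbers, and the definition of regret as the gap to the exact optimum are all illustrative assumptions.

```python
from functools import lru_cache
from typing import List, Tuple

# One option for a task: (token_cost, expected_utility).
# (0, 0.0) is the "skip" option; every task is assumed to include it.
Option = Tuple[int, float]


def best_allocation(tasks: List[List[Option]], budget: int) -> Tuple[float, List[int]]:
    """Exact dynamic program over (task index, remaining tokens).

    Returns the best total expected utility and the chosen option index per task.
    """

    @lru_cache(maxsize=None)
    def solve(i: int, remaining: int) -> Tuple[float, Tuple[int, ...]]:
        if i == len(tasks):
            return 0.0, ()
        best_value, best_plan = float("-inf"), ()
        for k, (cost, utility) in enumerate(tasks[i]):
            if cost > remaining:
                continue  # option does not fit in the remaining budget
            future_value, future_plan = solve(i + 1, remaining - cost)
            value = utility + future_value
            if value > best_value:
                best_value, best_plan = value, (k,) + future_plan
        return best_value, best_plan

    value, plan = solve(0, budget)
    return value, list(plan)


def greedy_in_order(tasks: List[List[Option]], budget: int) -> Tuple[float, List[int]]:
    """Myopic baseline: on each task, take the highest-utility option that still fits."""
    total, remaining, plan = 0.0, budget, []
    for options in tasks:
        affordable = [(u, k) for k, (c, u) in enumerate(options) if c <= remaining]
        utility, k = max(affordable)
        total += utility
        remaining -= options[k][0]
        plan.append(k)
    return total, plan


# Hypothetical example: three tasks seen in order, 1000 tokens in total.
tasks = [
    [(0, 0.0), (300, 0.5), (800, 0.6)],  # long reasoning adds little here
    [(0, 0.0), (200, 0.7)],
    [(0, 0.0), (500, 0.9)],
]
budget = 1000

oracle_value, oracle_plan = best_allocation(tasks, budget)
greedy_value, greedy_plan = greedy_in_order(tasks, budget)
regret = oracle_value - greedy_value  # gap to the exact optimum (assumed definition)
print(oracle_plan, greedy_plan, round(regret, 3))
```

In this toy instance the myopic policy spends most of the budget on the first task and has to skip the last one, while the exact solution spends less on the first task and solves all three. The setting described in the abstract is harder than this sketch, since costs and utilities are uncertain and must be predicted by the model before generation.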
