Learning How to Use Tools, Not Just When: Pattern-Aware Tool-Integrated Reasoning
This work addresses a specific bottleneck in tool-integrated reasoning for complex math problems, namely how tools are applied, and offers incremental improvements over prior methods.
The paper studies tool-integrated reasoning in large reasoning models, focusing on how tools are applied rather than only when they are invoked. It identifies two code-usage patterns, calculator and algorithmic, and argues that a misaligned choice between them can cause failures even when the underlying reasoning is sound. It proposes a two-stage framework that improves code usage and accuracy, raising Code@1 on MATH500 from 64.0% to 70.5% and on AIME24 from 26.7% to 50.0%.
Tool-integrated reasoning (TIR) has become a key approach for improving large reasoning models (LRMs) on complex problems. Prior work has mainly studied when to invoke tools, while overlooking how tools are applied. We identify two common patterns: a calculator pattern that uses code for direct computation, and an algorithmic pattern that encodes problems as programs. Misaligned choices often cause failures even when reasoning is sound. We propose a two-stage framework that first builds code competence from both patterns and then aligns pattern selection with teacher preferences. Across challenging math datasets, our pattern-aware method substantially improves both code usage and accuracy, for instance raising Code@1 on MATH500 from 64.0% to 70.5% and on AIME24 from 26.7% to 50.0%. These gains highlight the effectiveness of a pattern-aware approach for tool-integrated reasoning.
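To make the two patterns concrete, here is a minimal Python sketch on a toy counting problem. The problem and the function names `calculator_pattern` and `algorithmic_pattern` are illustrative assumptions and are not taken from the paper; the sketch only shows how the same question can be handed to code in two different ways.

```python
# Toy problem (hypothetical, not from the paper):
# how many positive integers n <= limit are divisible by 3 or 5?


def calculator_pattern(limit: int) -> int:
    """Calculator pattern: the reasoning is done in text
    (inclusion-exclusion), and code only evaluates the arithmetic."""
    # |A or B| = |A| + |B| - |A and B|, with the intersection being multiples of 15
    return limit // 3 + limit // 5 - limit // 15


def algorithmic_pattern(limit: int) -> int:
    """Algorithmic pattern: the problem statement itself is encoded
    as a program that enumerates and checks every candidate."""
    return sum(1 for n in range(1, limit + 1) if n % 3 == 0 or n % 5 == 0)


if __name__ == "__main__":
    print(calculator_pattern(1000))   # 467
    print(algorithmic_pattern(1000))  # 467
```

Both routes agree here, but they fail differently: the calculator route depends on the derivation being right, while the algorithmic route depends on the search being feasible and faithfully encoded. That difference is one plausible reading of why pattern choice matters for a given problem.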