AI, Feb 2

Evolving from Tool User to Creator via Training-Free Experience Reuse in Multimodal Reasoning

Xintian Shen, Jiawei Chen, Lihao Zheng, Hao Ma, Tao Wei, Kun Zhan

arXiv:2602.01983v1, 9.94 citations, h-index: 1

Originality: Highly original

AI Analysis

This work addresses the need for adaptive tool creation in multimodal reasoning systems. It is not an incremental improvement; it proposes a new approach to self-optimization.

The paper tackles the problem of fixed tools in Tool-Integrated Reasoning models failing in open-ended scenarios and proposes UCT, a training-free framework that enables agents to create and update tools automatically during inference, resulting in performance gains of +20.86% and +23.04% on multi-domain reasoning benchmarks.

Existing Tool-Integrated Reasoning (TIR) models have effectively extended the question-answering capabilities of LLMs by incorporating external tools. However, real-world scenarios present numerous open-ended problems for which fixed tools often fail to meet task requirements. Furthermore, without a self-optimization mechanism, erroneous tool outputs can mislead the LLM's responses. Constructing existing tools also entails significant manual effort, which constrains their applicability. Recognizing that the reasoning traces of LLMs encapsulate implicit problem-solving capabilities, we propose UCT, a novel training-free framework that transforms agents from tool users into tool creators. The approach harvests reasoning experiences and distills them into reusable assets, enabling adaptive tool creation and self-updating during inference. We also introduce a memory consolidation mechanism to maintain the tool library, ensuring high reusability of retained experiential memory for subsequent reasoning tasks. This automated tool construction paradigm continuously improves tool quality during reasoning, allowing the overall agent system to progress without additional training. Extensive experiments demonstrate that our method enhances the capabilities of TIR models. In particular, significant performance gains of +20.86% and +23.04% on benchmarks spanning multi-domain mathematical and scientific reasoning tasks validate the self-evolving capability of the agent.
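
To make the create, reuse, and consolidate loop described in the abstract more concrete, here is a minimal Python sketch. It is not the authors' implementation: the names `Tool`, `ToolLibrary`, `solve`, and `distill_triangle_area` are hypothetical, retrieval is a toy word-overlap match, the failure-rate pruning rule is an assumed stand-in for memory consolidation, and the `distill` callback stands in for whatever LLM step turns a reasoning trace into a tool.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Optional, Tuple


@dataclass
class Tool:
    """A reusable asset distilled from a reasoning trace (hypothetical structure)."""
    name: str
    description: str
    fn: Callable[..., object]
    uses: int = 0       # how many times the tool has been invoked
    failures: int = 0   # how many invocations were judged wrong


@dataclass
class ToolLibrary:
    tools: Dict[str, Tool] = field(default_factory=dict)

    def retrieve(self, query: str) -> Optional[Tool]:
        """Return the tool whose description shares the most words with the query."""
        query_words = set(query.lower().split())
        best, best_overlap = None, 0
        for tool in self.tools.values():
            overlap = len(query_words & set(tool.description.lower().split()))
            if overlap > best_overlap:
                best, best_overlap = tool, overlap
        return best

    def add_or_update(self, tool: Tool) -> None:
        """Create a new tool, or overwrite an existing one in place (self-update)."""
        existing = self.tools.get(tool.name)
        if existing is None:
            self.tools[tool.name] = tool
        else:
            existing.fn = tool.fn
            existing.description = tool.description
            existing.failures = 0

    def consolidate(self, max_failure_rate: float = 0.5) -> List[str]:
        """Assumed consolidation rule: drop tools that fail too often."""
        removed = [
            name for name, t in self.tools.items()
            if t.uses > 0 and t.failures / t.uses > max_failure_rate
        ]
        for name in removed:
            del self.tools[name]
        return removed


def solve(library: ToolLibrary, query: str, args: Tuple,
          distill: Callable[[str], Tool]) -> object:
    """Reuse a stored tool when one matches; otherwise create one at inference time."""
    tool = library.retrieve(query)
    if tool is None:
        tool = distill(query)          # stand-in for LLM-driven tool creation
        library.add_or_update(tool)
    tool.uses += 1
    return tool.fn(*args)


def distill_triangle_area(query: str) -> Tool:
    # Hypothetical distillation result; a real system would derive this from a trace.
    return Tool(
        name="triangle_area",
        description="area of a triangle from base and height",
        fn=lambda base, height: 0.5 * base * height,
    )


library = ToolLibrary()
first = solve(library, "find the area of a triangle", (3, 4), distill_triangle_area)
second = solve(library, "triangle area given base and height", (10, 2), distill_triangle_area)
```

The first call finds an empty library and creates the tool. The second call retrieves and reuses it, so the library still holds a single entry. No model weights change at any point, which is the sense in which such a loop is training-free.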
