Toolken+: Improving LLM Tool Usage with Reranking and a Reject Option
This work addresses tool-usage errors in large language models on tasks such as numerical reasoning; it is an incremental contribution that builds directly on ToolkenGPT.
The paper tackles two weaknesses of ToolkenGPT, its inability to use tool documentation and its frequent errors in deciding whether to use a tool at all, and reports improved performance on multistep numerical reasoning and tool selection tasks.
The recently proposed ToolkenGPT tool learning paradigm demonstrates promising performance but suffers from two major issues: first, it cannot benefit from tool documentation, and second, it often makes mistakes in deciding whether to use a tool at all. We introduce Toolken+, which mitigates the first problem by reranking the top $k$ tools selected by ToolkenGPT, and the second with a special "Reject" option: if "Reject" is ranked first, the model generates an ordinary vocabulary token instead of calling a tool. We demonstrate the effectiveness of Toolken+ on multistep numerical reasoning and tool selection tasks.
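
The decision rule described above can be illustrated with a minimal sketch. It is not the authors' implementation: the names `REJECT`, `top_k_tools`, and `choose_action`, the toy scores, and the idea of passing the reranker as a plain scoring function are all assumptions made for illustration. In practice the reranking scores would presumably come from a model that can read tool documentation.

```python
from typing import Callable, Dict, List, Optional

REJECT = "<reject>"  # hypothetical label standing in for the "Reject" option


def top_k_tools(tool_scores: Dict[str, float], k: int) -> List[str]:
    """Return the k highest-scoring tool names, best first."""
    return sorted(tool_scores, key=lambda name: tool_scores[name], reverse=True)[:k]


def choose_action(
    tool_scores: Dict[str, float],
    rerank_score: Callable[[str], float],
    k: int = 3,
) -> Optional[str]:
    """Rerank the top-k tools together with the Reject option.

    Returns the selected tool name, or None when Reject is ranked first,
    meaning the model should emit an ordinary vocabulary token instead.
    """
    candidates = top_k_tools(tool_scores, k) + [REJECT]
    ranked = sorted(candidates, key=rerank_score, reverse=True)
    best = ranked[0]
    return None if best == REJECT else best


# Toy first-stage scores (assumed values, standing in for ToolkenGPT's output).
first_stage = {"add": 0.5, "multiply": 0.3, "sqrt": 0.1, "log": 0.05}

# Two toy rerankers (assumed values): one prefers a tool, one prefers Reject.
prefers_tool = {"add": 0.2, "multiply": 0.9, REJECT: 0.4}
prefers_reject = {"add": 0.1, "multiply": 0.2, REJECT: 0.7}

tool_choice = choose_action(first_stage, lambda c: prefers_tool[c], k=2)
reject_choice = choose_action(first_stage, lambda c: prefers_reject[c], k=2)
```

In this sketch the first stage narrows the tool set to $k$ candidates, and "Reject" is always appended so that the reranker can overrule a spurious tool call regardless of which tools made the top $k$.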