ReTool
ReTool: Reinforcement Learning for Strategic Tool Use in LLMs
Tool use / function calling · first seen Apr 15, 2025
superseded — cited as a baseline and beaten by newer methods
1 paper critiques it · 1 paper beats it on benchmarks
What papers say
Verbatim critique sentences, each from a paper that cites ReTool as a baseline.
“While these works advance our understanding of exploration in language models, none simultaneously learn exploration policies that select among heterogeneous tools while rewarding both answer quality and trajectory diversity.”
— Step-wise Policy for Rare-tool Knowledge (SPaRK): Offline RL that Drives Diverse Tool Use in LLMs
Beaten on benchmarks
Head-to-head results where a newer method reports beating ReTool. Values are copied from the source paper's tables — verify against the cited paper.
- CoCoDA: Co-evolving Compositional DAG for Tool-Augmented Agents
  CoCoDA beats ReTool · 0.6B student
  GSM8K: 70.27 vs 68.85
  MATH: 40.13 vs 39.65
  WTQ: 25.68 vs 24.22
  FinQA: 11.24 vs 10.55
  EvalPlus: 28.47 vs 27.78
  MBPP: 47.38 vs 46.30
  CoCoDA beats ReTool · 1.7B student
  GSM8K: 84.36 vs 82.33
  MATH: 51.62 vs 50.49
  WTQ: 38.05 vs 37.59
  FinQA: 21.37 vs 20.09
  EvalPlus: 47.29 vs 46.80
  MBPP: 63.85 vs 62.22
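The size of each gap is easier to judge as a difference than as a pair of scores. The following is a minimal sketch, not part of the source page, that computes the CoCoDA minus ReTool margin for every head-to-head result listed above. The names `RESULTS` and `margins` are hypothetical, and the numbers are copied from this page, so they carry the same caveat: verify them against the cited paper.

```python
# Scores copied from the listing above as (CoCoDA, ReTool) pairs.
# RESULTS and margins are illustrative names, not from the cited paper.
RESULTS = {
    "0.6B": {
        "GSM8K": (70.27, 68.85),
        "MATH": (40.13, 39.65),
        "WTQ": (25.68, 24.22),
        "FinQA": (11.24, 10.55),
        "EvalPlus": (28.47, 27.78),
        "MBPP": (47.38, 46.30),
    },
    "1.7B": {
        "GSM8K": (84.36, 82.33),
        "MATH": (51.62, 50.49),
        "WTQ": (38.05, 37.59),
        "FinQA": (21.37, 20.09),
        "EvalPlus": (47.29, 46.80),
        "MBPP": (63.85, 62.22),
    },
}


def margins(results):
    """Return {(student, benchmark): CoCoDA minus ReTool}, rounded to 2 decimals."""
    return {
        (student, bench): round(cocoda - retool, 2)
        for student, rows in results.items()
        for bench, (cocoda, retool) in rows.items()
    }


if __name__ == "__main__":
    # Print the gaps from smallest to largest.
    ranked = sorted(margins(RESULTS).items(), key=lambda item: item[1])
    for (student, bench), delta in ranked:
        print(f"{student:>4} {bench:<8} +{delta:.2f}")
```

On these values every margin is positive and falls between 0.46 and 2.03 points. These are differences in reported scores only; the listing gives no variance or number of runs, so it does not say whether the gaps are statistically meaningful.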
Newer alternatives
Recent methods in the same sub-problem, not yet superseded in the knowledge base.
- May 8, 2026