Superseded baseline · #11 of 55 most-superseded
ART
ART: Automatic multi-step reasoning and tool-use for large language models
Tool use / function calling · first seen Mar 16, 2023
superseded — cited as a baseline and beaten by newer methods
1 paper critiques it · 1 beats it on benchmarks
What papers say
Verbatim critique sentences, each from a paper that cites ART as a baseline.
“Existing methods [paranjape2023art, guan2025deeprag] typically rely on coarse-grained retrieval strategies, which fail to account for the nuanced relationships between tool usage patterns and user objectives in multi-step function calling.”
— Self-Guided Function Calling in Large Language Models via Stepwise Experience Recall
Beaten on benchmarks
Head-to-head results where a newer method reports beating ART. Values are copied from the source paper's tables — verify against the cited paper.
- Self-Guided Function Calling in Large Language Models via Stepwise Experience Recall
SEER (Ours) beats ART · Accuracy [Easy Questions]
67.9 vs 57.1
- Self-Guided Function Calling in Large Language Models via Stepwise Experience Recall
SEER (Ours) beats ART · Accuracy [Hard Questions]
31.1 vs 23.7
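To put the two head-to-head results in context, the margins can be computed directly from the figures listed above. The following is a minimal sketch, not part of either paper: the `results` dictionary and the `margins` helper are hypothetical names introduced here for illustration, and the scores are assumed to be accuracy percentages as reported in the listing.

```python
# Accuracy values copied from the listing above (assumed to be percentages).
results = {
    "Easy Questions": {"SEER": 67.9, "ART": 57.1},
    "Hard Questions": {"SEER": 31.1, "ART": 23.7},
}


def margins(newer: float, baseline: float) -> tuple:
    """Return (absolute gain in points, relative gain in percent), rounded to 1 decimal."""
    absolute = newer - baseline
    relative = 100.0 * absolute / baseline
    return round(absolute, 1), round(relative, 1)


# Hypothetical summary structure: split name -> (absolute gain, relative gain).
summary = {split: margins(v["SEER"], v["ART"]) for split, v in results.items()}

for split, (abs_gain, rel_gain) in summary.items():
    print(f"{split}: +{abs_gain} points ({rel_gain}% relative to ART)")
```

The absolute gap is larger on the easy split, while the relative gap is larger on the hard split because ART's baseline accuracy there is lower. As with the values themselves, these margins should be checked against the cited paper's tables.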