Probe&Prefill
LLM Agents Already Know When to Call Tools -- Even Without Reasoning
Tool use / function calling · first seen May 10, 2026
superseded: cited as a baseline and beaten by newer methods
0 papers critique it · 1 paper beats it on benchmarks
Beaten on benchmarks
Head-to-head results where a newer method reports beating Probe&Prefill. Values are copied from the source paper's tables; verify them against the cited paper.
- ASA: Activation Steering for Tool-Calling Domain Adaptation
ASA beats Probe&Prefill · Overall First Call Accuracy [NESTFUL evaluation]
41.94 vs 36.02
- ASA: Activation Steering for Tool-Calling Domain Adaptation
ASA beats Probe&Prefill · Overall Missing Tool Rate, lower is better [NESTFUL evaluation]
6.72 vs 36.83
- ASA: Activation Steering for Tool-Calling Domain Adaptation
ASA beats Probe&Prefill · Overall Success [BFCL Multi-turn Prompt-mode]
38.75 vs 33.75
- ASA: Activation Steering for Tool-Calling Domain Adaptation
ASA beats Probe&Prefill · AST Accuracy [BFCL Single non-live]
95.60 vs 89.38
Newer alternatives
Recent methods in the same sub-problem, not yet superseded in the knowledge base.
- Feb 4, 2026