Superseded baseline · #38 of 55 most-superseded
LlamaFirewall
LlamaFirewall: An open source guardrail system for building secure AI agents
Tool use / function calling · first seen May 6, 2025
superseded — cited as a baseline and beaten by newer methods
1 paper critiques it · 0 beat it on benchmarks
What papers say
Verbatim critique sentences, each from a paper that cites LlamaFirewall as a baseline.
“Unlike existing guardrail framework such as LlamaFirewall, which abort tasks when prompt injection or unsafe behaviors are detected, our approach monitors each tool invocation in real time and provides feedback before execution, guiding safety-aware tool invocation reasoning in LLM-based agents.”
— ToolSafe: Enhancing Tool Invocation Safety of LLM-based agents via Proactive Step-level Guardrail and Feedback
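The contrast drawn in the critique above (aborting a task on detection versus checking each tool call before execution and feeding the result back to the agent) can be illustrated with a small sketch. This is a toy illustration only: it does not use or reproduce the LlamaFirewall or ToolSafe APIs, and every name in it (ToolCall, toy_check, run_abort_style, run_feedback_style, revise) is hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class ToolCall:
    name: str
    args: dict


class TaskAborted(Exception):
    """Raised when the abort-style guardrail stops the whole task."""


def toy_check(call: ToolCall) -> Optional[str]:
    # Hypothetical detector: returns a reason if the call looks unsafe, else None.
    if call.name == "shell" and "rm -rf" in call.args.get("cmd", ""):
        return "destructive shell command"
    return None


def run_abort_style(calls: List[ToolCall], check=toy_check) -> List[str]:
    # Abort-style: the first flagged call ends the task.
    executed = []
    for call in calls:
        reason = check(call)
        if reason is not None:
            raise TaskAborted(reason)
        executed.append(call.name)  # stand-in for actually running the tool
    return executed


def run_feedback_style(
    calls: List[ToolCall],
    revise: Callable[[ToolCall, str], Optional[ToolCall]],
    check=toy_check,
) -> List[str]:
    # Feedback-style: each call is checked before execution; a flagged call
    # is handed back with the reason so the agent can revise it.
    executed = []
    for call in calls:
        reason = check(call)
        if reason is not None:
            call = revise(call, reason)
            if call is None or check(call) is not None:
                continue  # skip this step only; the task keeps going
        executed.append(call.name)
    return executed
```

In this sketch, `revise` stands in for the agent's own reasoning after it receives the guardrail's feedback; returning `None` means the agent drops that step. A real system would replace `toy_check` with an actual detector and `executed.append` with the tool invocation itself.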
Newer alternatives
Recent methods in the same sub-problem, not yet superseded in the knowledge base.