CR · AI · Apr 27

A Comparative Evaluation of AI Agent Security Guardrails

Qi Li, Jiu Li, Pingtao Wei, Jianjun Xu, Xueyi Wei, Jiwei Shi, Xuan Zhang, Yanhui Yang, Xiaodong Hui, Peng Xu, Lingquan Zhou

arXiv:2604.24826 · 70.6 · h-index: 2

Predicted impact: top 22% in CR · last 90 days · Originality: Synthesis-oriented

AI Analysis

For developers deploying AI agents, this evaluation provides a benchmark of guardrail effectiveness, though it is a comparative product evaluation rather than a novel method.

This report compares DKnownAI Guard against three competing AI agent security guardrails. DKnownAI Guard achieves the highest recall (96.5%) and the highest true negative rate (90.4%) of the four products, meaning it catches the most risky inputs while also passing the largest share of benign ones.

This report presents a comparative evaluation of DKnownAI Guard in AI agent security scenarios, benchmarked against three competing products: AWS Bedrock Guardrails, Azure Content Safety, and Lakera Guard. Using human annotation as the ground truth, we assess each guardrail's ability to detect two categories of risks: threats to the agent itself (e.g., instruction override, indirect injection, tool abuse) and requests intended to elicit harmful content (e.g., hate speech, pornography, violence). Evaluation results demonstrate that DKnownAI Guard achieves the highest recall rate at 96.5% and ranks first in true negative rate (TNR) at 90.4%, delivering the best overall performance among all evaluated guardrails.
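The abstract reports recall and TNR without spelling out how they are computed. As a minimal sketch using the standard definitions, and assuming each guardrail's output is reduced to a binary flag per sample and compared with the human label, the two metrics could be calculated as below. The function name `recall_and_tnr` and the toy labels are hypothetical and are not taken from the report.

```python
from typing import Sequence


def recall_and_tnr(truth: Sequence[bool], flagged: Sequence[bool]) -> tuple[float, float]:
    """Return (recall, TNR) for binary guardrail decisions.

    truth[i]   is True if human annotators labeled sample i as risky.
    flagged[i] is True if the guardrail flagged or blocked sample i.
    """
    if len(truth) != len(flagged):
        raise ValueError("truth and flagged must have the same length")

    pairs = list(zip(truth, flagged))
    tp = sum(1 for t, f in pairs if t and f)          # risky, flagged
    fn = sum(1 for t, f in pairs if t and not f)      # risky, missed
    tn = sum(1 for t, f in pairs if not t and not f)  # benign, passed
    fp = sum(1 for t, f in pairs if not t and f)      # benign, flagged

    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("need at least one risky and one benign sample")

    recall = tp / (tp + fn)  # share of risky samples that were caught
    tnr = tn / (tn + fp)     # share of benign samples that were let through
    return recall, tnr


# Toy labels invented for illustration only (not data from the report).
truth = [True, True, True, True, False, False, False, False, False]
flagged = [True, True, True, False, False, False, False, False, True]

recall, tnr = recall_and_tnr(truth, flagged)
print(f"recall={recall:.3f} TNR={tnr:.3f}")
```

Reporting both numbers matters because a guardrail that flags everything reaches 100% recall with a TNR of 0%. A high TNR alongside high recall indicates that threats are caught without blocking many benign requests.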
