CR · AI · Apr 7

ClawLess: A Security Model of AI Agents

arXiv:2604.06284 · 69.51 citations
Predicted impact: top 21% in CR · last 90 days
Originality: Highly original
AI Analysis

The paper addresses security vulnerabilities in AI agents, a concern for both developers and users, and offers a novel approach that goes beyond incremental training or prompting methods.

The paper tackles the security risks of autonomous AI agents by introducing ClawLess, a framework that enforces formally verified policies to protect against adversarial agents, achieving practical enforcement through a user-space kernel with BPF-based syscall interception.

Autonomous AI agents powered by Large Language Models can reason, plan, and execute complex tasks, but their ability to autonomously retrieve information and run code introduces significant security risks. Existing approaches attempt to regulate agent behavior through training or prompting, which does not offer fundamental security guarantees. We present ClawLess, a security framework that enforces formally verified policies on AI agents under a worst-case threat model where the agent itself may be adversarial. ClawLess formalizes a fine-grained security model over system entities, trust scopes, and permissions to express dynamic policies that adapt to agents' runtime behavior. These policies are translated into concrete security rules and enforced through a user-space kernel augmented with BPF-based syscall interception. This approach bridges the formal security model with practical enforcement, ensuring security regardless of the agent's internal design.
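
The abstract describes policies over system entities, trust scopes, and permissions that adapt to the agent's runtime behavior. A minimal sketch of that idea follows. It is not ClawLess's actual model or API: the names `Perm`, `Policy`, `check`, and `make_policy`, the `file:`/`net:` entity prefixes, and the "tainted" scope are all hypothetical, chosen only for illustration. It models the decision logic alone; in the paper, enforcement happens at the syscall level through a user-space kernel with BPF-based interception.

```python
from __future__ import annotations

from dataclasses import dataclass
from enum import Flag, auto


class Perm(Flag):
    """Hypothetical permission bits; not taken from the paper."""
    READ = auto()
    WRITE = auto()
    EXEC = auto()


@dataclass
class Policy:
    # rules[scope][entity_prefix] -> permissions granted under that trust scope
    rules: dict[str, dict[str, Perm]]
    scope: str = "trusted"

    def _granted(self, entity: str, perm: Perm) -> bool:
        grants = self.rules.get(self.scope, {})
        # The longest matching entity prefix decides; no match means deny.
        best = max((p for p in grants if entity.startswith(p)), key=len, default=None)
        return best is not None and perm in grants[best]

    def check(self, entity: str, perm: Perm) -> bool:
        """Decide one access, then update the trust scope (the dynamic part)."""
        if not self._granted(entity, perm):
            return False
        # Assumed rule for illustration: once the agent has read network
        # content, it is treated as possibly compromised and loses write access.
        if entity.startswith("net:") and perm == Perm.READ:
            self.scope = "tainted"
        return True


def make_policy() -> Policy:
    """Build a toy policy with two trust scopes (illustrative values only)."""
    return Policy(rules={
        "trusted": {"file:/workspace/": Perm.READ | Perm.WRITE, "net:": Perm.READ},
        "tainted": {"file:/workspace/": Perm.READ},
    })
```

In this sketch, a write to `file:/workspace/notes.txt` is allowed until the agent reads from any `net:` entity, after which the same write is denied. The check depends only on observed accesses, not on the agent's prompt or training, which is the property the abstract emphasizes.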

