AIApr 7

Auditable Agents

arXiv:2604.0548599.01 citationsh-index: 6Has Code
Predicted impact top 2% in AI · last 90 daysOriginality Incremental advance
AI Analysis

This addresses the need for accountability in deployed AI agents, which is crucial for safety and compliance, though it is incremental in proposing a framework rather than a novel method.

The paper tackles the problem of ensuring accountability in LLM agent systems by defining auditability as a necessary system property, supported by evidence including 617 security findings in open-source projects, 8.3 ms median overhead for mediation, and partial recovery of responsibility information.

LLM agents call tools, query databases, delegate tasks, and trigger external side effects. Once an agent system can act in the world, the question is no longer only whether harmful actions can be prevented--it is whether those actions remain answerable after deployment. We distinguish accountability (the ability to determine compliance and assign responsibility), auditability (the system property that makes accountability possible), and auditing (the process of reconstructing behavior from trustworthy evidence). Our claim is direct: no agent system can be accountable without auditability. To make this operational, we define five dimensions of agent auditability, i.e., action recoverability, lifecycle coverage, policy checkability, responsibility attribution, and evidence integrity, and identify three mechanism classes (detect, enforce, recover) whose temporal information-and-intervention constraints explain why, in practice, no single approach suffices. We support the position with layered evidence rather than a single benchmark: lower-bound ecosystem measurements suggest that even basic security prerequisites for auditability are widely unmet (617 security findings across six prominent open-source projects); runtime feasibility results show that pre-execution mediation with tamper-evident records adds only 8.3 ms median overhead; and controlled recovery experiments show that responsibility-relevant information can be partially recovered even when conventional logs are missing. We propose an Auditability Card for agent systems and identify six open research problems organized by mechanism class.

Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.

Your Notes