CE · May 23

No Certificate, No Execution: Certified Traces as a Foundation for Trustworthy AI Agents

arXiv:2605.24462 · 57.5
Predicted impact: top 14% in CE · last 90 days
Originality: Highly original
AI Analysis

For developers and deployers of AI agents in high-stakes domains, this paper provides a formal framework for ensuring that actions are permissible before they execute, addressing a trust gap that output-level guardrails and post-hoc audits do not cover.

The paper proposes a Proposal-Certification-Execution (PCE) architecture for trustworthy AI agents, in which execution occurs only for traces certified as permissible under a policy system, a principle stated as 'no certificate, no execution'. It connects this to proof-carrying execution and zero-knowledge certificates, and proposes evaluating agents by which generated traces can be safely certified for execution, not by output accuracy alone.

We argue that trustworthy AI agents, especially in high-stakes and policy-governed domains, should make execution conditional on certified traces rather than rely only on stronger generative models, output-level guardrails, or post-hoc audits. A generative agent may propose recommendations, tool calls, reports, or actions, but generation is not permission: an action may be computable yet impermissible, and individually permissible actions may compose into an impermissible trace. We formalize trustworthy agency through a \textbf{Proposal--Certification--Execution (PCE)} architecture: a probabilistic generating machine $M_G$ proposes candidate execution traces, a \textbf{Permissibility Machine} $M_Π$ certifies proposed traces under a policy system $Π$, and execution proceeds only for certified traces. The executable trace language is $L_{\mathrm{exec}} = L_G \cap L_{\mathrm{cert}}(M_Π)$. Before execution, a trace is a structured pre-execution record submitted for certification: it specifies intended steps, evidence, proposed tool calls, approvals, replayable computations, credentials, and execution conditions. This perspective complements chain-of-thought monitorability: visible reasoning may help detect misbehavior, but monitorability is not certifiability, and reasoning is only one component of a broader execution trace. The formal principle is simple: an agent-generated trace should execute only when it carries a checkable certificate witnessing permissibility under $Π$: \textbf{no certificate, no execution}. We develop certified traces and Permissibility Machines as foundations for trustworthy AI agents, connect trace certification to proof-carrying execution, proof memory, privacy, and zero-knowledge certificates, and propose evaluating agents by what generated traces can be safely certified for execution, not by output accuracy alone.
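
The Proposal--Certification--Execution loop described in the abstract can be illustrated with a minimal Python sketch. Everything named here is a hypothetical stand-in rather than the paper's construction: `Step`, `Certificate`, `certify`, `execute`, the two policy rules, and the `@internal.example` domain are assumptions chosen for illustration, and the "certificate" is simply a record of which rules passed, not a cryptographic or proof-carrying object. The sketch shows two points from the abstract: individually permissible steps can compose into an impermissible trace, and execution is refused when no matching certificate is presented.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional, Tuple


@dataclass(frozen=True)
class Step:
    """One proposed tool call in a pre-execution trace (hypothetical shape)."""
    tool: str
    argument: str


Trace = Tuple[Step, ...]


@dataclass(frozen=True)
class Certificate:
    """Toy certificate: the trace it covers and the rule names it satisfied."""
    trace: Trace
    satisfied_rules: Tuple[str, ...]


def only_known_tools(trace: Trace) -> bool:
    # Step-level rule: every tool must be on an allow-list.
    return all(step.tool in {"read_record", "send_email"} for step in trace)


def no_external_email_after_read(trace: Trace) -> bool:
    # Trace-level rule: reading a record and then emailing an external
    # address is forbidden, even though each step alone is allowed.
    seen_read = False
    for step in trace:
        if step.tool == "read_record":
            seen_read = True
        elif step.tool == "send_email" and seen_read:
            if not step.argument.endswith("@internal.example"):
                return False
    return True


# Stand-in for the policy system: named predicates over whole traces.
POLICY: Dict[str, Callable[[Trace], bool]] = {
    "only_known_tools": only_known_tools,
    "no_external_email_after_read": no_external_email_after_read,
}


def certify(trace: Trace) -> Optional[Certificate]:
    """Stand-in for the Permissibility Machine: certify or return None."""
    if all(rule(trace) for rule in POLICY.values()):
        return Certificate(trace, tuple(POLICY))
    return None


def execute(trace: Trace, certificate: Optional[Certificate],
            run_step: Callable[[Step], str]) -> List[str]:
    """Run the trace only if the certificate covers exactly this trace."""
    if certificate is None or certificate.trace != trace:
        raise PermissionError("no certificate, no execution")
    return [run_step(step) for step in trace]


read = Step("read_record", "patient-7")
internal = Step("send_email", "auditor@internal.example")
external = Step("send_email", "someone@outside.example")

# Each step is certifiable on its own, but read-then-external-email is not.
print(certify((read,)) is not None, certify((external,)) is not None)
print(certify((read, external)) is None)
print(execute((read, internal), certify((read, internal)), lambda s: s.tool))
```

In this toy setting the certifier is rerun by the same process that executes the trace; a proof-carrying design of the kind the abstract alludes to would instead let the executor check a certificate produced elsewhere, which this sketch does not attempt.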

