CY · Apr 29

Decision Evidence Maturity Model for Agentic AI: A Property-Level Method Specification

arXiv:2605.04093 · 44.6 · h-index: 1 · Has Code

AI Analysis

This work addresses the need for auditable decision evidence in agentic AI systems, but it is an incremental methodological specification without empirical validation of real-world impact.

The paper identifies the 'container fallacy' in agentic AI, in which the presence of evidence containers is treated as audit sufficiency even though the containers lack the property-level detail needed to answer governance queries, and proposes the Decision Evidence Maturity Model (DEMM). DEMM classifies evidence sufficiency into four categories plus a 'conflicting' category and aggregates per-property verdicts into a five-level capability rubric. A feasibility exercise on 140 synthetic scenarios and three public incidents yields completeness ranging from 53.6% to 100%, which the authors describe as implementation behaviour rather than external validation.

Agentic AI systems produce decision evidence at scale through execution telemetry, but property-level reconstruction often fails when an external party asks a specific governance question about a specific decision: the assembled evidence is insufficient to answer it. We name this pattern the container fallacy: the automatic equation of evidence-container presence with audit sufficiency. This paper specifies the Decision Evidence Maturity Model (DEMM), a property-level reconstructability method for agentic decisions. DEMM classifies evidence sufficiency into four executable categories plus a protocol-level "conflicting" category and aggregates per-property verdicts into a five-level capability rubric anchored to the established maturity-model lineage. The open-source Decision Trace Reconstructor ships ten executable adapter-fallback classes spanning vendor SDKs, protocol traces, public-postmortem prose, and generic JSONL records. A reproducible feasibility exercise runs the protocol on 140 synthetic scenarios plus three public incidents; the resulting completeness range (53.6% to 100%) is implementation behaviour, not external validation.
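The aggregation step described in the abstract (per-property verdicts rolled up into a completeness figure and a five-level rating) can be illustrated with a minimal sketch. Everything concrete here is an assumption made for illustration: the abstract does not name the four categories, so the labels `SUFFICIENT`, `PARTIAL`, `INFERRED`, and `MISSING` are hypothetical, as are the property names, the definition of completeness as the share of `SUFFICIENT` verdicts, the level thresholds, and the rule that a `CONFLICTING` verdict forces the lowest level. This is not the Decision Trace Reconstructor's API.

```python
from enum import Enum
from typing import Dict


class Verdict(Enum):
    # Hypothetical labels: the abstract states only that there are four
    # categories plus a "conflicting" one, not what they are called.
    SUFFICIENT = "sufficient"
    PARTIAL = "partial"
    INFERRED = "inferred"
    MISSING = "missing"
    CONFLICTING = "conflicting"


def completeness(verdicts: Dict[str, Verdict]) -> float:
    """Share of properties judged SUFFICIENT (assumed definition)."""
    if not verdicts:
        return 0.0
    sufficient = sum(1 for v in verdicts.values() if v is Verdict.SUFFICIENT)
    return sufficient / len(verdicts)


def capability_level(verdicts: Dict[str, Verdict]) -> int:
    """Map per-property verdicts to a level from 1 to 5.

    The thresholds and the CONFLICTING rule are illustrative assumptions,
    not the rubric defined in the paper.
    """
    if any(v is Verdict.CONFLICTING for v in verdicts.values()):
        return 1
    score = completeness(verdicts)
    for threshold, level in ((1.0, 5), (0.75, 4), (0.5, 3), (0.25, 2)):
        if score >= threshold:
            return level
    return 1


# Hypothetical property names for a single agentic decision.
example = {
    "actor": Verdict.SUFFICIENT,
    "inputs": Verdict.SUFFICIENT,
    "policy": Verdict.PARTIAL,
    "outcome": Verdict.MISSING,
}
print(completeness(example), capability_level(example))  # prints: 0.5 3
```

The point of the sketch is the shape of the computation: a container (here, the `example` dict) can be fully present while only half of its properties are answerable, which is the gap the container fallacy names.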
