Governed Reasoning for Institutional AI
This work introduces a governance-first AI architecture for high-stakes institutional decisions such as regulatory compliance and clinical triage. On its benchmark the architecture produced no silent errors, a critical failure mode in these settings.
The paper proposes Cognitive Core, a governed AI architecture for institutional decisions that uses typed cognitive primitives and a four-tier governance model with a tamper-evident audit ledger. On an 11-case prior authorization appeal benchmark, Cognitive Core achieves 91% accuracy vs. 55% (ReAct) and 45% (Plan-and-Solve), and produces zero silent errors vs. 5-6 for the baselines.
Institutional decisions -- regulatory compliance, clinical triage, prior authorization appeal -- require a different AI architecture than general-purpose agents provide. Agent frameworks infer authority conversationally, reconstruct accountability from logs, and produce silent errors: incorrect determinations that execute without any human review signal. We propose Cognitive Core: a governed decision substrate built from nine typed cognitive primitives (retrieve, classify, investigate, verify, challenge, reflect, deliberate, govern, generate), a four-tier governance model where human review is a condition of execution rather than a post-hoc check, a tamper-evident SHA-256 hash-chain audit ledger endogenous to computation, and a demand-driven delegation architecture supporting both declared and autonomously reasoned epistemic sequences. We benchmark three systems on an 11-case balanced prior authorization appeal evaluation set. Cognitive Core achieves 91% accuracy against 55% (ReAct) and 45% (Plan-and-Solve). The governance result is more significant: Cognitive Core produced zero silent errors while both baselines produced 5-6. We introduce governability -- how reliably a system knows when it should not act autonomously -- as a primary evaluation axis for institutional AI alongside accuracy. The baselines are implemented as prompts, representing the realistic deployment alternative to a governed framework. A configuration-driven domain model means deploying a new institutional decision domain requires YAML configuration, not engineering capacity.