RO · May 27

ICAN-Deploy: Identity-Stable Canary Deployment for Safety-Critical Embodied Agents

arXiv:2605.28097
AI Analysis

For developers of safety-critical embodied agents (e.g., LLM-driven robots), ICAN-Deploy lets a certified agent ship capability updates without re-certification, removing a key bottleneck in continuous deployment.

ICAN-Deploy solves the identity drift problem in canary deployments for safety-critical embodied agents by maintaining a cryptographic identity hash invariant across the canary window, enabling zero-drift deployment with entry latency of 1.52–2.01 ms (95% BCa CI) over 100 real canary cycles on a Franka Panda arm in MuJoCo.

Canary deployment routes a fraction of traffic to a new software version, monitors metrics, and rolls back on regression. Mainstream controllers (Argo Rollouts, Spinnaker, Flagger) change the deployed system's cryptographic identity during the canary window. This drift is harmless for stateless microservices, but for safety-critical embodied agents it breaks the claim that "the agent you certified is still the agent you have", forcing re-certification on every canary. We present ICAN-Deploy (Identity-stable CANary Deployment), a middleware construction whose state machine holds the identity hash invariant across the canary window by separating capability names (frozen, hashed) from capability versions (mutable runtime state). We implement ICAN-Deploy inside a runtime governance layer for LLM-driven robots and verify invariance by closed-form proof, AST lint, and TLA+ model-checking, then corroborate it over N=100 real canary cycles on a Franka Panda arm in MuJoCo (zero drift; entry latency 95% BCa CI [1.52, 2.01] ms). A feature-flagged strawman that folds versions into the manifest violates the invariant on the same workload. A system certified once at identity-creation time can then ship arbitrary capability evolution under that same certification, within the version-and-name envelope.
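The name/version separation described above can be illustrated with a minimal Python sketch. This is not the paper's implementation: the `Agent` class, the `identity_hash` and `strawman_hash` helpers, the SHA-256 canonicalization, and the capability names `grasp` and `navigate` are all assumptions made for illustration. The sketch hashes only the frozen set of capability names, keeps versions as mutable runtime state, and contrasts this with a strawman that folds versions into the hashed manifest.

```python
import hashlib
import json
from dataclasses import dataclass, field


def identity_hash(capability_names):
    """Hash only the frozen capability names (order-independent)."""
    canonical = json.dumps(sorted(capability_names), separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def strawman_hash(versions):
    """Strawman: fold name -> version pairs into the hashed manifest."""
    canonical = json.dumps(sorted(versions.items()), separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


@dataclass
class Agent:
    """Hypothetical agent: names are frozen, versions are runtime state."""
    names: frozenset
    versions: dict = field(default_factory=dict)
    canary: dict = field(default_factory=dict)

    def identity(self):
        # Versions never enter the hash, so canary transitions cannot change it.
        return identity_hash(self.names)

    def enter_canary(self, name, new_version):
        if name not in self.names:
            raise KeyError(f"unknown capability: {name}")
        self.canary[name] = new_version

    def promote(self, name):
        self.versions[name] = self.canary.pop(name)

    def rollback(self, name):
        self.canary.pop(name, None)

    def active_versions(self):
        # Canary versions shadow stable ones while the window is open.
        return {**self.versions, **self.canary}


agent = Agent(
    names=frozenset({"grasp", "navigate"}),
    versions={"grasp": "1.0.0", "navigate": "2.3.1"},
)

before = agent.identity()
straw_before = strawman_hash(agent.active_versions())

agent.enter_canary("grasp", "1.1.0")
during = agent.identity()
straw_during = strawman_hash(agent.active_versions())

agent.promote("grasp")
after = agent.identity()

print("identity stable:", before == during == after)
print("strawman drifted:", straw_before != straw_during)
```

In this sketch the identity hash is equal before, during, and after the canary window, while the strawman hash changes as soon as the canary version is introduced. Adding or removing a capability name would change the identity, which matches the "version-and-name envelope" caveat in the abstract.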
