CR LG · May 29

Bit-Exact AI Inference Verification Without Performance Tradeoffs

arXiv:2606.002798.1

Predicted impact top 56% in CR · last 90 days · Originality Incremental advance

AI Analysis

For AI governance auditors, bit-exact verification makes it possible to monitor covert adversaries without settling for approximate output matches.

The paper shows that bit-exact verification of AI inference is possible without performance tradeoffs by exploiting deterministic but non-invariant outputs from modern inference engines, enabling auditable signatures of software and hardware setups.

Verifying claims about AI workloads is a prerequisite for credible AI governance of covert adversaries (who comply with monitoring only when detection likelihood is high), yet the apparent non-determinism of GPU floating-point arithmetic forces auditors to accept approximate output matches. Covert adversaries can exploit unverifiable degrees of freedom in monitored computation. Attack vectors include steganography, unreported modification of inference software, and covert computation via unreported batch elements. Empirically, we analyze how modern inference engines (vLLM, HF transformers) produce deterministic but non-invariant outputs, without needing to set performance-compromising determinism flags, if the right information is available for re-computation and no atomic functions are called in the backend. We demonstrate that such bitwise-precise re-computation does not require access to identical hardware, via a software-only emulation of LLM inference across multiple NVIDIA GPU variants. Thus, accumulated rounding errors can be an auditable signature of the software and hardware setup used for inference, instead of a constraint on verifiability.
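The idea of "deterministic but non-invariant" outputs can be illustrated with a toy reduction. The sketch below is not the paper's method or any inference engine's code; `chunked_sum` and `signature` are hypothetical helpers, and the chunk size merely stands in for a configuration detail (such as kernel tiling or batch composition) that changes the order of floating-point additions. It uses ordinary Python doubles rather than GPU arithmetic.

```python
import hashlib
import struct


def chunked_sum(values, chunk_size):
    """Sum left to right inside each chunk, then add the partial sums left to right.

    Hypothetical stand-in for a reduction whose addition order depends on
    configuration (here: chunk_size).
    """
    partials = []
    for start in range(0, len(values), chunk_size):
        acc = 0.0
        for v in values[start:start + chunk_size]:
            acc += v
        partials.append(acc)
    total = 0.0
    for p in partials:
        total += p
    return total


def signature(x):
    """Hash the exact bit pattern of a double (hypothetical 'audit signature')."""
    return hashlib.sha256(struct.pack("<d", x)).hexdigest()


# 1.0 is far below the spacing of doubles near 1e17, so it is absorbed
# whenever it is added directly to +/-1e17.
values = [1e17, 1.0, -1e17, 1.0]

sequential = chunked_sum(values, 4)  # ((1e17 + 1) - 1e17) + 1 -> 1.0
tiled = chunked_sum(values, 2)       # (1e17 + 1) + (-1e17 + 1) -> 0.0

# Deterministic: repeating the same configuration reproduces the same bits.
print(signature(tiled) == signature(chunked_sum(values, 2)))
# Non-invariant: a different configuration yields different bits.
print(signature(tiled) == signature(sequential))
```

Under these assumptions, a verifier who knows the configuration can reproduce the result bit for bit, while an unreported change in configuration shows up as a mismatched hash. That is the sense in which rounding error can act as a signature rather than as noise.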
