SECRLGOct 27, 2025

Floating-Point Neural Network Verification at the Software Level

arXiv:2510.23389v12 citationsh-index: 10
Originality Incremental advance
AI Analysis

This work addresses the need for safety-critical neural network deployment by providing a benchmark and evaluation for software-level verification, though it is incremental as it builds on existing verification tools and formats.

The authors tackled the problem of verifying neural network correctness at the software level by explicitly reasoning about floating-point implementations, constructing NeuroCodeBench 2.0 with 912 verification examples. Their evaluation showed that existing automated verifiers correctly solved an average of 11% of the benchmark while producing around 3% incorrect verdicts.

The behaviour of neural network components must be proven correct before deployment in safety-critical systems. Unfortunately, existing neural network verification techniques cannot certify the absence of faults at the software level. In this paper, we show how to specify and verify that neural networks are safe, by explicitly reasoning about their floating-point implementation. In doing so, we construct NeuroCodeBench 2.0, a benchmark comprising 912 neural network verification examples that cover activation functions, common layers, and full neural networks of up to 170K parameters. Our verification suite is written in plain C and is compatible with the format of the International Competition on Software Verification (SV-COMP). Thanks to it, we can conduct the first rigorous evaluation of eight state-of-the-art software verifiers on neural network code. The results show that existing automated verification tools can correctly solve an average of 11% of our benchmark, while producing around 3% incorrect verdicts. At the same time, a historical analysis reveals that the release of our benchmark has already had a significantly positive impact on the latter.

Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.

Your Notes