AI | FL | LG | Nov 24, 2025

Extracting Robust Register Automata from Neural Networks over Data Sequences

arXiv:2511.19100v1 | 1 citation
Originality: Highly original
AI Analysis

This work addresses the challenge of interpretability and formal reasoning for black-box neural models in continuous domains, offering a novel method that bridges these areas without requiring white-box access.

The authors tackle the problem of extracting interpretable automata from neural networks over continuous data sequences, which existing methods cannot handle. They develop a framework for robust deterministic register automaton (DRA) extraction that combines a robustness checker, polynomial-time for a fixed number of registers, with passive and active automata learning algorithms. In experiments on recurrent and transformer networks, the framework learns automata reliably and enables principled robustness evaluation.

Automata extraction is a method for synthesising interpretable surrogates for black-box neural models that can be analysed symbolically. Existing techniques assume a finite input alphabet, and thus are not directly applicable to data sequences drawn from continuous domains. We address this challenge with deterministic register automata (DRAs), which extend finite automata with registers that store and compare numeric values. Our main contribution is a framework for robust DRA extraction from black-box models: we develop a polynomial-time robustness checker for DRAs with a fixed number of registers, and combine it with passive and active automata learning algorithms. This combination yields surrogate DRAs with statistical robustness and equivalence guarantees. As a key application, we use the extracted automata to assess the robustness of neural networks: for a given sequence and distance metric, the DRA either certifies local robustness or produces a concrete counterexample. Experiments on recurrent neural networks and transformer architectures show that our framework reliably learns accurate automata and enables principled robustness evaluation. Overall, our results demonstrate that robust DRA extraction effectively bridges neural network interpretability and formal reasoning without requiring white-box access to the underlying network.
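To make the register mechanism concrete, the following is a minimal Python sketch and not the authors' implementation. The class name `DRA`, the guard/update encoding of transitions, the example automaton `increasing`, and the helper `find_counterexample` are all assumptions introduced for illustration. The sampling search is a naive stand-in for the paper's polynomial-time robustness checker: it can produce a concrete counterexample, but finding none certifies nothing.

```python
import random
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional, Sequence, Set, Tuple

# Hypothetical encoding: a transition is (guard, register update, target state).
Guard = Callable[[List[float], float], bool]
Update = Callable[[List[float], float], List[float]]


@dataclass
class DRA:
    """Illustrative deterministic register automaton over real-valued inputs."""
    initial: str
    accepting: Set[str]
    registers: List[float]
    transitions: Dict[str, List[Tuple[Guard, Update, str]]]

    def accepts(self, seq: Sequence[float]) -> bool:
        state, regs = self.initial, list(self.registers)
        for x in seq:
            for guard, update, target in self.transitions.get(state, []):
                if guard(regs, x):  # guards compare the input with register values
                    regs, state = update(regs, x), target
                    break
            else:
                return False  # no enabled transition: reject
        return state in self.accepting


# One register holding the previous value; accepts strictly increasing sequences.
increasing = DRA(
    initial="ok",
    accepting={"ok"},
    registers=[float("-inf")],
    transitions={
        "ok": [
            (lambda r, x: x > r[0], lambda r, x: [x], "ok"),
            (lambda r, x: x <= r[0], lambda r, x: r, "bad"),
        ],
        "bad": [(lambda r, x: True, lambda r, x: r, "bad")],
    },
)


def find_counterexample(
    dra: DRA, seq: Sequence[float], eps: float, trials: int = 200, seed: int = 0
) -> Optional[List[float]]:
    """Sample perturbations within L-infinity distance eps of seq.

    Returns a perturbed sequence whose verdict differs from seq's, or None
    if no sample flips the verdict (which is NOT a robustness certificate).
    """
    rng = random.Random(seed)
    verdict = dra.accepts(seq)
    for _ in range(trials):
        candidate = [x + rng.uniform(-eps, eps) for x in seq]
        if dra.accepts(candidate) != verdict:
            return candidate
    return None
```

For instance, `increasing.accepts([1.0, 2.0, 3.0])` is true and `increasing.accepts([1.0, 3.0, 2.0])` is false. With `eps = 0.1`, the sequence `[1.0, 1.05, 2.0]` admits a counterexample because the first two values can swap order, while no perturbation of `[1.0, 2.0, 3.0]` within that radius can change the verdict.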

