A Symbolic Neural Network Representation and its Application to Understanding, Verifying, and Patching Networks
This work addresses the challenge of understanding, verifying, and patching neural networks, particularly in safety-critical domains such as aircraft collision avoidance. The contribution appears incremental, since it builds on existing symbolic methods for network analysis.
The authors tackle the problem of analyzing and manipulating trained neural networks by proposing a symbolic representation for piecewise-linear networks, which reduces the analysis of a network to the analysis of a finite set of affine functions. They demonstrate three applications: computing weakest preconditions to visualize the advisories of an aircraft collision avoidance network, computing strongest postconditions for bounded model checking of neural network controllers, and patching a network to correct user-specified behavior.
Analysis and manipulation of trained neural networks is a challenging and important problem. We propose a symbolic representation for piecewise-linear neural networks and discuss its efficient computation. With this representation, one can translate the problem of analyzing a complex neural network into that of analyzing a finite set of affine functions. We demonstrate the use of this representation for three applications. First, we apply the symbolic representation to computing weakest preconditions on network inputs, which we use to exactly visualize the advisories made by a network meant to operate an aircraft collision avoidance system. Second, we use the symbolic representation to compute strongest postconditions on the network outputs, which we use to perform bounded model checking on standard neural network controllers. Finally, we show how the symbolic representation can be combined with a new form of neural network to perform patching; i.e., correct user-specified behavior of the network.
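To make the idea of reducing a piecewise-linear network to a finite set of affine functions concrete, the following is a minimal sketch and not the authors' algorithm. It assumes a fully connected ReLU network with no activation on the output layer, restricted to a one-dimensional line segment in input space. The names `relu_line_partition` and `forward` are hypothetical helpers introduced only for this illustration. The sketch finds every point on the segment where some ReLU pre-activation changes sign; between consecutive points the network is exactly affine.

```python
import numpy as np


def forward(weights, biases, x):
    """Plain forward pass: ReLU after every layer except the last."""
    x = np.asarray(x, dtype=float)
    for i, (W, c) in enumerate(zip(weights, biases)):
        x = W @ x + c
        if i < len(weights) - 1:
            x = np.maximum(x, 0.0)
    return x


def relu_line_partition(weights, biases, a, b):
    """Split the segment a -> b into pieces on which the network is affine.

    Returns (ts, outs): sorted ratios t in [0, 1] with point(t) = a + t*(b - a),
    and the network output at each of those points. Between two consecutive
    ratios the network is affine, so the outputs can be linearly interpolated.
    """
    ts = [0.0, 1.0]
    pts = [np.asarray(a, dtype=float), np.asarray(b, dtype=float)]
    for i, (W, c) in enumerate(zip(weights, biases)):
        # An affine layer keeps every piece affine: map the endpoints only.
        pts = [W @ p + c for p in pts]
        if i == len(weights) - 1:
            break  # assumed: no ReLU on the output layer
        new_ts, new_pts = [ts[0]], [pts[0]]
        for t0, t1, p0, p1 in zip(ts, ts[1:], pts, pts[1:]):
            # A coordinate whose sign differs at the two endpoints crosses
            # zero exactly once inside this piece; split the piece there.
            crossings = {
                p0[k] / (p0[k] - p1[k])
                for k in range(len(p0))
                if p0[k] * p1[k] < 0
            }
            for alpha in sorted(crossings):
                new_ts.append(t0 + alpha * (t1 - t0))
                new_pts.append(p0 + alpha * (p1 - p0))
            new_ts.append(t1)
            new_pts.append(p1)
        # No coordinate changes sign within a piece now, so ReLU is linear
        # on each piece and can be applied to the endpoints directly.
        ts = new_ts
        pts = [np.maximum(p, 0.0) for p in new_pts]
    return ts, pts


# Toy network computing |x| = relu(x) + relu(-x) on the segment [-1, 1].
abs_weights = [np.array([[1.0], [-1.0]]), np.array([[1.0, 1.0]])]
abs_biases = [np.zeros(2), np.zeros(1)]
abs_ts, abs_outs = relu_line_partition(abs_weights, abs_biases, [-1.0], [1.0])
```

For the toy network, the segment splits at its midpoint, which is where `|x|` has its kink. Questions about the network along the segment, such as which inputs map into a given output set, can then be answered piece by piece with ordinary linear reasoning. Restricting to a line is a simplifying assumption of this sketch; higher-dimensional input regions need polytope-valued pieces rather than intervals.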