An Adversarial Robustness Perspective on the Topology of Neural Networks
This work addresses adversarial vulnerability in neural networks, a concern for security-critical applications, and offers a detection method based on topological analysis. The contribution is incremental in that it builds on existing robustness research.
The paper tackles adversarial robustness in neural networks by analyzing their topology. It finds that graphs from clean inputs are centralized around highway edges, whereas graphs from adversarial inputs are more diffuse and leverage under-optimized edges. Experiments across datasets and architectures show that these under-optimized edges can be used to detect adversarial inputs.
In this paper, we investigate the impact of neural network (NN) topology on adversarial robustness. Specifically, we study the graph produced when an input traverses all the layers of an NN, and show that such graphs differ for clean and adversarial inputs. We find that graphs from clean inputs are more centralized around highway edges, whereas those from adversarial inputs are more diffuse, leveraging under-optimized edges. Through experiments on a variety of datasets and architectures, we show that these under-optimized edges are a source of adversarial vulnerability and that they can be used to detect adversarial inputs.
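
To make the notion of an input-induced graph concrete, the following is a minimal sketch and not the construction used in the paper. It assumes a bias-free fully connected ReLU network, assumes that the edge from unit i to unit j carries the value |w_ji * a_i| for the current input, and uses the share of total edge mass held by the k largest edges as a crude stand-in for "centralized" versus "diffuse". The names `induced_graph` and `top_k_mass` are hypothetical.

```python
import numpy as np


def induced_graph(weights, x):
    """Return one edge-value matrix per layer for a single input x.

    Assumption (not from the paper): edge (i -> j) in a layer is scored as
    |W[j, i] * a[i]|, where a is the activation vector entering that layer.
    """
    edges = []
    a = np.asarray(x, dtype=float)
    for W in weights:
        # Broadcast the incoming activations over the rows of W: shape (out, in).
        edges.append(np.abs(W * a[np.newaxis, :]))
        # Bias-free ReLU layer, applied to every layer for simplicity.
        a = np.maximum(W @ a, 0.0)
    return edges


def top_k_mass(edges, k):
    """Fraction of total edge mass carried by the k largest edges."""
    flat = np.concatenate([e.ravel() for e in edges])
    total = flat.sum()
    if total == 0.0:
        return 0.0
    return float(np.sort(flat)[-k:].sum() / total)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Hypothetical 8 -> 16 -> 4 network with random weights.
    weights = [rng.normal(size=(16, 8)), rng.normal(size=(4, 16))]
    x = rng.normal(size=8)
    graph = induced_graph(weights, x)
    print("share of mass in the 10 largest edges:", top_k_mass(graph, 10))
```

Under these assumptions, a higher top-k share corresponds to a graph concentrated on a few strong edges, and a lower share to a more diffuse one. Comparing this statistic between clean inputs and their adversarial counterparts is one simple way to probe the contrast described above.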