LG CR CV, May 31, 2023

Graph-based methods coupled with specific distributional distances for adversarial attack detection

arXiv:2306.00042v2
Originality: Incremental advance
AI Analysis

This work addresses the security issue of adversarial attacks for neural network users, but it is incremental as it builds on existing detection and graph-based methods.

The paper tackles the detection of adversarial attacks on neural networks with a graph-based method. It uses layer-wise relevance propagation to construct a sparse graph for each input image, compares quantities computed from that graph against those computed from the training data using Wasserstein distance and logistic regression, and reports strong detection results.

Artificial neural networks are prone to being fooled by carefully perturbed inputs that cause egregious misclassifications. These adversarial attacks have been the focus of extensive research, as have methods to detect and defend against them. We introduce a novel approach to the detection and interpretation of adversarial attacks from a graph perspective. For an input image, we compute an associated sparse graph using the layer-wise relevance propagation algorithm (Bach et al., 2015). Specifically, we keep only the edges of the neural network with the highest relevance values. Three quantities are then computed from the graph and compared against those computed from the training set. The result of the comparison is a classification of the image as benign or adversarial. To make the comparison, two classification methods are introduced: 1) an explicit formula based on the Wasserstein distance applied to the node degrees and 2) a logistic regression. Both classification methods produce strong results, which leads us to believe that a graph-based interpretation of adversarial attacks is valuable.
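The pipeline described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the relevance matrix is assumed to have been produced already by some layer-wise relevance propagation routine, the function names (`sparsify_by_relevance`, `node_degrees`, `wasserstein_1d`, `classify`) and the `keep_fraction` parameter are hypothetical, and the nearest-reference decision rule is an assumed stand-in for the paper's explicit formula, which the abstract does not state.

```python
import numpy as np

def sparsify_by_relevance(relevance, keep_fraction=0.1):
    """Boolean mask of the edges with the highest relevance values.

    `relevance` holds one relevance score per edge between two layers
    (rows: output nodes, columns: input nodes). Ties at the threshold
    may keep slightly more than the requested fraction.
    """
    flat = relevance.ravel()
    k = max(1, int(round(keep_fraction * flat.size)))
    # The k-th largest score serves as the cut-off.
    threshold = np.partition(flat, flat.size - k)[flat.size - k]
    return relevance >= threshold

def node_degrees(mask):
    """Degree of every node in the bipartite graph defined by `mask`."""
    return np.concatenate([mask.sum(axis=1), mask.sum(axis=0)])

def wasserstein_1d(u, v):
    """1-Wasserstein distance between two empirical 1D samples."""
    u = np.sort(np.asarray(u, dtype=float))
    v = np.sort(np.asarray(v, dtype=float))
    all_values = np.sort(np.concatenate([u, v]))
    deltas = np.diff(all_values)
    # Empirical CDFs evaluated on the merged support.
    u_cdf = np.searchsorted(u, all_values[:-1], side="right") / u.size
    v_cdf = np.searchsorted(v, all_values[:-1], side="right") / v.size
    return float(np.sum(np.abs(u_cdf - v_cdf) * deltas))

def classify(degrees, benign_ref, adversarial_ref):
    """Assumed rule: pick the reference degree sample that is closer."""
    d_benign = wasserstein_1d(degrees, benign_ref)
    d_adv = wasserstein_1d(degrees, adversarial_ref)
    return "adversarial" if d_adv < d_benign else "benign"

# Toy example with made-up relevance scores for a 3-input, 2-output layer.
relevance = np.array([[0.9, 0.1, 0.0],
                      [0.2, 0.8, 0.05]])
mask = sparsify_by_relevance(relevance, keep_fraction=0.34)
degrees = node_degrees(mask)
label = classify(degrees, benign_ref=[1, 1, 1, 0, 0],
                 adversarial_ref=[5, 5, 4, 4, 3])
```

In this sketch the reference degree samples would come from graphs built on the training set. The logistic-regression variant mentioned in the abstract could instead take the graph-derived quantities as input features, though the exact features are not specified here.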

Code Implementations: 1 repo