CR · LG · Feb 26, 2020

Defending against Backdoor Attack on Deep Neural Networks

arXiv:2002.12162v3 · 58 citations
AI Analysis

This work addresses security vulnerabilities in DNNs for computer vision applications, offering a defense mechanism against data poisoning attacks.

The paper tackles the problem of backdoor attacks on deep neural networks by analyzing their effects on neuron activation and proposing an ℓ∞-based neuron pruning method, which reduces attack success rates while maintaining high classification accuracy on clean images.

Although deep neural networks (DNNs) have achieved great success in various computer vision tasks, they were recently found to be vulnerable to adversarial attacks. In this paper, we focus on the so-called \textit{backdoor attack}, which injects a backdoor trigger into a small portion of the training data (also known as data poisoning) such that the trained DNN misclassifies examples that carry this trigger. Specifically, we carefully study the effect of both real and synthetic backdoor attacks on the internal response of vanilla and backdoored DNNs through the lens of Grad-CAM. Moreover, we show that the backdoor attack induces a significant bias in neuron activation in terms of the $\ell_\infty$ norm of an activation map, compared to its $\ell_1$ and $\ell_2$ norms. Motivated by these results, we propose \textit{$\ell_\infty$-based neuron pruning} to remove the backdoor from the backdoored DNN. Experiments show that our method effectively decreases the attack success rate while maintaining high classification accuracy on clean images.
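
The idea of scoring neurons by the $\ell_\infty$ norm of their activation maps can be illustrated with a small NumPy sketch. This is not the paper's implementation: the abstract does not state which layer is pruned, which inputs are used for scoring, or how many neurons are removed. The sketch assumes that activations of shape (samples, channels, height, width) are available for both clean and trigger-carrying inputs, and that the channels whose $\ell_\infty$ norm grows most under the trigger are the ones to zero out. The names `channel_norms`, `linf_prune_mask`, and `num_prune` are hypothetical, and the data is synthetic.

```python
import numpy as np

def channel_norms(acts, order):
    """Per-channel norm of each activation map, averaged over samples.

    acts: array of shape (N, C, H, W); order: 1, 2, or np.inf.
    """
    n, c = acts.shape[:2]
    flat = np.abs(acts.reshape(n, c, -1))
    if order == 1:
        per_sample = flat.sum(axis=2)
    elif order == 2:
        per_sample = np.sqrt((flat ** 2).sum(axis=2))
    elif order == np.inf:
        per_sample = flat.max(axis=2)
    else:
        raise ValueError("order must be 1, 2, or np.inf")
    return per_sample.mean(axis=0)

def linf_prune_mask(clean_acts, triggered_acts, num_prune):
    """Boolean keep-mask over channels (False = pruned).

    Assumption (not taken from the paper): prune the channels whose
    l_inf norm increases most when the trigger is present.
    """
    gap = channel_norms(triggered_acts, np.inf) - channel_norms(clean_acts, np.inf)
    mask = np.ones(gap.size, dtype=bool)
    mask[np.argsort(gap)[::-1][:num_prune]] = False
    return mask

# Synthetic toy data: 2 samples, 4 channels, 8x8 maps with a flat response.
clean = np.full((2, 4, 8, 8), 0.1)
# Hypothetical trigger effect: one sharp, localized spike in channel 2.
triggered = clean.copy()
triggered[:, 2, 0, 0] = 5.0

mask = linf_prune_mask(clean, triggered, num_prune=1)
# Pruning a channel here simply means zeroing its activation map.
pruned = triggered * mask[None, :, None, None]

for order in (1, 2, np.inf):
    ratio = channel_norms(triggered, order)[2] / channel_norms(clean, order)[2]
    print(f"order={order}: triggered/clean norm ratio for channel 2 = {ratio:.2f}")
```

In this toy setting a localized spike changes the $\ell_\infty$ norm far more than the $\ell_1$ or $\ell_2$ norm, which is one intuition for why a max-based score might expose trigger-sensitive neurons. In a real network the mask would be applied to the chosen layer's output channels, and the number of pruned channels would have to be tuned against clean accuracy.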

