CV · Aug 12, 2020

Defending Adversarial Examples via DNN Bottleneck Reinforcement

arXiv:2008.05230v1 · 9 citations
Originality: Highly original
AI Analysis

This paper addresses the security of DNN-based image classifiers by offering a novel defense method that incrementally improves robustness without altering the classifier structure.

The paper tackles the vulnerability of deep neural networks to adversarial attacks by proposing a DNN bottleneck reinforcement scheme that removes redundant information from latent representations, achieving strong defense on MNIST, CIFAR-10, and ImageNet datasets.

This paper presents a DNN bottleneck reinforcement scheme to alleviate the vulnerability of Deep Neural Networks (DNN) against adversarial attacks. Typical DNN classifiers encode the input image into a compressed latent representation more suitable for inference. This information bottleneck makes a trade-off between the image-specific structure and class-specific information in an image. By reinforcing the former while maintaining the latter, any redundant information, be it adversarial or not, should be removed from the latent representation. Hence, this paper proposes to jointly train an auto-encoder (AE) sharing the same encoding weights with the visual classifier. In order to reinforce the information bottleneck, we introduce the multi-scale low-pass objective and multi-scale high-frequency communication for better frequency steering in the network. Unlike existing approaches, our scheme is the first reforming defense per se which keeps the classifier structure untouched without appending any pre-processing head and is trained with clean images only. Extensive experiments on MNIST, CIFAR-10 and ImageNet demonstrate the strong defense of our method against various adversarial attacks.
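The joint objective described in the abstract can be illustrated with a minimal NumPy sketch. It assumes the low-pass targets are built by repeated 2x2 average pooling of the clean image and that the decoder emits one reconstruction per scale; the abstract does not specify the filter, the number of scales, or the loss weighting, so `avg_pool2x2`, `low_pass_pyramid`, `joint_loss`, and `recon_weight` are hypothetical names and choices, not the paper's implementation.

```python
import numpy as np

def avg_pool2x2(img):
    """Halve height and width by averaging each 2x2 block (a crude low-pass filter)."""
    h, w = img.shape
    cropped = img[: h - h % 2, : w - w % 2]  # drop an odd trailing row/column
    return cropped.reshape(h // 2, 2, w // 2, 2).mean(axis=(1, 3))

def low_pass_pyramid(img, levels=2):
    """Return progressively smoother, smaller versions of img (assumed targets)."""
    pyramid, current = [], img
    for _ in range(levels):
        current = avg_pool2x2(current)
        pyramid.append(current)
    return pyramid

def cross_entropy(logits, label):
    """Softmax cross-entropy for a single example, computed stably."""
    shifted = logits - logits.max()
    log_probs = shifted - np.log(np.exp(shifted).sum())
    return -log_probs[label]

def joint_loss(logits, label, reconstructions, clean_img, recon_weight=1.0):
    """Classification loss plus MSE against the low-pass target at each scale.

    In a shared-encoder setup, logits and reconstructions would both be
    computed from the same latent code; here they are passed in directly.
    """
    targets = low_pass_pyramid(clean_img, levels=len(reconstructions))
    recon = sum(np.mean((r - t) ** 2) for r, t in zip(reconstructions, targets))
    return cross_entropy(logits, label) + recon_weight * recon
```

Because both loss terms would backpropagate into the same encoder weights, the reconstruction term acts as a regularizer on the classifier's latent representation, and only clean images are needed to compute it.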

