LG · AI · CV · Feb 9, 2023

IB-RAR: Information Bottleneck as Regularizer for Adversarial Robustness

arXiv:2302.10896v2 · 2 citations · h-index: 39
AI Analysis

This paper addresses adversarial vulnerability in machine learning models, a particular concern for security-critical applications. Its approach is incremental, building on existing adversarial training methods.

The paper improves the adversarial robustness of neural networks by using the Information Bottleneck as a regularizer. For VGG16 on CIFAR-10, this yields an average accuracy improvement of 3.07% against five adversarial attacks.

In this paper, we propose a novel method, IB-RAR, which uses Information Bottleneck (IB) to strengthen adversarial robustness for both adversarial training and non-adversarial-trained methods. We first use the IB theory to build regularizers as learning objectives in the loss function. Then, we filter out unnecessary features of intermediate representation according to their mutual information (MI) with labels, as the network trained with IB provides easily distinguishable MI for its features. Experimental results show that our method can be naturally combined with adversarial training and provides consistently better accuracy on new adversarial examples. Our method improves the accuracy by an average of 3.07% against five adversarial attacks for the VGG16 network, trained with three adversarial training benchmarks and the CIFAR-10 dataset. In addition, our method also provides good robustness for undefended methods, such as training with cross-entropy loss only. Finally, in the absence of adversarial training, the VGG16 network trained using our method and the CIFAR-10 dataset reaches an accuracy of 35.86% against PGD examples, while using all layers reaches 25.61% accuracy.
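The two steps described in the abstract (IB terms added to the loss, then feature filtering by MI with labels) can be illustrated with a minimal sketch. This is not the authors' implementation: the summary does not say which MI estimator or loss weights the paper uses, so the plug-in estimator on mean-thresholded activations, the function names, and the `alpha`/`beta` weights below are all assumptions chosen for illustration. The loss follows only the generic IB trade-off of penalizing I(X;T) and rewarding I(Y;T).

```python
import math
from collections import Counter

def mutual_information(xs, ys):
    """Plug-in estimate of I(X;Y) in nats for two equal-length discrete sequences."""
    n = len(xs)
    joint, px, py = Counter(zip(xs, ys)), Counter(xs), Counter(ys)
    # Sum over observed pairs: p(x,y) * log(p(x,y) / (p(x) * p(y)))
    return sum((c / n) * math.log(c * n / (px[x] * py[y]))
               for (x, y), c in joint.items())

def ib_regularized_loss(task_loss, mi_input, mi_label, alpha=0.1, beta=0.1):
    """Hypothetical IB-style objective: per-layer I(X;T) is penalized,
    per-layer I(Y;T) is rewarded. alpha and beta are assumed weights."""
    return task_loss + alpha * sum(mi_input) - beta * sum(mi_label)

def feature_mask(features, labels, keep):
    """Score each feature by MI with the labels and keep the top `keep`.

    features: list of samples, each a list of floats.
    Activations are binarized at the per-feature mean (an assumption made
    so that the discrete plug-in estimator applies).
    Returns (mask, scores), where mask[j] is 1 for kept features.
    """
    n_features = len(features[0])
    scores = []
    for j in range(n_features):
        column = [sample[j] for sample in features]
        mean = sum(column) / len(column)
        binarized = [int(v > mean) for v in column]
        scores.append(mutual_information(binarized, labels))
    ranked = sorted(range(n_features), key=lambda j: scores[j], reverse=True)
    kept = set(ranked[:keep])
    return [1 if j in kept else 0 for j in range(n_features)], scores

# Toy data: feature 0 tracks the label, feature 1 is unrelated, feature 2 is constant.
features = [
    [0.9, 0.1, 0.5],
    [0.8, 0.7, 0.5],
    [0.1, 0.2, 0.5],
    [0.2, 0.9, 0.5],
]
labels = [1, 1, 0, 0]

mask, scores = feature_mask(features, labels, keep=1)  # mask == [1, 0, 0]
loss = ib_regularized_loss(1.0, mi_input=[0.5, 0.5], mi_label=[0.2],
                           alpha=0.1, beta=0.5)
```

In the toy data only the first feature carries label information, so it is the one that survives the mask. In a real network the mask would multiply an intermediate representation, and the MI terms would come from a differentiable estimator rather than this counting-based one.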

Code Implementations: 1 repo