LG · CV · Jun 23, 2022

InfoAT: Improving Adversarial Training Using the Information Bottleneck Principle

arXiv:2206.12292v1 · 23 citations · h-index: 38
Originality: Incremental advance
AI Analysis

This work aims to improve the adversarial robustness of machine learning models, particularly in security-critical applications, by focusing on hard examples. It represents an incremental advance in adversarial training techniques.

The paper tackles the problem of identifying hard examples in adversarial training by using the information bottleneck principle: it measures the mutual information between an input and its latent representation. The proposed InfoAT method is reported to achieve the best robustness across various datasets and models compared to several state-of-the-art methods.

Adversarial training (AT) has shown excellent performance in defending against adversarial examples. Recent studies demonstrate that examples are not equally important to the final robustness of models during AT: the so-called hard examples, which can be attacked easily, exert more influence on the final robustness than robust examples do. Therefore, guaranteeing the robustness of hard examples is crucial for improving the final robustness of the model. However, defining effective heuristics to search for hard examples is still difficult. In this article, inspired by the information bottleneck (IB) principle, we uncover that an example with high mutual information between the input and its associated latent representation is more likely to be attacked. Based on this observation, we propose a novel and effective adversarial training method (InfoAT). InfoAT is encouraged to find examples with high mutual information and exploit them efficiently to improve the final robustness of models. Experimental results show that InfoAT achieves the best robustness across different datasets and models in comparison with several state-of-the-art methods.
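
For orientation, the two ingredients named in the abstract can be written in their standard textbook forms. This is a sketch under stated assumptions, not the loss defined in the paper. The symbols f_\theta (the model), \ell (the training loss), \epsilon (the perturbation budget), \beta (the IB trade-off weight), and the per-example weight w(x) with its score s(x) are all notation introduced here for illustration.

```latex
% Standard adversarial training (min-max) objective:
\min_{\theta} \; \mathbb{E}_{(x,y)\sim\mathcal{D}}
  \Big[ \max_{\|\delta\|_{p} \le \epsilon} \ell\big(f_{\theta}(x+\delta),\, y\big) \Big]

% Standard information bottleneck objective for a latent representation Z of X:
\min_{p(z \mid x)} \; I(X;Z) \;-\; \beta\, I(Z;Y)

% Hypothetical illustration only: up-weighting examples whose score s(x),
% standing in for an estimate of the input-latent mutual information, is high.
\min_{\theta} \; \mathbb{E}_{(x,y)\sim\mathcal{D}}
  \Big[ w(x) \max_{\|\delta\|_{p} \le \epsilon} \ell\big(f_{\theta}(x+\delta),\, y\big) \Big],
\qquad w(x) \text{ increasing in } s(x)
```

The third expression only illustrates what "exploiting examples with high mutual information" could look like as a reweighting. The abstract does not specify how InfoAT estimates the mutual information or how it uses the selected examples.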
