LG | Feb 15, 2021

Data Quality Matters For Adversarial Training: An Empirical Study

arXiv:2102.07437v3 | 12 citations
Originality: Incremental advance
AI Analysis

This paper addresses reliability challenges that machine learning practitioners face in adversarial training, though its contribution is incremental.

The study identifies low-quality data samples as a common cause of issues in adversarial training, such as robust overfitting and robustness overestimation, and shows that removing these samples alleviates these problems and reduces the robustness-accuracy trade-off.

Multiple intriguing problems are hovering in adversarial training, including robust overfitting, robustness overestimation, and robustness-accuracy trade-off. These problems pose great challenges to both reliable evaluation and practical deployment. Here, we empirically show that these problems share one common cause -- low-quality samples in the dataset. Specifically, we first propose a strategy to measure the data quality based on the learning behaviors of the data during adversarial training and find that low-quality data may not be useful and even detrimental to the adversarial robustness. We then design controlled experiments to investigate the interconnections between data quality and problems in adversarial training. We find that when low-quality data is removed, robust overfitting and robustness overestimation can be largely alleviated; and robustness-accuracy trade-off becomes less significant. These observations not only verify our intuition about data quality but may also open new opportunities to advance adversarial training.
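
The abstract does not spell out how quality is scored, so the following is only a minimal sketch under a stated assumption: a sample's quality is approximated by the fraction of training epochs in which its adversarial example was classified correctly, and the lowest-scoring fraction is dropped. The names `quality_scores`, `keep_indices`, and `drop_fraction`, and the toy `records` array, are hypothetical and are not taken from the paper or its code.

```python
import numpy as np


def quality_scores(robust_correct):
    """Score each sample by how often it was robustly classified.

    robust_correct: bool array of shape (n_epochs, n_samples), True where the
    sample's adversarial example was classified correctly in that epoch.
    This proxy is an assumption, not the paper's confirmed metric.
    """
    return np.asarray(robust_correct, dtype=float).mean(axis=0)


def keep_indices(scores, drop_fraction):
    """Return sorted indices of samples kept after dropping the lowest scores."""
    n_drop = int(len(scores) * drop_fraction)
    order = np.argsort(scores, kind="stable")  # ascending: lowest quality first
    return np.sort(order[n_drop:])


# Toy record: 4 epochs (rows) x 4 samples (columns), made up for illustration.
records = np.array(
    [
        [1, 0, 1, 0],
        [1, 0, 1, 1],
        [1, 0, 0, 1],
        [1, 0, 1, 1],
    ],
    dtype=bool,
)

scores = quality_scores(records)
kept = keep_indices(scores, drop_fraction=0.25)
print(scores)  # per-sample fraction of epochs classified correctly under attack
print(kept)    # indices of the samples retained for retraining
```

In this toy record, sample 1 is never classified correctly under attack, so it is the one removed when a quarter of the data is dropped. In a real pipeline the boolean record would be collected during adversarial training itself, and the retained indices would define the subset used for the controlled retraining experiments.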

Code Implementations: 1 repo
