CV · CR · LG · Jun 24, 2020

Imbalanced Gradients: A Subtle Cause of Overestimated Adversarial Robustness

arXiv:2006.13726v4 · 23 citations · Has Code
Originality: Incremental advance
AI Analysis

This paper addresses a critical problem in adversarial robustness evaluation, relevant to machine learning security researchers, though it is incremental because it builds on prior work on obfuscated gradients.

The paper identifies imbalanced gradients as a subtle cause of overestimated adversarial robustness in defense models: the gradient of one margin-loss term dominates and steers the attack in a suboptimal direction. It proposes a Margin Decomposition (MD) attack that lowers the measured robustness of 11 of the 24 defense models tested by more than 1% relative to the best standalone baseline attack.

Evaluating the robustness of a defense model is a challenging task in adversarial robustness research. Obfuscated gradients have previously been found to exist in many defense methods and cause a false signal of robustness. In this paper, we identify a more subtle situation called Imbalanced Gradients that can also cause overestimated adversarial robustness. The phenomenon of imbalanced gradients occurs when the gradient of one term of the margin loss dominates and pushes the attack towards a suboptimal direction. To exploit imbalanced gradients, we formulate a Margin Decomposition (MD) attack that decomposes a margin loss into individual terms and then explores the attackability of these terms separately via a two-stage process. We also propose a multi-targeted and ensemble version of our MD attack. By investigating 24 defense models proposed since 2018, we find that 11 models are susceptible to a certain degree of imbalanced gradients and our MD attack can decrease their robustness evaluated by the best standalone baseline attack by more than 1%. We also provide an in-depth investigation of the likely causes of imbalanced gradients and effective countermeasures. Our code is available at https://github.com/HanxunH/MDAttack.
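
To make the idea of a dominating margin-loss term concrete, the sketch below splits a margin loss into its two terms for a toy linear classifier and compares the sizes of their input gradients. It is an illustration only, not the authors' MD attack: the linear model, the function names (`decompose_margin`, `imbalance_ratio`), and the L1-norm ratio used as an imbalance measure are all assumptions made for this example.

```python
# Illustrative sketch only (not the paper's implementation).
# Assumption: a toy linear model z = W x, so d(z_i)/dx is simply row W[i].
# Margin loss: z_j - z_y, where y is the true class and j the top wrong class.

def logits_linear(W, x):
    """Logits of the hypothetical linear classifier z = W x."""
    return [sum(w * v for w, v in zip(row, x)) for row in W]

def decompose_margin(W, x, y):
    """Split the margin loss into its two terms and return their input gradients."""
    z = logits_linear(W, x)
    # Highest-scoring class other than the true class y.
    j = max((i for i in range(len(z)) if i != y), key=lambda i: z[i])
    return {
        "margin": z[j] - z[y],
        "wrong_class": j,
        "grad_wrong": [float(w) for w in W[j]],   # gradient of +z_j w.r.t. x
        "grad_true": [-float(w) for w in W[y]],   # gradient of -z_y w.r.t. x
    }

def l1(v):
    return sum(abs(a) for a in v)

def imbalance_ratio(parts):
    """Assumed measure: larger L1 gradient norm divided by the smaller one."""
    a, b = l1(parts["grad_wrong"]), l1(parts["grad_true"])
    small, large = min(a, b), max(a, b)
    return float("inf") if small == 0 else large / small

# Toy example: class 1 has much larger weights than the true class 0.
W = [[1.0, 0.0], [0.0, 8.0], [0.0, 0.5]]
x = [1.0, 0.5]
y = 0

parts = decompose_margin(W, x, y)
# The gradient of the full margin loss is the sum of the two term gradients.
full_grad = [a + b for a, b in zip(parts["grad_wrong"], parts["grad_true"])]
ratio = imbalance_ratio(parts)
```

In this toy case the wrong-class term contributes almost all of the full gradient, so a gradient-based attack on the whole margin loss would barely move along the direction that lowers the true-class logit. Attacking the terms separately, as the abstract describes, is one way to explore that neglected direction.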

Code Implementations: 1 repo
