LGCVMLJul 7, 2020

Regional Image Perturbation Reduces $L_p$ Norms of Adversarial Examples While Maintaining Model-to-model Transferability

arXiv:2007.03198v22 citations
AI Analysis

This work addresses adversarial attacks in machine learning, offering a simpler method that could undermine defenses relying on L_p norm robustness, though it is incremental as it builds on existing loss functions.

The paper tackles the problem of generating effective regional adversarial perturbations without complex methods, showing that a simple approach using cross-entropy sign maintains 76% transferability on ImageNet while reducing L_p norm distortion.

Regional adversarial attacks often rely on complicated methods for generating adversarial perturbations, making it hard to compare their efficacy against well-known attacks. In this study, we show that effective regional perturbations can be generated without resorting to complex methods. We develop a very simple regional adversarial perturbation attack method using cross-entropy sign, one of the most commonly used losses in adversarial machine learning. Our experiments on ImageNet with multiple models reveal that, on average, $76\%$ of the generated adversarial examples maintain model-to-model transferability when the perturbation is applied to local image regions. Depending on the selected region, these localized adversarial examples require significantly less $L_p$ norm distortion (for $p \in \{0, 2, \infty\}$) compared to their non-local counterparts. These localized attacks therefore have the potential to undermine defenses that claim robustness under the aforementioned norms.

Code Implementations1 repo
Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.

Your Notes