LG CV ML · Jul 7, 2020

Regional Image Perturbation Reduces $L_p$ Norms of Adversarial Examples While Maintaining Model-to-model Transferability

Utku Ozbulak, Jonathan Peck, Wesley De Neve, Bart Goossens, Yvan Saeys, Arnout Van Messem

arXiv:2007.03198v22.32 citations · Has Code

Originality: Incremental advance

AI Analysis

This work addresses adversarial attacks in machine learning, offering a simpler method that could undermine defenses that rely on $L_p$ norm robustness. The advance is incremental because it builds on an existing, widely used loss (cross-entropy).

The paper tackles the problem of generating effective regional adversarial perturbations without complex methods. It shows that a simple approach based on the cross-entropy sign keeps, on average, 76% of the generated adversarial examples transferable between models on ImageNet, while reducing $L_p$ norm distortion for $p \in \{0, 2, \infty\}$.

Regional adversarial attacks often rely on complicated methods for generating adversarial perturbations, making it hard to compare their efficacy against well-known attacks. In this study, we show that effective regional perturbations can be generated without resorting to complex methods. We develop a very simple regional adversarial perturbation attack method using cross-entropy sign, one of the most commonly used losses in adversarial machine learning. Our experiments on ImageNet with multiple models reveal that, on average, $76\%$ of the generated adversarial examples maintain model-to-model transferability when the perturbation is applied to local image regions. Depending on the selected region, these localized adversarial examples require significantly less $L_p$ norm distortion (for $p \in \{0, 2, \infty\}$) compared to their non-local counterparts. These localized attacks therefore have the potential to undermine defenses that claim robustness under the aforementioned norms.
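To make the norm comparison in the abstract concrete, the following is a minimal sketch, not the authors' implementation. It assumes a single signed step of size `eps`, a random array standing in for the cross-entropy gradient, and a centered square region; the function names, the mask, and the value of `eps` are all hypothetical choices for illustration.

```python
import numpy as np

def regional_sign_perturbation(grad, mask, eps):
    """Return eps * sign(grad), kept only where mask is 1 (assumed single step)."""
    return eps * np.sign(grad) * mask

def lp_norms(delta):
    """Return the L0, L2, and L-infinity norms of a perturbation array."""
    flat = delta.ravel()
    return {
        "l0": int(np.count_nonzero(flat)),          # number of changed values
        "l2": float(np.linalg.norm(flat, ord=2)),   # Euclidean length
        "linf": float(np.max(np.abs(flat))),        # largest single change
    }

rng = np.random.default_rng(0)
# Stand-in for the gradient of the cross-entropy loss with respect to a
# 3x224x224 image. A real attack would obtain this from a model (assumption).
grad = rng.standard_normal((3, 224, 224))

full_mask = np.ones_like(grad)
center_mask = np.zeros_like(grad)
center_mask[:, 56:168, 56:168] = 1.0  # hypothetical centered 112x112 region

eps = 2.0 / 255.0  # hypothetical step size
full = lp_norms(regional_sign_perturbation(grad, full_mask, eps))
regional = lp_norms(regional_sign_perturbation(grad, center_mask, eps))

print("full image:", full)
print("center region:", regional)
```

In this simplified setting, restricting the perturbation to a region lowers the $L_0$ and $L_2$ norms, while the $L_\infty$ norm stays at `eps` because every perturbed value changes by the same amount. The sketch therefore does not reproduce the $L_\infty$ reduction reported in the abstract, and it says nothing about whether the resulting example fools a model or transfers to another one.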

