LG | CV | ML | Jul 21, 2020

Towards Visual Distortion in Black-Box Attacks

arXiv:2007.10593v2 | 17 citations | Has Code
AI Analysis

This paper addresses the challenge of creating less perceptible adversarial attacks for security testing, though the contribution is incremental in that it builds on existing black-box methods.

The paper tackles the problem of generating adversarial examples in black-box settings by minimizing visual distortion, achieving a 100% success rate on models like InceptionV3, ResNet50, and VGG16bn with lower distortion than state-of-the-art attacks.

Constructing adversarial examples in a black-box threat model injures the original images by introducing visual distortion. In this paper, we propose a novel black-box attack approach that can directly minimize the induced distortion by learning the noise distribution of the adversarial example, assuming only loss-oracle access to the black-box network. The quantified visual distortion, which measures the perceptual distance between the adversarial example and the original image, is introduced in our loss whilst the gradient of the corresponding non-differentiable loss function is approximated by sampling noise from the learned noise distribution. We validate the effectiveness of our attack on ImageNet. Our attack results in much lower distortion when compared to the state-of-the-art black-box attacks and achieves $100\%$ success rate on InceptionV3, ResNet50 and VGG16bn. The code is available at https://github.com/Alina-1997/visual-distortion-in-attack.
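The abstract's core idea, estimating the gradient of a non-differentiable loss by sampling noise from a learned distribution while querying only a loss oracle, can be illustrated with a small sketch. This is not the authors' implementation (see the linked repository for that). It assumes a per-pixel categorical noise distribution over a few fixed values, uses mean squared error as a stand-in for the perceptual distance, and replaces the black-box network with a toy linear model. The names `estimate_gradient`, `make_toy_oracle`, and `run_attack`, and all hyperparameters, are hypothetical.

```python
import numpy as np

def softmax(logits):
    z = logits - logits.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def estimate_gradient(theta, x, loss_oracle, values, n_samples, rng):
    """Score-function estimate of d E[loss] / d theta using only loss queries."""
    probs = softmax(theta)                          # (d, k) per-pixel distribution
    d, k = probs.shape
    cdf = probs.cumsum(axis=-1)
    u = rng.random((n_samples, d, 1))
    idx = np.minimum((u > cdf[None]).sum(axis=-1), k - 1)  # sampled category per pixel
    noise = values[idx]                             # (n_samples, d)
    losses = np.array([loss_oracle(np.clip(x + n, 0.0, 1.0)) for n in noise])
    advantages = losses - losses.mean()             # mean baseline reduces variance
    score = np.eye(k)[idx] - probs[None]            # d log p(sample) / d theta
    grad = (advantages[:, None, None] * score).mean(axis=0)
    return grad, float(losses.mean())

def make_toy_oracle(x0, w, b, lam):
    """Hypothetical loss oracle: hinge on a linear margin plus a distortion penalty."""
    def oracle(x_adv):
        margin = float(w @ x_adv + b)               # > 0: still the original class
        distortion = float(np.mean((x_adv - x0) ** 2))  # assumed stand-in metric
        return max(margin, 0.0) + lam * distortion
    return oracle

def run_attack(x0, loss_oracle, values, steps=300, n_samples=20, lr=2.0, seed=0):
    rng = np.random.default_rng(seed)
    theta = np.zeros((x0.size, values.size))        # uniform noise distribution
    history = []
    for _ in range(steps):
        grad, mean_loss = estimate_gradient(theta, x0, loss_oracle, values, n_samples, rng)
        theta -= lr * grad                          # descend on the expected loss
        history.append(mean_loss)
    return theta, history

# Toy setup: 16 "pixels" at 0.5, a linear model that starts with margin 0.5.
x0 = np.full(16, 0.5)
oracle = make_toy_oracle(x0, w=np.ones(16), b=-7.5, lam=1.0)
values = np.array([-0.1, 0.0, 0.1])
theta, history = run_attack(x0, oracle, values)
```

Because the distortion term sits inside the queried loss, the same update that pushes the sample toward misclassification also discourages unnecessary noise. In this sketch that trade-off is controlled by the assumed weight `lam`.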

Code Implementations: 1 repo
