CV, IV | Feb 28, 2020

Detecting Patch Adversarial Attacks with Image Residuals

Marius Arvinte, Ahmed Tewfik, Sriram Vishwanath

arXiv:2002.12504v2 | 5.87 citations | Has Code

Originality: Incremental advance

AI Analysis

This work addresses the vulnerability of machine learning systems to localized adversarial attacks. The contribution is incremental because it builds on existing detection techniques.

The paper tackles the problem of detecting patch-based adversarial attacks on images by using image residuals from wavelet denoising as a fingerprint, and it shows that the method reduces attack success rates and increases computational effort for adaptive attackers.

We introduce an adversarial sample detection algorithm based on image residuals, specifically designed to guard against patch-based attacks. The image residual is obtained as the difference between an input image and a denoised version of it, and a discriminator is trained to distinguish between clean and adversarial samples. More precisely, we use a wavelet domain algorithm for denoising images and demonstrate that the obtained residuals act as a digital fingerprint for adversarial attacks. To emulate the limitations of a physical adversary, we evaluate the performance of our approach against localized (patch-based) adversarial attacks, including in settings where the adversary has complete knowledge about the detection scheme. Our results show that the proposed detection method generalizes to previously unseen, stronger attacks and that it is able to reduce the success rate (conversely, increase the computational effort) of an adaptive attacker.
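The residual described in the abstract (input image minus its denoised version) can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration and is not taken from the paper: a single-level Haar transform, soft thresholding of the detail coefficients, the threshold value, and the function names `haar2d`, `ihaar2d`, and `wavelet_residual`. The discriminator that classifies residuals is omitted.

```python
import numpy as np

SQRT2 = np.sqrt(2.0)

def haar2d(x):
    """Single-level 2D Haar transform of an (H, W) array with even H and W."""
    # Transform along rows (pairs of adjacent rows).
    a = (x[0::2, :] + x[1::2, :]) / SQRT2
    d = (x[0::2, :] - x[1::2, :]) / SQRT2
    # Transform along columns (pairs of adjacent columns).
    ll = (a[:, 0::2] + a[:, 1::2]) / SQRT2
    lh = (a[:, 0::2] - a[:, 1::2]) / SQRT2
    hl = (d[:, 0::2] + d[:, 1::2]) / SQRT2
    hh = (d[:, 0::2] - d[:, 1::2]) / SQRT2
    return ll, lh, hl, hh

def ihaar2d(ll, lh, hl, hh):
    """Inverse of haar2d."""
    h, w = ll.shape
    a = np.empty((h, 2 * w))
    d = np.empty((h, 2 * w))
    a[:, 0::2] = (ll + lh) / SQRT2
    a[:, 1::2] = (ll - lh) / SQRT2
    d[:, 0::2] = (hl + hh) / SQRT2
    d[:, 1::2] = (hl - hh) / SQRT2
    x = np.empty((2 * h, 2 * w))
    x[0::2, :] = (a + d) / SQRT2
    x[1::2, :] = (a - d) / SQRT2
    return x

def soft_threshold(c, t):
    """Shrink coefficients toward zero by t (soft thresholding)."""
    return np.sign(c) * np.maximum(np.abs(c) - t, 0.0)

def wavelet_residual(image, threshold=0.05):
    """Return image minus its wavelet-denoised version.

    The threshold is a hypothetical value chosen for this sketch.
    """
    ll, lh, hl, hh = haar2d(image)
    denoised = ihaar2d(
        ll,
        soft_threshold(lh, threshold),
        soft_threshold(hl, threshold),
        soft_threshold(hh, threshold),
    )
    return image - denoised

# Toy example: a flat grayscale image with a noisy 8x8 "patch".
rng = np.random.default_rng(0)
image = np.full((32, 32), 0.5)
image[8:16, 8:16] += rng.uniform(-0.2, 0.2, size=(8, 8))
residual = wavelet_residual(image)
```

In this toy setting the residual is nonzero only inside the perturbed region, which is the kind of localized signal a discriminator could be trained on. Real images have natural texture, so the residual of a clean image would not be exactly zero.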
