LG CR ML

May 8, 2020

Towards Robustness against Unsuspicious Adversarial Examples

Liang Tong, Minzhe Guo, Atul Prakash, Yevgeniy Vorobeychik

arXiv:2005.04272v21.2

Originality: Incremental advance

AI Analysis

This paper addresses the challenge of robustness against unsuspicious adversarial attacks in machine learning, an incremental advance in adversarial robustness research.

The paper tackles adversarial examples that are unsuspicious rather than imperceptible. It proposes a method based on cognitive salience that allows larger perturbations in image backgrounds while keeping them unsuspicious. Attacks built this way are highly effective against robust classifiers, and adversarial training with them improves robustness to such attacks while maintaining comparable robustness to conventional ones.

Despite the remarkable success of deep neural networks, significant concerns have emerged about their robustness to adversarial perturbations to inputs. While most attacks aim to ensure that these are imperceptible, physical perturbation attacks typically aim for being unsuspicious, even if perceptible. However, there is no universal notion of what it means for adversarial examples to be unsuspicious. We propose an approach for modeling suspiciousness by leveraging cognitive salience. Specifically, we split an image into foreground (salient region) and background (the rest), and allow significantly larger adversarial perturbations in the background, while ensuring that cognitive salience of background remains low. We describe how to compute the resulting non-salience-preserving dual-perturbation attacks on classifiers. We then experimentally demonstrate that our attacks indeed do not significantly change perceptual salience of the background, but are highly effective against classifiers robust to conventional attacks. Furthermore, we show that adversarial training with dual-perturbation attacks yields classifiers that are more robust to these than state-of-the-art robust learning approaches, and comparable in terms of robustness to conventional attacks.
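
The foreground/background split described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes an L-infinity budget per region, a foreground mask that is already given, and a precomputed loss gradient, and it omits the salience constraint on the background entirely. The names `project_dual_linf`, `dual_perturbation_step`, `eps_fg`, and `eps_bg` are hypothetical.

```python
import numpy as np

def project_dual_linf(delta, fg_mask, eps_fg, eps_bg):
    """Clip a perturbation to eps_fg on foreground pixels and eps_bg elsewhere.

    Assumption: L-infinity budgets; fg_mask has the same shape as delta.
    """
    eps = np.where(fg_mask.astype(bool), eps_fg, eps_bg)  # per-pixel budget
    return np.clip(delta, -eps, eps)

def dual_perturbation_step(x, delta, grad, fg_mask, eps_fg, eps_bg, step_size):
    """One signed-gradient ascent step, then projection onto both budgets.

    Assumption: pixel values live in [0, 1]; the salience term is omitted.
    """
    delta = delta + step_size * np.sign(grad)
    delta = project_dual_linf(delta, fg_mask, eps_fg, eps_bg)
    # Keep the perturbed image a valid image.
    return np.clip(x + delta, 0.0, 1.0) - x

# Toy usage: a 4x4 grey image whose top-left 2x2 block is "foreground".
x = np.full((4, 4), 0.5)
fg_mask = np.zeros((4, 4), dtype=bool)
fg_mask[:2, :2] = True
grad = np.ones_like(x)  # stand-in for a real loss gradient
delta = dual_perturbation_step(x, np.zeros_like(x), grad, fg_mask,
                               eps_fg=0.01, eps_bg=0.1, step_size=1.0)
```

With a larger `eps_bg` than `eps_fg`, the background absorbs most of the perturbation while the salient region stays close to the original, which is the intuition behind the dual-perturbation attack.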
