CV | Dec 14, 2022

SAIF: Sparse Adversarial and Imperceptible Attack Framework

Tooba Imtiaz, Morgan Kohler, Jared Miller, Zifeng Wang, Masih Eskandar, Mario Sznaier, Octavia Camps, Jennifer Dy

arXiv:2212.07495v4 | 3.72 citations | h-index: 38

Originality: Incremental advance

AI Analysis

This work addresses the vulnerability of image classifiers to adversarial attacks, which is a critical security issue for AI systems, but it is incremental as it builds on existing sparse attack methods.

The paper studies the vulnerability of neural networks to adversarial attacks by proposing SAIF, a framework that generates sparse, low-magnitude perturbations to deceive image classifiers and outperforms state-of-the-art sparse attack methods on ImageNet.

Adversarial attacks hamper the decision-making ability of neural networks by perturbing the input signal. For instance, adding a small, carefully calculated distortion to an image can deceive a well-trained image classification network. In this work, we propose a novel attack technique called Sparse Adversarial and Interpretable Attack Framework (SAIF). Specifically, we design imperceptible attacks that contain low-magnitude perturbations at a small number of pixels and leverage these sparse attacks to reveal the vulnerability of classifiers. We use the Frank-Wolfe (conditional gradient) algorithm to simultaneously optimize the attack perturbations for bounded magnitude and sparsity with $O(1/\sqrt{T})$ convergence. Empirical results show that SAIF computes highly imperceptible and interpretable adversarial examples, and outperforms state-of-the-art sparse attack methods on the ImageNet dataset.
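
To make the abstract's description concrete, the following is a minimal sketch of a Frank-Wolfe (conditional gradient) loop that jointly updates a magnitude-bounded perturbation and a sparsity mask. It is not the authors' implementation. The toy linear classifier, the decomposition of the perturbation into a mask `s` times a magnitude `p`, the constraint sets, the initialization, the step size 2/(t+2), and all function names are assumptions chosen for illustration; the abstract only states that Frank-Wolfe is used to optimize for bounded magnitude and sparsity.

```python
import numpy as np

def lmo_linf(grad, eps):
    """Linear minimization oracle over the box {p : |p_i| <= eps}.
    Returns the vertex that minimizes <grad, p>."""
    return -eps * np.sign(grad)

def lmo_ksparse(grad, k):
    """Linear minimization oracle over {s in [0,1]^n : sum(s) <= k}.
    Puts 1 on at most k coordinates with the most negative gradient."""
    s = np.zeros_like(grad)
    idx = np.argsort(grad)[:k]        # k smallest gradient entries
    s[idx[grad[idx] < 0]] = 1.0       # only keep entries that lower the loss
    return s

def frank_wolfe_sparse_attack(x, w, y, eps=0.5, k=2, steps=20):
    """Toy sparse attack on a linear score f(x) = w @ x with label y in {-1, +1}.
    Minimizes the margin y * w @ (x + s * p) over the two sets above.
    (Hypothetical setup: a real attack would backpropagate through a network.)"""
    n = x.size
    g_delta = y * w                   # gradient of the margin w.r.t. the perturbation
    p = lmo_linf(g_delta, eps)        # assumed init: start p at a vertex of the box
    s = np.full(n, k / n)             # assumed init: feasible uniform mask
    for t in range(steps):
        g_p = g_delta * s             # chain rule through delta = s * p
        g_s = g_delta * p
        gamma = 2.0 / (t + 2.0)       # standard Frank-Wolfe step size (an assumption here)
        p = p + gamma * (lmo_linf(g_p, eps) - p)
        s = s + gamma * (lmo_ksparse(g_s, k) - s)
    return s * p

# Example inputs (made up for illustration).
x = np.array([1.0, 2.0, -1.0, 0.5, 3.0])
w = np.array([0.5, -2.0, 1.0, 0.1, 3.0])
delta = frank_wolfe_sparse_attack(x, w, y=1, eps=0.5, k=2)
print("perturbation:", delta)
print("margin before:", w @ x, "after:", w @ (x + delta))
```

Each iterate is a convex combination of feasible points, so no projection step is needed: the mask never exceeds the budget of k pixels and the magnitude never exceeds eps. This projection-free property is the usual reason for choosing Frank-Wolfe over projected gradient methods on such constraint sets.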
