CV · Dec 14, 2022

SAIF: Sparse Adversarial and Imperceptible Attack Framework

arXiv:2212.07495v4 · 2 citations · h-index: 38
Originality: Incremental advance
AI Analysis

This work addresses the vulnerability of image classifiers to adversarial attacks, a critical security issue for AI systems. The advance is incremental because it builds on existing sparse attack methods.

The paper proposes SAIF, a framework that generates sparse, low-magnitude perturbations to deceive image classifiers, and reports that it outperforms state-of-the-art sparse attack methods on ImageNet.

Adversarial attacks hamper the decision-making ability of neural networks by perturbing the input signal. The addition of calculated small distortion to images, for instance, can deceive a well-trained image classification network. In this work, we propose a novel attack technique called Sparse Adversarial and Interpretable Attack Framework (SAIF). Specifically, we design imperceptible attacks that contain low-magnitude perturbations at a small number of pixels and leverage these sparse attacks to reveal the vulnerability of classifiers. We use the Frank-Wolfe (conditional gradient) algorithm to simultaneously optimize the attack perturbations for bounded magnitude and sparsity with $O(1/\sqrt{T})$ convergence. Empirical results show that SAIF computes highly imperceptible and interpretable adversarial examples, and outperforms state-of-the-art sparse attack methods on the ImageNet dataset.
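To make the optimization idea in the abstract concrete, here is a minimal NumPy sketch of Frank-Wolfe applied to a sparse, magnitude-bounded perturbation. It is not the authors' implementation. It assumes the perturbation is written as `s * p`, with `p` constrained to an L-infinity ball of radius `eps` (magnitude) and `s` constrained to the set of vectors in [0, 1]^n summing to at most `k` (sparsity). That decomposition, the toy linear classifier, the step size `2 / (t + 2)`, and all function names are illustrative assumptions and do not come from the summary above.

```python
import numpy as np

def lmo_linf(grad, eps):
    """Vertex of the L-infinity ball of radius eps that minimises <grad, v>."""
    return -eps * np.sign(grad)

def lmo_k_sparse(grad, k):
    """Vertex of {s in [0,1]^n, sum(s) <= k} that minimises <grad, s>."""
    v = np.zeros_like(grad)
    idx = np.argpartition(grad, k - 1)[:k]   # indices of the k smallest entries
    v[idx[grad[idx] < 0]] = 1.0              # keep only entries that lower the objective
    return v

def frank_wolfe_sparse_attack(x, grad_fn, eps, k, steps=50):
    """Hypothetical helper: joint Frank-Wolfe updates on magnitude p and mask s."""
    n = x.size
    p = lmo_linf(grad_fn(x), eps)            # start at a vertex of the magnitude ball
    s = np.full(n, k / n)                    # feasible dense start, sum(s) == k
    for t in range(steps):
        g = grad_fn(x + s * p)               # loss gradient at the perturbed input
        v_p = lmo_linf(g * s, eps)           # chain rule: d(loss)/dp = g * s
        v_s = lmo_k_sparse(g * p, k)         # chain rule: d(loss)/ds = g * p
        gamma = 2.0 / (t + 2.0)              # classical step size (an assumption here)
        p = p + gamma * (v_p - p)            # convex combination keeps p feasible
        s = s + gamma * (v_s - s)            # convex combination keeps s feasible
    return s, p

# Toy setting (assumed): a linear "classifier" with logit w @ x, true class = positive.
# Minimising the logit pushes the input across the decision boundary.
w = np.array([3.0, -0.5, 0.1, -4.0, 1.0])
x = np.array([0.5, 0.2, 0.1, -0.1, 0.3])
eps, k = 0.5, 2

s, p = frank_wolfe_sparse_attack(x, lambda z: w, eps, k)
delta = s * p
print("perturbation:", delta)
print("logit before:", w @ x, "after:", w @ (x + delta))
```

Because every iterate is a convex combination of feasible points, the method needs no projection step, which is the usual reason to prefer Frank-Wolfe for constraint sets like these. For a real network, `grad_fn` would be replaced by a gradient from an autodiff framework, and the loss would be non-convex, so the behaviour on this linear toy problem says nothing about the convergence rate quoted in the abstract.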

