CV | Jun 22, 2023

Rethinking the Backward Propagation for Adversarial Transferability

arXiv:2306.12685v3 | 42 citations | h-index: 26 | Has Code
Originality: Incremental advance
AI Analysis

This work addresses the challenge of generating adversarial examples that transfer to black-box models, a prerequisite for attacking real-world AI systems. The advance is incremental because it builds on existing transfer-based attacks.

The paper identifies that non-linear layers truncate gradients during backward propagation and that this truncation undermines adversarial transferability. It proposes the Backward Propagation Attack (BPA), which boosts transferability by up to 15% on ImageNet compared to baseline attacks.

Transfer-based attacks generate adversarial examples on the surrogate model, which can mislead other black-box models without access, making it promising to attack real-world applications. Recently, several works have been proposed to boost adversarial transferability, in which the surrogate model is usually overlooked. In this work, we identify that non-linear layers (e.g., ReLU, max-pooling, etc.) truncate the gradient during backward propagation, making the gradient w.r.t. input image imprecise to the loss function. We hypothesize and empirically validate that such truncation undermines the transferability of adversarial examples. Based on these findings, we propose a novel method called Backward Propagation Attack (BPA) to increase the relevance between the gradient w.r.t. input image and loss function so as to generate adversarial examples with higher transferability. Specifically, BPA adopts a non-monotonic function as the derivative of ReLU and incorporates softmax with temperature to smooth the derivative of max-pooling, thereby mitigating the information loss during the backward propagation of gradients. Empirical results on the ImageNet dataset demonstrate that not only does our method substantially boost the adversarial transferability, but it is also general to existing transfer-based attacks. Code is available at https://github.com/Trustworthy-AI-Group/RPA.
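The two backward-pass changes described in the abstract can be sketched on scalar toy inputs. This is a minimal illustration under stated assumptions, not the authors' implementation: the derivative of SiLU is used here as one example of a non-monotonic replacement for the ReLU derivative, the temperature is applied by dividing the logits, and all function names and the `temperature` parameter are hypothetical.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def relu_backward_standard(z, upstream):
    # Standard ReLU derivative: the gradient is cut to zero for z <= 0.
    return upstream if z > 0 else 0.0


def relu_backward_smooth(z, upstream):
    # Assumption: use the derivative of SiLU, sigmoid(z) * (1 + z * (1 - sigmoid(z))),
    # as a non-monotonic stand-in for the ReLU derivative. Negative inputs
    # still pass some gradient instead of being truncated.
    s = sigmoid(z)
    return upstream * s * (1.0 + z * (1.0 - s))


def maxpool_backward_standard(window, upstream):
    # Standard max-pooling: only the arg-max position receives gradient.
    k = max(range(len(window)), key=lambda i: window[i])
    return [upstream if i == k else 0.0 for i in range(len(window))]


def maxpool_backward_softmax(window, upstream, temperature=1.0):
    # Assumption: distribute the gradient over the window with
    # softmax(window / temperature). Lower temperatures approach the
    # standard arg-max behaviour; higher ones spread the gradient more evenly.
    m = max(window)  # subtract the max for numerical stability
    exps = [math.exp((v - m) / temperature) for v in window]
    total = sum(exps)
    return [upstream * e / total for e in exps]


if __name__ == "__main__":
    print(relu_backward_standard(-1.0, 1.0), relu_backward_smooth(-1.0, 1.0))
    print(maxpool_backward_standard([1.0, 3.0, 2.0], 1.0))
    print(maxpool_backward_softmax([1.0, 3.0, 2.0], 1.0, temperature=1.0))
```

In both cases the forward pass is left unchanged; only the gradient that flows back through the layer differs. In this sketch the smoothed max-pooling gradient still sums to the upstream value, so the total gradient mass is preserved while every position in the window receives a share.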

Code Implementations: 2 repos
