Improving the Transferability of Adversarial Examples by Feature Augmentation
This work addresses the challenge of adversarial example transferability for machine learning security, offering an incremental enhancement to existing methods.
The paper tackles the low transferability of adversarial examples across models by proposing a feature augmentation attack (FAUG), which injects random noise into intermediate features to diversify attack gradients. It reports improvements of +26.22% on input transformation-based attacks and +5.57% on combination methods.
Although input transformation-based attacks have succeeded in boosting adversarial transferability, their performance remains unsatisfactory because they ignore the discrepancy across models. In this paper, we propose a simple but effective feature augmentation attack (FAUG) method, which improves adversarial transferability without introducing extra computation costs. Specifically, we inject random noise into the intermediate features of the model to enlarge the diversity of the attack gradient, thereby mitigating the risk of overfitting to the specific model and notably amplifying adversarial transferability. Moreover, our method can be combined with existing gradient attacks to further augment their performance. Extensive experiments conducted on the ImageNet dataset across CNN and transformer models corroborate the efficacy of our method, e.g., we achieve improvements of +26.22% and +5.57% on input transformation-based attacks and combination methods, respectively.
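The core idea of injecting random noise into intermediate features while computing the attack gradient can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration and is not taken from the paper: the toy two-layer ReLU network, the Gaussian noise with scale `sigma`, the choice of layer, the iterative sign-gradient update, and the names `forward_backward` and `faug_ifgsm`. The abstract does not state the noise distribution, the perturbed layer, or the base attack.

```python
import math
import random


def forward_backward(x, y, W1, w2, sigma, rng):
    """Return (loss, dloss/dx) for a toy 2-layer ReLU net with noisy hidden features."""
    # Intermediate features: h = W1 @ x
    h = [sum(w * xi for w, xi in zip(row, x)) for row in W1]
    # Assumed form of feature augmentation: additive Gaussian noise on h.
    h_noisy = [hj + sigma * rng.gauss(0.0, 1.0) for hj in h]
    a = [max(0.0, hj) for hj in h_noisy]
    z = sum(wj * aj for wj, aj in zip(w2, a))
    p = 1.0 / (1.0 + math.exp(-z))
    tiny = 1e-12  # avoids log(0)
    loss = -(y * math.log(p + tiny) + (1 - y) * math.log(1 - p + tiny))
    # Backward pass; the noise is treated as a constant, so it only changes
    # which ReLU units are active and hence the direction of the gradient.
    dz = p - y
    dh = [dz * wj if hj > 0.0 else 0.0 for wj, hj in zip(w2, h_noisy)]
    dx = [sum(W1[j][i] * dh[j] for j in range(len(W1))) for i in range(len(x))]
    return loss, dx


def faug_ifgsm(x, y, W1, w2, eps=0.1, steps=10, sigma=0.5, seed=0):
    """Iterative sign-gradient attack with fresh feature noise at every step."""
    rng = random.Random(seed)
    alpha = eps / steps
    x_adv = list(x)
    for _ in range(steps):
        _, grad = forward_backward(x_adv, y, W1, w2, sigma, rng)
        # Ascend the loss along the sign of the (noisy) gradient.
        x_adv = [xa + alpha * ((g > 0) - (g < 0)) for xa, g in zip(x_adv, grad)]
        # Project back into the L-infinity ball of radius eps around x.
        x_adv = [min(max(xa, xi - eps), xi + eps) for xa, xi in zip(x_adv, x)]
    return x_adv


# Hypothetical toy weights and input, chosen only to make the sketch runnable.
W1 = [[0.5, -0.3, 0.8], [-0.6, 0.4, 0.1], [0.2, 0.7, -0.5], [0.9, -0.2, 0.3]]
w2 = [0.7, -0.5, 0.6, 0.4]
x = [0.2, -0.1, 0.4]
x_adv = faug_ifgsm(x, y=1, W1=W1, w2=w2)
print(x_adv)
```

In this sketch the noise is added inside the single forward pass that the attack already performs, so each step still costs one forward and one backward pass, which is one plausible reading of the claim that no extra computation is introduced. Setting `sigma=0` recovers the plain iterative attack on the toy model.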