LG, AI, CV | Oct 8, 2020

A Unified Approach to Interpreting and Boosting Adversarial Transferability

Xin Wang, Jie Ren, Shuyun Lin, Xiangming Zhu, Yisen Wang, Quanshi Zhang

arXiv:2010.04055v223.1117 citations | Has Code

Originality: Highly original

AI Analysis

This work addresses the challenge of making adversarial attacks more transferable across different models, which is crucial for security testing and robustness evaluation in machine learning.

The paper studies adversarial transferability in deep neural networks. It discovers and proves a negative correlation between transferability and the interactions inside adversarial perturbations, then proposes penalizing these interactions during the attack, which significantly improves transferability.

In this paper, we use the interaction inside adversarial perturbations to explain and boost adversarial transferability. We discover and prove a negative correlation between adversarial transferability and the interaction inside adversarial perturbations. The negative correlation is further verified on different DNNs with various inputs. Moreover, this negative correlation can be regarded as a unified perspective for understanding current transferability-boosting methods. To this end, we prove that some classic methods of enhancing transferability essentially decrease interactions inside adversarial perturbations. Based on this, we propose to directly penalize interactions during the attacking process, which significantly improves adversarial transferability.
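The abstract does not spell out how an interaction between two perturbation units is measured. As a rough illustration only, the sketch below assumes the pairwise Shapley interaction index from cooperative game theory, computed exactly on a toy value function. The names `shapley_interaction`, `additive_gain`, `coupled_gain`, and `WEIGHTS` are hypothetical, and the toy functions stand in for the change in a model's loss when a subset of perturbation units is active; this is not the authors' implementation.

```python
from itertools import combinations
from math import factorial


def shapley_interaction(v, n, i, j):
    """Exact pairwise Shapley interaction index between players i and j.

    v maps a frozenset of active player indices to a scalar payoff.
    Assumption: players stand for perturbation units and v for the loss
    gain they produce; the abstract itself does not define these.
    """
    others = [k for k in range(n) if k not in (i, j)]
    total = 0.0
    for size in range(len(others) + 1):
        # Shapley-style weight for a context of this size (weights sum to 1).
        weight = factorial(size) * factorial(n - size - 2) / factorial(n - 1)
        for subset in combinations(others, size):
            s = frozenset(subset)
            # Second-order difference: extra payoff from i and j acting together.
            delta = v(s | {i, j}) - v(s | {i}) - v(s | {j}) + v(s)
            total += weight * delta
    return total


# Hypothetical per-unit contributions for a 4-unit toy perturbation.
WEIGHTS = [0.5, 1.0, 2.0, 4.0]


def additive_gain(active):
    # Units contribute independently, so every pairwise interaction is zero.
    return sum(WEIGHTS[k] for k in active)


def coupled_gain(active):
    # Units 0 and 1 earn a joint bonus only when both are active.
    bonus = 3.0 if {0, 1} <= active else 0.0
    return additive_gain(active) + bonus


additive_result = shapley_interaction(additive_gain, 4, 0, 1)
coupled_result = shapley_interaction(coupled_gain, 4, 0, 1)
unrelated_result = shapley_interaction(coupled_gain, 4, 2, 3)
```

In this toy setting the additive function has no interaction between any pair, while the coupled function has a positive interaction only between units 0 and 1. Penalizing interactions, as the abstract proposes, would correspond to pushing a quantity of this kind down during the attack; an attack on a real network would presumably need a sampled estimate, since exact enumeration grows exponentially with the number of units.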
