How to choose your best allies for a transferable attack?
This work addresses a key security issue in deep neural networks by improving the efficiency and effectiveness of adversarial attacks, though it is incremental in nature.
The paper tackles the problem of evaluating and selecting source models for transferable adversarial attacks. It shows that a transfer attack built on a randomly selected source model can perform worse than a black-box attack, and it proposes FiT, a selection mechanism that chooses the best source model using only a few queries to the target.
The transferability of adversarial examples is a key issue in the security of deep neural networks. The possibility that an adversarial example crafted for a source model also fools another, targeted model makes the threat of adversarial attacks more realistic. Measuring transferability is a crucial problem, but the Attack Success Rate alone does not provide a sound evaluation. This paper proposes a new methodology for evaluating transferability that puts distortion in a central position. This new tool shows that a transferable attack may perform far worse than a black-box attack if the attacker picks the source model at random. To address this issue, we propose a new selection mechanism, called FiT, which aims at choosing the best source model with only a few preliminary queries to the target. Our experimental results show that FiT is highly effective at selecting the best source model in multiple scenarios, such as single-model attacks, ensemble-model attacks, and multiple attacks. Code is available at: https://github.com/t-maho/transferability_measure_fit
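To make the idea of source-model selection from a few preliminary queries concrete, here is a minimal sketch. It is not the FiT score itself, which the abstract does not spell out. It assumes the attacker can read the target's output probabilities on a handful of query inputs, and it ranks candidate source models by how closely their outputs match the target's, using the mean KL divergence as an illustrative similarity measure. The function names `kl_divergence` and `select_source` and the toy probabilities are hypothetical.

```python
import math
from typing import Dict, List, Sequence, Tuple


def kl_divergence(p: Sequence[float], q: Sequence[float], eps: float = 1e-12) -> float:
    """KL(p || q) between two discrete distributions; eps avoids log(0)."""
    return sum(pi * math.log((pi + eps) / (qi + eps)) for pi, qi in zip(p, q))


def select_source(
    target_probs: List[Sequence[float]],
    candidate_probs: Dict[str, List[Sequence[float]]],
) -> Tuple[str, Dict[str, float]]:
    """Pick the candidate whose outputs are closest to the target's.

    target_probs: target model outputs on the few query inputs.
    candidate_probs: for each candidate source model, its outputs on the
        same inputs, in the same order.
    Returns the best candidate's name and the per-candidate mean divergence
    (lower means closer to the target). Assumption: closeness of outputs is
    used here as a stand-in for transferability.
    """
    scores = {}
    for name, probs in candidate_probs.items():
        total = sum(kl_divergence(t, s) for t, s in zip(target_probs, probs))
        scores[name] = total / len(target_probs)
    best = min(scores, key=scores.get)
    return best, scores


# Toy example: two query inputs, two classes, two candidate source models.
target = [[0.9, 0.1], [0.2, 0.8]]
candidates = {
    "source_a": [[0.85, 0.15], [0.25, 0.75]],  # close to the target
    "source_b": [[0.5, 0.5], [0.5, 0.5]],      # uninformative
}
best, scores = select_source(target, candidates)
print(best, scores)
```

The number of queries to the target equals the number of inputs in `target_probs`, which matches the "few preliminary queries" budget described above. Any other similarity measure could be substituted for the KL divergence without changing the overall structure.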