Boosting Transferability of Targeted Adversarial Examples via Hierarchical Generative Networks
This work addresses targeted adversarial attacks for evaluating model robustness in black-box scenarios and is positioned as an incremental improvement over existing methods.
The paper tackles the challenge of efficiently producing targeted transfer-based adversarial examples in black-box settings. Using only one substitute white-box model, it achieves an average success rate of 29.1% against six diverse models, significantly outperforming state-of-the-art gradient-based methods while being more than an order of magnitude more efficient.
Transfer-based adversarial attacks can evaluate model robustness in the black-box setting. Several methods have demonstrated impressive untargeted transferability; however, efficiently producing targeted transferability remains challenging. To this end, we develop a simple yet effective framework for crafting targeted transfer-based adversarial examples using a hierarchical generative network. In particular, we contribute amortized designs that adapt well to multi-class targeted attacks. Extensive experiments on ImageNet show that our method improves the success rates of targeted black-box attacks by a significant margin over existing methods: using only one substitute white-box model, it reaches an average success rate of 29.1% against six diverse models, significantly outperforming state-of-the-art gradient-based attack methods. Moreover, the proposed method is more than an order of magnitude more efficient than gradient-based methods.
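To make the reported metric concrete, the following minimal sketch shows how an average targeted success rate over several black-box models could be computed. A targeted attack counts as successful only when the black-box model predicts the chosen target class. The function names, model names, and toy predictions are hypothetical and for illustration only; they are not taken from the paper's evaluation code or results.

```python
from typing import Dict, Sequence


def targeted_success_rate(predictions: Sequence[int], targets: Sequence[int]) -> float:
    """Fraction of adversarial examples that a model classifies as the target class."""
    if len(predictions) != len(targets):
        raise ValueError("predictions and targets must have the same length")
    if len(targets) == 0:
        raise ValueError("at least one adversarial example is required")
    hits = sum(1 for p, t in zip(predictions, targets) if p == t)
    return hits / len(targets)


def average_success_rate(
    per_model_predictions: Dict[str, Sequence[int]], targets: Sequence[int]
) -> float:
    """Mean of the per-model targeted success rates (each model weighted equally)."""
    if not per_model_predictions:
        raise ValueError("at least one black-box model is required")
    rates = [
        targeted_success_rate(preds, targets)
        for preds in per_model_predictions.values()
    ]
    return sum(rates) / len(rates)


# Toy data (hypothetical): target labels for four adversarial examples and the
# labels predicted by two made-up black-box models.
targets = [3, 3, 7, 7]
black_box_predictions = {
    "model_a": [3, 1, 7, 0],  # hits the target on examples 0 and 2
    "model_b": [3, 3, 2, 5],  # hits the target on examples 0 and 1
}

avg = average_success_rate(black_box_predictions, targets)
print(f"average targeted success rate: {avg:.1%}")
```

Under this definition, an untargeted misclassification (the model predicts any wrong label other than the target) does not count as a success, which is one reason targeted transfer rates tend to be much lower than untargeted ones.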