CV | Aug 5, 2023

An Adaptive Model Ensemble Adversarial Attack for Boosting Adversarial Transferability

arXiv:2308.02897v1 | 99 citations | h-index: 31
Originality: Incremental advance
AI Analysis

This work addresses a key limitation in adversarial machine learning for security applications, offering an incremental enhancement to model ensemble attacks.

The paper tackles the low transferability of adversarial examples across models with different architectures, such as from CNNs to ViTs. It proposes an adaptive ensemble attack, AdaEA, that adjusts the fusion of surrogate model outputs based on each model's contribution, and reports considerable improvement over existing ensemble attacks on various datasets.

Because the transferability of adversarial examples lets an adversary mount black-box attacks (i.e., attacks in which the attacker has no knowledge of the target model), transfer-based adversarial attacks have gained great attention. Previous works mostly study gradient variation or image transformations to amplify the distortion on critical parts of the input. These methods can transfer across models with limited differences, e.g., from CNNs to CNNs, but always fail when transferring across models with wide differences, such as from CNNs to ViTs. Alternatively, model ensemble adversarial attacks fuse the outputs of surrogate models with diverse architectures into an ensemble loss, which makes the generated adversarial example more likely to transfer to other models because it fools multiple models concurrently. However, existing ensemble attacks simply fuse the outputs of the surrogate models evenly, and are therefore ineffective at capturing and amplifying the intrinsic transfer information of adversarial examples. In this paper, we propose an adaptive ensemble attack, dubbed AdaEA, that adaptively controls the fusion of the outputs from each model by monitoring the discrepancy ratio of their contributions towards the adversarial objective. Furthermore, an extra disparity-reduced filter is introduced to further synchronize the update direction. As a result, we achieve considerable improvement over existing ensemble attacks on various datasets, and the proposed AdaEA can also boost existing transfer-based attacks, which further demonstrates its efficacy and versatility.
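
To make the contrast between even and adaptive fusion concrete, here is a minimal pure-Python sketch. It is not the authors' implementation: the abstract does not give the formulas, so the softmax weighting over per-model contribution scores, the sign-agreement filter standing in for the disparity-reduced filter, and all names (`even_fusion`, `adaptive_fusion`, `agreement_filter`, `temperature`, `min_agreement`) are assumptions chosen for illustration. Gradients are plain lists of floats, one list per surrogate model.

```python
import math


def softmax(scores, temperature=1.0):
    """Turn per-model scores into fusion weights that sum to 1."""
    scaled = [s / temperature for s in scores]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def even_fusion(grads):
    """Baseline: average the surrogate gradients with equal weight."""
    k = len(grads)
    return [sum(g[i] for g in grads) / k for i in range(len(grads[0]))]


def adaptive_fusion(grads, contributions, temperature=1.0):
    """Weight each surrogate gradient by its contribution score.

    `contributions` is a hypothetical per-model score (higher means the
    model's gradient helps the adversarial objective more). How the paper
    computes its discrepancy ratio is not stated in the abstract.
    """
    weights = softmax(contributions, temperature)
    fused = [
        sum(w * g[i] for w, g in zip(weights, grads))
        for i in range(len(grads[0]))
    ]
    return fused, weights


def _sign(x):
    return (x > 0) - (x < 0)


def agreement_filter(grads, fused, min_agreement=1.0):
    """Simplified stand-in for a disparity-reducing filter.

    Zeroes every coordinate where fewer than `min_agreement` of the
    surrogates share the sign of the fused gradient, so the update only
    moves in directions the surrogates agree on.
    """
    filtered = []
    for i, f in enumerate(fused):
        agree = sum(1 for g in grads if _sign(g[i]) == _sign(f)) / len(grads)
        filtered.append(f if agree >= min_agreement else 0.0)
    return filtered
```

With two surrogates whose gradients are `[1.0, -1.0]` and `[1.0, 1.0]`, even fusion cancels the second coordinate, adaptive fusion tilts it towards the model with the higher contribution score, and the agreement filter then suppresses that coordinate because the surrogates disagree on its sign.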

Code Implementations: 1 repo
