LG | CV | Nov 25, 2024

Scaling Laws for Black box Adversarial Attacks

Chuan Liu, Huanran Chen, Yichi Zhang, Yinpeng Dong, Jun Zhu

arXiv:2411.16782v3 | 14.29 citations | h-index: 41

Originality: Incremental advance

AI Analysis

This work addresses the threat that black-box adversarial attacks pose in commercial settings, showing incremental improvements as the number of surrogate models is scaled up.

The paper investigates whether increasing the number of surrogate models in ensemble-based adversarial attacks improves transferability to black-box models. It reports clear scaling laws: attack success rates rise as more surrogates are added, exceeding 90% on proprietary models such as GPT-4o.

Adversarial examples usually exhibit good cross-model transferability, enabling attacks on black-box models with limited information about their architectures and parameters, which are highly threatening in commercial black-box scenarios. Model ensembling is an effective strategy to improve the transferability of adversarial examples by attacking multiple surrogate models. However, since prior studies usually adopt few models in the ensemble, there remains an open question of whether scaling the number of models can further improve black-box attacks. Inspired by the scaling law of large foundation models, we investigate the scaling laws of black-box adversarial attacks in this work. Through theoretical analysis and empirical evaluations, we conclude with clear scaling laws that using more surrogate models enhances adversarial transferability. Comprehensive experiments verify the claims on standard image classifiers, diverse defended models and multimodal large language models using various adversarial attack methods. Specifically, by scaling law, we achieve 90%+ transfer attack success rate on even proprietary models like GPT-4o. Further visualization indicates that there is also a scaling law on the interpretability and semantics of adversarial perturbations.
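
As a rough illustration of what "attacking multiple surrogate models" can mean, the sketch below averages the input gradient of the cross-entropy loss over an ensemble and takes iterative sign-gradient steps inside an L-infinity budget. This is a generic loss-averaging ensemble attack, not the procedure from the paper: the toy linear softmax classifiers, the function names (`ensemble_attack`, `ensemble_loss`, `input_grad`), and the values of `eps`, `step`, and `iters` are all assumptions chosen for illustration.

```python
import math
import random

def softmax(z):
    m = max(z)  # subtract the max for numerical stability
    e = [math.exp(v - m) for v in z]
    s = sum(e)
    return [v / s for v in e]

def logits(W, b, x):
    return [sum(w * xi for w, xi in zip(row, x)) + bi for row, bi in zip(W, b)]

def loss(model, x, y):
    """Cross-entropy loss of one toy linear softmax classifier (hypothetical surrogate)."""
    W, b = model
    return -math.log(softmax(logits(W, b, x))[y])

def input_grad(model, x, y):
    """Gradient of the cross-entropy loss with respect to the input x."""
    W, b = model
    p = softmax(logits(W, b, x))
    p[y] -= 1.0  # dL/dlogits = softmax - onehot(y)
    return [sum(W[k][j] * p[k] for k in range(len(W))) for j in range(len(x))]

def ensemble_loss(models, x, y):
    """Average loss over all surrogates."""
    return sum(loss(m, x, y) for m in models) / len(models)

def ensemble_attack(models, x, y, eps=0.1, step=0.01, iters=20):
    """Iterative sign-gradient ascent on the averaged surrogate loss."""
    x_adv = list(x)
    for _ in range(iters):
        grads = [input_grad(m, x_adv, y) for m in models]
        avg = [sum(g[j] for g in grads) / len(models) for j in range(len(x))]
        for j, g in enumerate(avg):
            sign = (g > 0) - (g < 0)
            moved = x_adv[j] + step * sign
            # project back into the L-infinity ball of radius eps around x
            x_adv[j] = min(max(moved, x[j] - eps), x[j] + eps)
    return x_adv

# Toy demo with randomly initialised surrogates (illustrative data only).
rng = random.Random(0)
dim, classes, n_models = 8, 3, 5
models = [
    ([[rng.gauss(0, 1) for _ in range(dim)] for _ in range(classes)],
     [rng.gauss(0, 1) for _ in range(classes)])
    for _ in range(n_models)
]
x = [rng.uniform(0, 1) for _ in range(dim)]
y = 0
x_adv = ensemble_attack(models, x, y)
```

In this toy setting, `n_models` plays the role of the ensemble size that the abstract proposes to scale. The sketch only shows the mechanics of optimising against several surrogates at once; it makes no claim about transfer success on any real black-box model.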

