LG · Dec 30, 2024

RobustBlack: Challenging Black-Box Adversarial Attacks on State-of-the-Art Defenses

arXiv:2412.20987v1 · 1 citation · h-index: 20
Originality: Synthesis-oriented
AI Analysis

This work addresses a gap in adversarial robustness evaluation for researchers and practitioners, though it is incremental as it focuses on benchmarking existing attacks.

The paper tackles the problem of evaluating black-box adversarial attacks against robust models. It finds that advanced attacks struggle against simple adversarially trained models, and that robust models optimized for white-box attacks also resist black-box attacks.

Although adversarial robustness has been extensively studied in white-box settings, recent advances in black-box attacks (including transfer- and query-based approaches) are primarily benchmarked against weak defenses, leaving a significant gap in the evaluation of their effectiveness against more recent and moderately robust models (e.g., those featured in the RobustBench leaderboard). In this paper, we question this lack of attention from black-box attacks to robust models. We establish a framework to evaluate the effectiveness of recent black-box attacks against both top-performing and standard defense mechanisms on the ImageNet dataset. Our empirical evaluation reveals the following key findings: (1) the most advanced black-box attacks struggle to succeed even against simple adversarially trained models; (2) robust models that are optimized to withstand strong white-box attacks, such as AutoAttack, also exhibit enhanced resilience against black-box attacks; and (3) robustness alignment between the surrogate models and the target model is a key factor in the success rate of transfer-based attacks.

