Rethinking Classifier and Adversarial Attack
This work addresses the need for more accurate evaluation of adversarial robustness in machine learning security, though it appears incremental because it builds on existing attack methods.
The paper tackles the problem of overestimated adversarial robustness in defense models by proposing the Absolute Classification Boundaries Initialization (ACBI) method. ACBI achieves lower robust accuracy across nearly 50 defense models, which indicates a more effective attack.
Various defense models have been proposed to resist adversarial attacks, but existing methods for evaluating adversarial robustness always overestimate the robustness of these models (i.e., the reported robust accuracy does not approach the lower bound of robustness). To solve this problem, this paper proposes the decouple space method, which divides the classifier into two parts: non-linear and linear. It then defines the representation vector of the original example (and the space it lies in, i.e., the representation space) and uses the iterative optimization of Absolute Classification Boundaries Initialization (ACBI) to obtain a better starting point for the attack. This paper applies ACBI to nearly 50 widely used defense models (covering 8 architectures). Experimental results show that ACBI achieves lower robust accuracy in all cases.
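To make the decoupling concrete, the following is a minimal sketch and not the paper's implementation. It assumes the non-linear part is a feature extractor whose output is the representation vector z, and the linear part is a final layer computing logits W z + b. Under that assumption, the boundary between classes i and j in the representation space is the hyperplane (w_i - w_j) · z + (b_i - b_j) = 0, so the signed distance from z to it has a closed form. The toy weights and the names `nonlinear_part`, `linear_part`, and `boundary_distance` are hypothetical, and the sketch does not reproduce ACBI's iterative optimization, which is not specified here.

```python
import math

# Hypothetical toy weights, chosen only for illustration.
FEATURE_WEIGHTS = [[0.5, -0.2], [0.1, 0.8], [-0.3, 0.4]]  # 2-dim input -> 3-dim z
W = [[1.0, 0.0, -1.0], [0.0, 1.0, 1.0]]                   # 2 classes, 3-dim z
b = [0.0, 0.1]


def nonlinear_part(x):
    """Stand-in feature extractor: maps an input to its representation vector z."""
    return [math.tanh(sum(w * xi for w, xi in zip(row, x))) for row in FEATURE_WEIGHTS]


def linear_part(z):
    """Final linear layer: maps a representation vector to class logits."""
    return [sum(w * zi for w, zi in zip(row, z)) + bk for row, bk in zip(W, b)]


def classify(x):
    """The full classifier is the composition of the two parts."""
    return linear_part(nonlinear_part(x))


def boundary_distance(z, i, j):
    """Signed distance from z to the class-i / class-j boundary in representation space.

    Positive when the logit of class i exceeds that of class j.
    """
    diff = [wi - wj for wi, wj in zip(W[i], W[j])]
    margin = sum(d * zi for d, zi in zip(diff, z)) + (b[i] - b[j])
    norm = math.sqrt(sum(d * d for d in diff))
    return margin / norm


x = [1.0, 2.0]
z = nonlinear_part(x)
print("representation vector:", z)
print("logits:", classify(x))
print("signed distance to the 0/1 boundary:", boundary_distance(z, 0, 1))
```

Because the boundaries are exact hyperplanes in the representation space, one can reason about how far a representation must move to change the predicted class, which is the kind of information a better attack starting point could plausibly draw on.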