CV LG · Jun 1

Sensitivity as a Double-Edged Sword: A Trade-off Between Discriminability and Adversarial Robustness

arXiv:2606.01746

AI Analysis

For practitioners of adversarial robustness, this work provides a lightweight plug-and-play module that improves robustness of existing adversarially trained models, though the improvement is incremental over current SOTA.

The paper identifies a trade-off between discriminability and adversarial robustness in neural networks, attributing vulnerability to the sensitivity of fully connected classifiers. It proposes a Hybrid Prototype Mixing (HPM) framework with an ℓ2-reclassifier that enhances robustness while maintaining discriminative power, achieving improved adversarial robustness on various SOTA models.
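One standard way to see the sensitivity gap described above is a Lipschitz-style bound in feature space. This is an illustrative sketch, not necessarily the paper's own analysis; the symbols $z$ (feature vector), $\delta$ (feature-space perturbation), $w_k, b_k$ (FC weights and bias for class $k$), and $p_k$ (class prototype) are notation assumed here.

$$
\begin{aligned}
\text{FC logit:}\quad & f_k(z) = w_k^\top z + b_k, &
\lvert f_k(z+\delta) - f_k(z) \rvert &= \lvert w_k^\top \delta \rvert \le \lVert w_k \rVert_2 \, \lVert \delta \rVert_2, \\
\ell_2 \text{ score:}\quad & d_k(z) = \lVert z - p_k \rVert_2, &
\lvert d_k(z+\delta) - d_k(z) \rvert &\le \lVert \delta \rVert_2 .
\end{aligned}
$$

The FC bound grows with the weight norm $\lVert w_k \rVert_2$, which training can make large to sharpen class separation, while the distance score changes by at most $\lVert \delta \rVert_2$ regardless of where the prototypes lie (reverse triangle inequality).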

Modern neural networks are highly susceptible to adversarial perturbations. In this work, we identify that part of this vulnerability stems from the sensitivity of the widely used fully connected (FC) classifiers to such perturbations. In contrast, simple $\ell_2$ distance-based classifiers exhibit significantly greater robustness. We provide thorough theoretical and empirical analysis showing that while FC classifiers' high sensitivity makes them discriminative, it also makes them vulnerable. Conversely, $\ell_2$-classifiers' insensitivity grants robustness but limits performance. Motivated by this trade-off, we propose a novel $\ell_2$-reclassifier based on a Hybrid Prototype Mixing (HPM) framework. This method retains the discriminative power of FC classifiers while leveraging the robustness of $\ell_2$ distance. It yields $\ell_2$-distance-based predictions by fusing two prototype types: (1) stable, dataset-level prototypes updated via EMA, and (2) dynamic, batch-level prototypes generated from the FC classifier's predictions using a Straight-Through Estimator (STE). However, this dynamic, STE-based architecture introduces significant challenges for evaluation, such as gradient obfuscation and forward discontinuity. To address these challenges, we propose a new, rigorous evaluation protocol, the Mixed Surrogate Attack (MSA), which uses multiple surrogates along with the powerful AutoAttack to ensure a fair and robust assessment. Extensive experiments demonstrate that our lightweight, plug-and-play module, with minimal fine-tuning, effectively enhances the adversarial robustness of various existing SOTA adversarially trained models.
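The prototype fusion described in the abstract can be sketched roughly as follows. This is a minimal, forward-only NumPy illustration under stated assumptions, not the authors' implementation: the function names, the mixing weight `mix`, the EMA `momentum`, and the fallback for classes absent from a batch are all hypothetical choices made for the example. The abstract does not specify how the two prototype types are fused, so a simple convex combination is assumed.

```python
import numpy as np

def l2_predict(features, prototypes):
    """Assign each feature (N, D) to its nearest prototype (K, D) in l2 distance."""
    diffs = features[:, None, :] - prototypes[None, :, :]   # (N, K, D)
    dists = np.linalg.norm(diffs, axis=-1)                   # (N, K)
    return dists.argmin(axis=1), dists

def ema_update(dataset_protos, features, labels, momentum=0.99):
    """EMA update of dataset-level prototypes from labeled features (assumed form)."""
    updated = dataset_protos.copy()
    for k in np.unique(labels):
        class_mean = features[labels == k].mean(axis=0)
        updated[k] = momentum * updated[k] + (1.0 - momentum) * class_mean
    return updated

def batch_prototypes(features, fc_logits, fallback):
    """Batch-level prototypes: mean feature per class predicted by the FC head.

    The hard argmax here is the forward pass only. In an autograd framework,
    a straight-through estimator would keep this hard assignment in the
    forward pass while routing gradients through the soft probabilities.
    """
    num_classes = fallback.shape[0]
    preds = fc_logits.argmax(axis=1)
    onehot = np.eye(num_classes)[preds]                      # (N, K)
    counts = onehot.sum(axis=0)                              # (K,)
    sums = onehot.T @ features                               # (K, D)
    protos = fallback.copy()          # assumption: unseen classes keep fallback
    seen = counts > 0
    protos[seen] = sums[seen] / counts[seen][:, None]
    return protos

def hybrid_predict(features, fc_logits, dataset_protos, mix=0.5):
    """Fuse stable and dynamic prototypes, then classify by l2 distance."""
    dynamic = batch_prototypes(features, fc_logits, fallback=dataset_protos)
    mixed = mix * dataset_protos + (1.0 - mix) * dynamic     # assumed convex mix
    return l2_predict(features, mixed)
```

A possible usage, with toy two-class features: call `ema_update` on clean training batches to maintain `dataset_protos`, then call `hybrid_predict(features, fc_logits, dataset_protos)` at inference to obtain distance-based labels. Because the dynamic prototypes depend on the hard FC predictions, the forward pass is discontinuous in the input, which is the kind of evaluation difficulty the abstract attributes to the STE-based design.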
