CV · IV · Jan 23, 2025

Gradient-Free Adversarial Purification with Diffusion Models

Xuelong Dai, Dong Wang, Xiuzhen Cheng, Bin Xiao

arXiv:2501.13336v2 · 6.22 citations · h-index: 6

Originality: Incremental advance

AI Analysis

The paper addresses the need for robust and efficient adversarial defenses in machine learning, though it appears incremental because it builds on existing purification and diffusion-model techniques.

The paper tackles the problem of defending against both perturbation-based and unrestricted adversarial attacks by proposing a gradient-free adversarial purification framework that uses adversarial anti-aliasing and super-resolution, achieving efficient defense without retraining.

Adversarial training and adversarial purification are two widely used defense strategies for enhancing model robustness against adversarial attacks. However, adversarial training requires costly retraining, while adversarial purification often suffers from low efficiency. More critically, existing defenses are primarily designed under the perturbation-based adversarial threat model, which is ineffective against recently introduced unrestricted adversarial attacks. In this paper, we propose an effective and efficient defense framework that counters both perturbation-based and unrestricted adversarial attacks. Our approach is motivated by the observation that adversarial examples typically lie near the decision boundary and are highly sensitive to pixel-level perturbations. To address this, we introduce adversarial anti-aliasing, a preprocessing technique that mitigates adversarial noise by reducing the magnitude of pixel-level perturbations. In addition, we propose adversarial super-resolution, which leverages prior knowledge from clean datasets to benignly restore high-quality images from adversarially degraded ones. Unlike image synthesis methods that generate entirely new images, adversarial super-resolution focuses on image restoration, making it more suitable for purification. Importantly, both techniques require no additional training and are computationally efficient since they do not rely on gradient computations. To further improve robustness across diverse datasets, we introduce a contrastive learning-based adversarial deblurring fine-tuning method. By incorporating adversarial priors during fine-tuning on the target dataset, this method enhances purification effectiveness without the need to retrain diffusion models.
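To make the anti-aliasing idea in the abstract concrete, here is a minimal sketch of a gradient-free low-pass preprocessing step. It assumes a plain separable Gaussian blur as a stand-in; the abstract does not specify the filter the authors use, and the names `gaussian_kernel` and `anti_alias` as well as the `size` and `sigma` defaults are hypothetical choices made for illustration only.

```python
import numpy as np


def gaussian_kernel(size: int = 5, sigma: float = 1.0) -> np.ndarray:
    """Return a normalized 1-D Gaussian kernel (size is assumed to be odd)."""
    ax = np.arange(size) - (size - 1) / 2.0
    kernel = np.exp(-(ax ** 2) / (2.0 * sigma ** 2))
    return kernel / kernel.sum()


def anti_alias(image: np.ndarray, size: int = 5, sigma: float = 1.0) -> np.ndarray:
    """Blur an H x W x C float image in [0, 1] with a separable Gaussian.

    Illustrative stand-in for an anti-aliasing step, not the paper's filter.
    No gradients of any classifier are needed: this is pure filtering.
    """
    kernel = gaussian_kernel(size, sigma)
    pad = size // 2
    # Reflect-pad the two spatial axes so the output keeps the input shape.
    padded = np.pad(image, ((pad, pad), (pad, pad), (0, 0)), mode="reflect")
    # Filter along the height axis, then along the width axis.
    rows = np.apply_along_axis(
        lambda v: np.convolve(v, kernel, mode="valid"), 0, padded
    )
    out = np.apply_along_axis(
        lambda v: np.convolve(v, kernel, mode="valid"), 1, rows
    )
    return np.clip(out, 0.0, 1.0)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    eps = 8.0 / 255.0  # a common L-infinity budget, used here only as an example
    clean = np.full((32, 32, 3), 0.5)
    # Random sign noise as a crude proxy for a pixel-level perturbation.
    perturbed = clean + eps * rng.choice([-1.0, 1.0], size=clean.shape)
    smoothed = anti_alias(perturbed)
    print("mean |noise| before:", np.abs(perturbed - clean).mean())
    print("mean |noise| after: ", np.abs(smoothed - clean).mean())
```

Because the kernel sums to one, smooth image content passes through largely unchanged while high-frequency, pixel-level noise is averaged down. A filter like this also removes fine detail from the clean image, which is presumably why the abstract pairs anti-aliasing with a super-resolution step that restores image quality afterwards.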
