CV | Apr 15, 2025

Defending Against Frequency-Based Attacks with Diffusion Models

arXiv:2504.11034v1 | 6.21 citations | h-index: 2 | 2025 IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops (CVPRW)

Originality: Synthesis-oriented

AI Analysis

This work addresses the generalization challenge in adversarial defense for machine learning models by offering a method that handles unseen attack types. It appears incremental, however, since it extends existing purification techniques to new attack domains rather than introducing a new defense.

The study tackles adversarial attacks through adversarial purification with diffusion models. It broadens the scope beyond pixel-wise robustness to spectral and spatial attacks, and finds purification effective against diverse distortion patterns across low- to high-frequency regions.

Adversarial training is a common strategy for enhancing model robustness against adversarial attacks. However, it is typically tailored to the specific attack types it is trained on, limiting its ability to generalize to unseen threat models. Adversarial purification offers an alternative by leveraging a generative model to remove perturbations before classification. Since the purifier is trained independently of both the classifier and the threat models, it is better equipped to handle previously unseen attack scenarios. Diffusion models have proven highly effective for noise purification, not only in countering pixel-wise adversarial perturbations but also in addressing non-adversarial data shifts. In this study, we broaden the focus beyond pixel-wise robustness to explore the extent to which purification can mitigate both spectral and spatial adversarial attacks. Our findings highlight its effectiveness in handling diverse distortion patterns across low- to high-frequency regions.
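The abstract does not spell out the attack or purification procedure, so the following is only an illustrative sketch under stated assumptions, not the authors' implementation. It builds a perturbation confined to a chosen radial frequency band (one simple way to think about a "frequency-based" distortion) and then applies the noise-then-denoise pattern commonly used for diffusion-based purification. The function names, the radial band definition, the `alpha_bar_t` value, and the `denoiser` callable are all hypothetical; in practice the denoiser would be the reverse process of a pretrained diffusion model.

```python
import numpy as np

def radial_frequency_mask(size, r_min, r_max):
    """Boolean mask over a centered 2D spectrum, keeping radii in [r_min, r_max)."""
    # Integer frequency indices, shifted so that zero frequency is at the center.
    freqs = np.fft.fftshift(np.fft.fftfreq(size)) * size
    fy, fx = np.meshgrid(freqs, freqs, indexing="ij")
    radius = np.sqrt(fx ** 2 + fy ** 2)
    return (radius >= r_min) & (radius < r_max)

def band_limited_perturbation(size, r_min, r_max, eps, rng):
    """Random perturbation restricted to one frequency band, with L-inf norm eps.

    Assumption: a random band-limited signal stands in for an optimized
    adversarial perturbation, which would require a classifier and gradients.
    """
    noise = rng.standard_normal((size, size))
    spectrum = np.fft.fftshift(np.fft.fft2(noise))
    spectrum = spectrum * radial_frequency_mask(size, r_min, r_max)
    delta = np.real(np.fft.ifft2(np.fft.ifftshift(spectrum)))
    return eps * delta / np.max(np.abs(delta))

def forward_diffuse(x, alpha_bar_t, rng):
    """Forward diffusion to step t: x_t = sqrt(a) * x + sqrt(1 - a) * noise."""
    noise = rng.standard_normal(x.shape)
    return np.sqrt(alpha_bar_t) * x + np.sqrt(1.0 - alpha_bar_t) * noise

def purify(x_adv, denoiser, alpha_bar_t, rng):
    """Noise the input, then hand it to a (hypothetical) reverse-process denoiser."""
    return denoiser(forward_diffuse(x_adv, alpha_bar_t, rng))

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    clean = rng.uniform(0.0, 1.0, size=(32, 32))      # stand-in for an image
    delta = band_limited_perturbation(32, 8, 16, 8 / 255, rng)
    # Identity denoiser is a placeholder only; it performs no actual purification.
    purified = purify(clean + delta, lambda z: z, alpha_bar_t=0.9, rng=rng)
    print(purified.shape, float(np.max(np.abs(delta))))
```

The intuition this sketch is meant to convey is that the forward step adds Gaussian noise across all frequencies, so a perturbation concentrated in any single band is partly drowned out before the denoiser reconstructs the image. Whether that holds equally well for low- and high-frequency bands is the empirical question the paper investigates.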
