CV, CR, LG · Mar 16, 2023

Robust Evaluation of Diffusion-Based Adversarial Purification

arXiv:2303.09051v3 · 97 citations · h-index: 12
Originality: Incremental advance
AI Analysis

This work addresses how the robustness of adversarial purification is evaluated in machine learning security, and offers an incremental improvement.

The authors identify a flaw in the evaluation of diffusion-based adversarial purification methods: the attacks commonly used are not the most effective ones against these defenses. They also propose a new purification strategy that improves robustness.

We question the current evaluation practice for diffusion-based purification methods. Diffusion-based purification methods aim to remove adversarial effects from an input data point at test time. The approach has gained increasing attention as an alternative to adversarial training because it decouples training from testing. Well-known white-box attacks are often employed to measure the robustness of the purification. However, it is unknown whether these attacks are the most effective against diffusion-based purification, since they are often tailored to adversarial training. We analyze the current practices and provide a new guideline for measuring the robustness of purification methods against adversarial attacks. Based on our analysis, we further propose a new purification strategy that improves robustness compared to current diffusion-based purification methods.
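The general shape of test-time purification described in the abstract can be illustrated with a toy sketch. This is not the authors' method or code: a closed-form Gaussian denoiser stands in for a trained diffusion model, and every name here (`forward_diffuse`, `toy_denoise`, `purify`, `prior_mean`, `prior_var`) is hypothetical. The only standard ingredient is the DDPM-style forward step, which scales the input by sqrt(alpha_bar) and adds Gaussian noise of variance 1 - alpha_bar.

```python
import math
import random


def forward_diffuse(x, alpha_bar, rng):
    """Noise the input: x_t = sqrt(alpha_bar) * x + sqrt(1 - alpha_bar) * eps."""
    scale = math.sqrt(alpha_bar)
    sigma = math.sqrt(1.0 - alpha_bar)
    return [scale * v + sigma * rng.gauss(0.0, 1.0) for v in x]


def toy_denoise(x_t, alpha_bar, prior_mean, prior_var):
    """Posterior mean E[x_0 | x_t], ASSUMING clean data ~ N(prior_mean, prior_var).

    A real purifier would run a trained diffusion model's reverse process here;
    the Gaussian prior is only a stand-in so the sketch runs without a model.
    """
    scale = math.sqrt(alpha_bar)
    noise_var = 1.0 - alpha_bar
    gain = scale * prior_var / (alpha_bar * prior_var + noise_var)
    return [m + gain * (v - scale * m) for v, m in zip(x_t, prior_mean)]


def purify(x, alpha_bar, prior_mean, prior_var, rng):
    """Purification as noise-then-denoise, applied to the input at test time."""
    x_t = forward_diffuse(x, alpha_bar, rng)
    return toy_denoise(x_t, alpha_bar, prior_mean, prior_var)


if __name__ == "__main__":
    rng = random.Random(0)
    prior_mean = [0.0, 0.0, 0.0]
    x_adv = [0.4, -0.2, 0.9]  # hypothetical (possibly perturbed) input
    for alpha_bar in (1.0, 0.5, 0.0):
        print(alpha_bar, purify(x_adv, alpha_bar, prior_mean, 1.0, rng))
```

In this sketch `alpha_bar` controls the trade-off: at 1.0 no noise is added and the input passes through unchanged, while at 0.0 the input is fully destroyed and the output collapses to the prior mean. The classifier receives the purified output rather than the raw input, which is why an attack designed for an undefended or adversarially trained classifier may not be the strongest one against the combined purifier and classifier.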

Code Implementations: 1 repo