Denoising Diffusion Probabilistic Models as a Defense against Adversarial Attacks
The work addresses a practical security concern for users of neural networks, namely adversarial attacks, but the contribution is incremental: it applies an existing method (DDPM) in a new defense context.
The paper tackles the vulnerability of neural networks to adversarial attacks by using Denoising Diffusion Probabilistic Models (DDPM) as a purification technique, improving robust accuracy by up to 88% of the original model's accuracy on the PatchCamelyon dataset.
Neural networks are infamously sensitive to small perturbations in their inputs, which makes them vulnerable to adversarial attacks. This project evaluates Denoising Diffusion Probabilistic Models (DDPM) as a purification technique to defend against such attacks. The defense adds noise to an adversarial example and then removes it through the reverse process of the diffusion model. We evaluate the approach on the PatchCamelyon dataset of histopathologic scans of lymph node sections and find an improvement in robust accuracy of up to 88\% of the original model's accuracy, a considerable improvement over the vanilla model and our baselines. The project code is located at https://github.com/ankile/Adversarial-Diffusion.
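The purification step described above (noise the adversarial input, then run the reverse process) can be illustrated with a minimal sketch. This is not the code from the project repository: the names `make_schedule`, `purify`, and `oracle_eps`, the linear variance schedule, and the choice `t_star=100` are all assumptions made for illustration. In place of a trained noise-prediction network, the sketch uses a toy oracle that is exact only when the data distribution is a single known clean image.

```python
import math
import random


def make_schedule(num_steps=1000, beta_start=1e-4, beta_end=0.02):
    """Linear variance schedule (assumed); returns (betas, alphas, alpha_bars)."""
    betas = [beta_start + (beta_end - beta_start) * i / (num_steps - 1)
             for i in range(num_steps)]
    alphas = [1.0 - b for b in betas]
    alpha_bars, running = [], 1.0
    for a in alphas:
        running *= a  # cumulative product of the alphas
        alpha_bars.append(running)
    return betas, alphas, alpha_bars


def purify(x_adv, eps_model, t_star, schedule, rng):
    """Diffuse x_adv forward to step t_star, then denoise back to step 0."""
    betas, alphas, alpha_bars = schedule
    # Forward process in closed form: x_t = sqrt(a_bar) * x_0 + sqrt(1 - a_bar) * eps
    a_bar = alpha_bars[t_star]
    x = [math.sqrt(a_bar) * v + math.sqrt(1.0 - a_bar) * rng.gauss(0.0, 1.0)
         for v in x_adv]
    # Reverse process: one ancestral sampling step per timestep (0-indexed).
    for t in range(t_star, -1, -1):
        eps_hat = eps_model(x, t)
        coef = betas[t] / math.sqrt(1.0 - alpha_bars[t])
        sigma = math.sqrt(betas[t]) if t > 0 else 0.0  # no noise on the last step
        x = [(v - coef * e) / math.sqrt(alphas[t]) + sigma * rng.gauss(0.0, 1.0)
             for v, e in zip(x, eps_hat)]
    return x


# Toy setup (hypothetical): a flat "image" of 16 pixels with a known clean value.
schedule = make_schedule()
alpha_bars = schedule[2]
x_clean = [0.5] * 16


def oracle_eps(x_t, t):
    """Stand-in for a trained network: exact noise if the data is always x_clean."""
    a_bar = alpha_bars[t]
    return [(v - math.sqrt(a_bar) * c) / math.sqrt(1.0 - a_bar)
            for v, c in zip(x_t, x_clean)]


rng = random.Random(0)
# Sign-style perturbation of magnitude 0.05 standing in for an adversarial attack.
x_adv = [c + rng.choice([-0.05, 0.05]) for c in x_clean]
x_pur = purify(x_adv, oracle_eps, t_star=100, schedule=schedule, rng=rng)

err_adv = max(abs(a - c) for a, c in zip(x_adv, x_clean))
err_pur = max(abs(p - c) for p, c in zip(x_pur, x_clean))
print(f"max error before purification: {err_adv:.4f}, after: {err_pur:.2e}")
```

In a real setting, `eps_model` would be a DDPM trained on the target data, and `t_star` would trade off how much of the perturbation is drowned out against how much of the image content survives the added noise. The purified image would then be passed to the unmodified classifier.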