Adaptive Diffusion Denoised Smoothing: Certified Robustness via Randomized Smoothing with Differentially Private Guided Denoising Diffusion
This work addresses certified adversarial robustness for vision models, offering an incremental improvement over existing randomized smoothing methods.
The paper tackles the problem of certifying vision models against adversarial examples. It proposes Adaptive Diffusion Denoised Smoothing, which treats a guided denoising diffusion model as a sequence of adaptive Gaussian Differentially Private mechanisms that refine noise into an image, and reports improved certified and standard accuracy on ImageNet for an ℓ₂ threat model.
We propose Adaptive Diffusion Denoised Smoothing, a method for certifying the predictions of a vision model against adversarial examples, while adapting to the input. Our key insight is to reinterpret a guided denoising diffusion model as a long sequence of adaptive Gaussian Differentially Private (GDP) mechanisms refining a pure noise sample into an image. We show that these adaptive mechanisms can be composed through a GDP privacy filter to analyze the end-to-end robustness of the guided denoising process, yielding a provable certification that extends the adaptive randomized smoothing analysis. We demonstrate that our design, under a specific guiding strategy, can improve both certified accuracy and standard accuracy on ImageNet for an $\ell_2$ threat model.
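To make the composition step concrete, the following is a minimal sketch of how a GDP privacy filter can track a budget across adaptively chosen steps. It relies only on the standard GDP composition rule, under which mechanisms that are $\mu_1, \dots, \mu_k$-GDP compose to $\sqrt{\mu_1^2 + \dots + \mu_k^2}$-GDP. The names `compose_gdp` and `GDPFilter` are hypothetical and chosen for illustration; this is not the paper's implementation, and how each step's $\mu_i$ is derived from the guided denoising process is not specified here.

```python
import math


def compose_gdp(mus):
    """Compose GDP parameters: the result is sqrt(sum of mu_i^2)."""
    return math.sqrt(sum(mu * mu for mu in mus))


class GDPFilter:
    """Illustrative privacy filter (hypothetical helper, not from the paper).

    It admits adaptively chosen mu_i-GDP steps as long as the composed
    parameter stays within a fixed total budget.
    """

    def __init__(self, budget_mu, tol=1e-12):
        self.budget_sq = budget_mu ** 2  # budget tracked in squared form
        self.spent_sq = 0.0
        self.tol = tol  # small slack for floating-point rounding

    def try_spend(self, mu):
        """Admit a step with parameter mu if it fits; return True if admitted."""
        new_total = self.spent_sq + mu * mu
        if new_total > self.budget_sq + self.tol:
            return False  # the step would exceed the budget, so reject it
        self.spent_sq = new_total
        return True

    def spent_mu(self):
        """Composed GDP parameter of all steps admitted so far."""
        return math.sqrt(self.spent_sq)


# Usage: each step may pick its mu after seeing earlier outputs;
# the filter only checks the running sum of squares.
budget = GDPFilter(budget_mu=1.0)
for step_mu in [0.6, 0.8, 0.1]:
    admitted = budget.try_spend(step_mu)
    print(f"mu={step_mu}: admitted={admitted}, spent={budget.spent_mu():.3f}")
```

Tracking the squared parameters keeps each update a single addition, and the per-step values need not be fixed in advance, which is the property that makes a filter suitable for adaptive mechanisms.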