LG CR CV | Jan 31, 2023

Salient Conditional Diffusion for Defending Against Backdoor Attacks

Brandon B. May, N. Joseph Tatro, Dylan Walker, Piyush Kumar, Nathan Shnidman

IBM

arXiv:2301.13862v2 | 10.711 citations | h-index: 10

Originality: Incremental advance

AI Analysis

This paper addresses backdoor attacks, a security concern for machine learning practitioners. It offers a novel defense method, though the advance is incremental: it adapts diffusion models to this specific task.

The paper defends against backdoor attacks in machine learning by proposing Salient Conditional Diffusion (Sancdifi). Sancdifi uses a denoising diffusion probabilistic model conditioned on saliency map-based masks to diffuse out triggers in poisoned data while recovering salient features in clean data. It achieves state-of-the-art performance as a black-box defense.

We propose a novel algorithm, Salient Conditional Diffusion (Sancdifi), a state-of-the-art defense against backdoor attacks. Sancdifi uses a denoising diffusion probabilistic model (DDPM) to degrade an image with noise and then recover said image using the learned reverse diffusion. Critically, we compute saliency map-based masks to condition our diffusion, allowing for stronger diffusion on the most salient pixels by the DDPM. As a result, Sancdifi is highly effective at diffusing out triggers in data poisoned by backdoor attacks. At the same time, it reliably recovers salient features when applied to clean data. This performance is achieved without requiring access to the model parameters of the Trojan network, meaning Sancdifi operates as a black-box defense.
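
The noising half of this pipeline can be illustrated with a minimal sketch. It is not the authors' implementation. It uses the standard closed-form DDPM forward process, x_t = sqrt(alpha_bar_t) * x_0 + sqrt(1 - alpha_bar_t) * eps. Everything else is an assumption made for illustration: the function names, the linear variance schedule, the top-fraction thresholding rule for the mask, and the choice to blend noised and original pixels with a binary mask. The abstract does not specify how the mask conditions the diffusion. The reverse step needs a trained DDPM and is omitted.

```python
import numpy as np

def linear_beta_schedule(num_steps, beta_start=1e-4, beta_end=0.02):
    """Linear variance schedule (an assumed, commonly used default)."""
    return np.linspace(beta_start, beta_end, num_steps)

def saliency_mask(saliency, keep_fraction=0.2):
    """Binary mask over the most salient pixels.

    Hypothetical rule: mark roughly the top `keep_fraction` of pixels.
    """
    threshold = np.quantile(saliency, 1.0 - keep_fraction)
    return (saliency >= threshold).astype(np.float64)

def masked_forward_diffusion(x0, mask, t, betas, rng):
    """Noise x0 to step t with the closed-form DDPM forward process,
    applying the noise only where mask == 1 (assumed conditioning)."""
    alpha_bar = np.cumprod(1.0 - betas)[t]
    noise = rng.standard_normal(x0.shape)
    x_t = np.sqrt(alpha_bar) * x0 + np.sqrt(1.0 - alpha_bar) * noise
    # Salient pixels take the noised value; the rest stay unchanged.
    return mask * x_t + (1.0 - mask) * x0

rng = np.random.default_rng(0)
image = rng.random((8, 8))      # stand-in for an input image
saliency = rng.random((8, 8))   # stand-in for a real saliency map
mask = saliency_mask(saliency)
betas = linear_beta_schedule(1000)
noised = masked_forward_diffusion(image, mask, t=500, betas=betas, rng=rng)
```

In a full defense along these lines, a trained DDPM would then run the learned reverse diffusion from `noised` to reconstruct the image. The intent described in the abstract is that triggers in the salient region are diffused out while clean salient features are recovered.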
