CR | AI | CV | Dec 18, 2023

DataElixir: Purifying Poisoned Dataset to Mitigate Backdoor Attacks via Diffusion Models

arXiv:2312.11057v2 | 18 citations | h-index: 11 | AAAI
Originality: Highly original
AI Analysis

This paper addresses the security threat of backdoor attacks in machine learning, offering practitioners a novel defense against evolving triggers.

The paper tackles the problem of dataset sanitization against poisoning-based backdoor attacks by proposing DataElixir, which uses diffusion models to purify poisoned samples into benign ones, effectively mitigating 9 popular attacks with minimal impact on benign accuracy.

Dataset sanitization is a widely adopted proactive defense against poisoning-based backdoor attacks, aimed at filtering out and removing poisoned samples from training datasets. However, existing methods have shown limited efficacy in countering ever-evolving trigger functions, and they often lead to considerable degradation of benign accuracy. In this paper, we propose DataElixir, a novel sanitization approach tailored to purify poisoned datasets. We leverage diffusion models to eliminate trigger features and restore benign features, thereby turning the poisoned samples into benign ones. Specifically, with multiple iterations of the forward and reverse processes, we extract intermediary images and their predicted labels for each sample in the original dataset. Then, we identify anomalous samples based on the presence of label transitions among the intermediary images, detect the target label by quantifying distribution discrepancy, select their purified images considering pixel and feature distance, and determine their ground-truth labels by training a benign model. Experiments conducted on 9 popular attacks demonstrate that DataElixir effectively mitigates various complex attacks while exerting minimal impact on benign accuracy, surpassing the performance of baseline defense methods.
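
The label-transition step described in the abstract can be illustrated with a small sketch. This is not the authors' implementation: it assumes the predicted labels of each sample's intermediary images have already been collected into a list (first entry being the prediction for the original image), and it substitutes a simple frequency difference for the paper's distribution-discrepancy measure, whose exact form the abstract does not give. The function names `has_label_transition`, `flag_anomalous`, and `guess_target_label` are hypothetical.

```python
from collections import Counter


def has_label_transition(labels):
    """Return True if any later prediction differs from the first one.

    `labels` is the sequence of predicted labels for one sample: the
    original image first, then its intermediary (partially purified) images.
    """
    return any(lab != labels[0] for lab in labels[1:])


def flag_anomalous(label_sequences):
    """Return indices of samples whose predicted label changes at some step."""
    return [i for i, seq in enumerate(label_sequences)
            if has_label_transition(seq)]


def guess_target_label(label_sequences, flagged):
    """Guess the attack's target label (assumed heuristic, not the paper's metric).

    Picks the initial label that is most over-represented among flagged
    samples relative to its share of the whole dataset.
    """
    if not flagged:
        return None
    overall = Counter(seq[0] for seq in label_sequences)
    among_flagged = Counter(label_sequences[i][0] for i in flagged)
    n, m = len(label_sequences), len(flagged)
    return max(overall, key=lambda c: among_flagged[c] / m - overall[c] / n)


if __name__ == "__main__":
    # Toy data: samples 1 and 2 start as class 7 and drift to other classes.
    sequences = [[3, 3, 3], [7, 7, 2], [7, 1, 1], [5, 5, 5], [7, 7, 7], [0, 0, 0]]
    flagged = flag_anomalous(sequences)
    print(flagged, guess_target_label(sequences, flagged))
```

In this toy run, the two samples whose label drifts away from class 7 are flagged, and class 7 is reported as the likely target label. Selecting purified images and relabeling them, the remaining steps in the abstract, are outside the scope of this sketch.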

Code Implementations: 1 repo