CR, AI · Jul 27, 2024

Clean-Label Physical Backdoor Attacks with Data Distillation

arXiv:2407.19203v4 · 4 citations · h-index: 3 · Has Code
Originality: Highly original
AI Analysis

The paper addresses the problem of making real-world backdoor attacks stealthy, presenting a new paradigm rather than an incremental improvement.

The paper tackles the stealthiness limitation of physical backdoor attacks by introducing Clean-Label Physical Backdoor Attack (CLPBA), which avoids both label manipulation and trigger injection during training and instead adds imperceptible perturbations to target class samples. In hard physical-world scenarios, the method achieves attack success rates that surpass dirty-label baselines, as shown in experiments on facial recognition and animal classification datasets.

Deep Neural Networks (DNNs) are shown to be vulnerable to backdoor poisoning attacks, with most research focusing on digital triggers -- artificial patterns added to test-time inputs to induce targeted misclassification. Physical triggers, which are natural objects embedded in real-world scenes, offer a promising alternative for attackers, as they can activate backdoors in real time without digital manipulation. However, existing physical backdoor attacks are dirty-label, meaning that attackers must change the labels of poisoned inputs to the target label. The inconsistency between image content and label exposes the attack to human inspection, reducing its stealthiness in real-world settings. To address this limitation, we introduce Clean-Label Physical Backdoor Attack (CLPBA), a new paradigm of physical backdoor attack that requires neither label manipulation nor trigger injection at the training stage. Instead, the attacker injects imperceptible perturbations into a small number of target class samples to backdoor a model. By framing the attack as a Dataset Distillation problem, we develop three CLPBA variants -- Parameter Matching, Gradient Matching, and Feature Matching -- that craft effective poisons under both linear probing and full-finetuning training settings. In hard scenarios that require backdoor generalizability in the physical world, CLPBA even surpasses dirty-label attack baselines. We demonstrate the effectiveness of CLPBA via extensive experiments on two collected physical backdoor datasets for facial recognition and animal classification. The code is available at https://github.com/thinh-dao/Clean-Label-Physical-Backdoor-Attacks.
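To make the clean-label idea more concrete, the following is a minimal sketch of a feature-matching style poison, not the paper's implementation. It assumes a fixed linear feature extractor `W` (a hypothetical stand-in for a frozen backbone under linear probing), random vectors in place of real images, and illustrative values for the perturbation budget `eps` and step size. The function name `craft_feature_matching_poison` is likewise an assumption. A target-class sample keeps its label and receives a small bounded perturbation that pulls its features toward those of a source-class sample carrying the physical trigger.

```python
import numpy as np


def feature_loss(x, x_goal, W):
    """Squared distance between the features of x and x_goal under extractor W."""
    return float(np.sum((W @ x - W @ x_goal) ** 2))


def craft_feature_matching_poison(x_target, x_source_trigger, W,
                                  eps=8 / 255, step=1 / 255, iters=200):
    """Projected sign-gradient descent on a perturbation delta.

    Minimizes ||W(x_target + delta) - W x_source_trigger||^2
    subject to ||delta||_inf <= eps and pixel values staying in [0, 1].
    All hyperparameters here are illustrative assumptions.
    """
    delta = np.zeros_like(x_target)
    best_delta = delta.copy()
    best_loss = feature_loss(x_target, x_source_trigger, W)
    feat_goal = W @ x_source_trigger

    for _ in range(iters):
        # Gradient of the squared feature distance with respect to delta.
        residual = W @ (x_target + delta) - feat_goal
        grad = 2.0 * (W.T @ residual)

        # Signed step, then projection onto the L-infinity ball of radius eps.
        delta = np.clip(delta - step * np.sign(grad), -eps, eps)
        # Keep the poisoned sample inside the valid pixel range.
        delta = np.clip(x_target + delta, 0.0, 1.0) - x_target

        # Remember the best perturbation seen so far.
        loss = feature_loss(x_target + delta, x_source_trigger, W)
        if loss < best_loss:
            best_loss, best_delta = loss, delta.copy()

    return x_target + best_delta


# Toy data: random vectors standing in for flattened images.
rng = np.random.default_rng(0)
dim, feat_dim = 64, 16
W = rng.normal(size=(feat_dim, dim)) / np.sqrt(dim)
x_target = rng.uniform(0.2, 0.8, size=dim)          # clean target-class sample
x_source_trigger = rng.uniform(0.2, 0.8, size=dim)  # source sample with trigger

poison = craft_feature_matching_poison(x_target, x_source_trigger, W)
loss_before = feature_loss(x_target, x_source_trigger, W)
loss_after = feature_loss(poison, x_source_trigger, W)
max_change = float(np.max(np.abs(poison - x_target)))
print(f"feature loss: {loss_before:.4f} -> {loss_after:.4f}, max change {max_change:.4f}")
```

The poisoned sample stays within `eps` of the original in every coordinate, so its label remains visually consistent with its content, while its features move toward the trigger-bearing sample. The Gradient Matching and Parameter Matching variants named in the abstract would replace the feature-distance objective with one defined on training gradients or model parameters. They are not sketched here.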
