LGCRSTJun 2, 2024

Generalization Bound and New Algorithm for Clean-Label Backdoor Attack

arXiv:2406.00588v114 citations
Originality Incremental advance
AI Analysis

This addresses the lack of theoretical understanding for backdoor attacks in machine learning security, offering insights for researchers and practitioners, though it is incremental in building on existing attack frameworks.

The paper establishes the first generalization bounds for clean-label backdoor attacks, providing upper bounds for clean and poison population errors based on empirical error, and proposes a new attack method combining adversarial noise and indiscriminate poison, showing effectiveness in various settings.

The generalization bound is a crucial theoretical tool for assessing the generalizability of learning methods and there exist vast literatures on generalizability of normal learning, adversarial learning, and data poisoning. Unlike other data poison attacks, the backdoor attack has the special property that the poisoned triggers are contained in both the training set and the test set and the purpose of the attack is two-fold. To our knowledge, the generalization bound for the backdoor attack has not been established. In this paper, we fill this gap by deriving algorithm-independent generalization bounds in the clean-label backdoor attack scenario. Precisely, based on the goals of backdoor attack, we give upper bounds for the clean sample population errors and the poison population errors in terms of the empirical error on the poisoned training dataset. Furthermore, based on the theoretical result, a new clean-label backdoor attack is proposed that computes the poisoning trigger by combining adversarial noise and indiscriminate poison. We show its effectiveness in a variety of settings.

Code Implementations1 repo
Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.

Your Notes