CR | CV | Mar 12, 2024

Backdoor Attack with Mode Mixture Latent Modification

arXiv:2403.07463v1 | 1 citation | h-index: 7
Originality: Incremental advance
AI Analysis

The paper addresses a security concern for deep neural networks by showing that a backdoor can be injected stealthily, though the contribution appears incremental: it builds on existing attack paradigms and focuses on reducing the number of modified parameters.

The paper tackles backdoor attacks on deep neural networks by proposing a method that requires minimal alterations to a clean model, modifying only the output layer, and reports effective attacks on four benchmark datasets: MNIST, CIFAR-10, GTSRB, and TinyImageNet.

Backdoor attacks have become a significant security concern for deep neural networks in recent years. An image classification model can be compromised if malicious backdoors are injected into it. This corruption causes the model to function normally on clean images but to predict a specific target label when triggers are present. Previous research falls into two categories: poisoning a portion of the dataset with triggered images so that users train a backdoored model from scratch, or training a backdoored model alongside a triggered-image generator. Both approaches require a significant number of attackable parameters to be optimized in order to establish a connection between the trigger and the target label, which may raise suspicion as more people become aware of the existence of backdoor attacks. In this paper, we propose a backdoor attack paradigm that requires only minimal alterations to a clean model (specifically, to the output layer) in order to inject the backdoor under the guise of fine-tuning. To achieve this, we leverage mode mixture samples, which are located between different modes in latent space, and introduce a novel method for conducting backdoor attacks. We evaluate the effectiveness of our method on four popular benchmark datasets: MNIST, CIFAR-10, GTSRB, and TinyImageNet.
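To make the idea in the abstract concrete, the following is a minimal toy sketch, not the paper's implementation. It assumes the feature extractor is frozen, so each input is represented only by a fixed 2-D latent vector, and it assumes a mode mixture sample can be approximated by a convex combination of two latents from different classes. The latent values, the names `mix_latents` and `train_output_layer`, the choice of target class, and the softmax-regression output layer are all hypothetical and chosen for illustration only.

```python
import math

def mix_latents(z_a, z_b, lam=0.5):
    """Convex combination of two latents from different modes (assumed stand-in
    for a mode mixture sample)."""
    return [lam * a + (1.0 - lam) * b for a, b in zip(z_a, z_b)]

def softmax(logits):
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]

def predict_proba(W, b, z):
    logits = [sum(w * x for w, x in zip(row, z)) + bk for row, bk in zip(W, b)]
    return softmax(logits)

def predict(W, b, z):
    p = predict_proba(W, b, z)
    return p.index(max(p))

def train_output_layer(W, b, data, lr=0.05, epochs=2000):
    """Per-sample gradient descent on cross-entropy. Only the output layer
    (W, b) is updated; the latents play the role of frozen features."""
    W = [row[:] for row in W]
    b = b[:]
    for _ in range(epochs):
        for z, y in data:
            p = predict_proba(W, b, z)
            for k in range(len(b)):
                g = p[k] - (1.0 if k == y else 0.0)
                b[k] -= lr * g
                for j in range(len(z)):
                    W[k][j] -= lr * g * z[j]
    return W, b

# Hypothetical latents for three classes (three modes in latent space).
CLEAN = {
    0: [[4.0, 0.0], [3.7, 0.3]],
    1: [[-2.0, 3.5], [-1.7, 3.2]],
    2: [[-2.0, -3.5], [-2.3, -3.2]],
}
TARGET = 2  # hypothetical target label for the backdoor

clean_data = [(z, y) for y, zs in CLEAN.items() for z in zs]
# Mixture latents lying between mode 0 and mode 1.
mixtures = [mix_latents(za, zb) for za in CLEAN[0] for zb in CLEAN[1]]

# 1) A "clean" output layer trained on clean latents only.
W_clean, b_clean = train_output_layer([[0.0, 0.0] for _ in range(3)], [0.0] * 3, clean_data)

# 2) "Fine-tuning": same layer, same clean data, plus mixtures labeled TARGET.
poisoned = clean_data + [(z, TARGET) for z in mixtures]
W_bd, b_bd = train_output_layer(W_clean, b_clean, poisoned)

clean_acc = sum(predict(W_bd, b_bd, z) == y for z, y in clean_data) / len(clean_data)
attack_rate = sum(predict(W_bd, b_bd, z) == TARGET for z in mixtures) / len(mixtures)
print("clean accuracy after fine-tuning:", clean_acc)
print("mixtures classified as target:", attack_rate)
```

In this toy setting the only parameters that differ between the clean and the backdoored model are the output-layer weights and biases, which mirrors the "minimal alterations" claim. How the paper actually constructs mode mixture samples and the corresponding trigger images is not described in the abstract, so that part is left out here.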

