CV · Feb 3

Invisible Clean-Label Backdoor Attacks for Generative Data Augmentation

Ting Xiang, Jinhui Zhao, Changjian Chen, Zhuo Tang

arXiv:2602.03316v1 · 1.5 · h-index: 3

Originality: Incremental advance

AI Analysis

This paper addresses a security vulnerability in generative data augmentation that is relevant to machine learning practitioners, but the advance is incremental: it builds on existing attack methods by moving the trigger from pixels to latent features.

The paper tackles the low attack success rates of clean-label backdoor attacks on generative data augmentation. It proposes InvLBA, a method that uses latent perturbations instead of pixel-level triggers and improves the attack success rate by 46.43% on average with minimal impact on clean accuracy.

With the rapid advancement of image generative models, generative data augmentation has become an effective way to enrich training images, especially when only small-scale datasets are available. At the same time, in practical applications, generative data augmentation can be vulnerable to clean-label backdoor attacks, which aim to bypass human inspection. However, based on theoretical analysis and preliminary experiments, we observe that directly applying existing pixel-level clean-label backdoor attack methods (e.g., COMBAT) to generated images results in low attack success rates. This motivates us to move beyond pixel-level triggers and focus instead on the latent feature level. To this end, we propose InvLBA, an invisible clean-label backdoor attack method for generative data augmentation by latent perturbation. We theoretically prove that the generalization of the clean accuracy and attack success rates of InvLBA can be guaranteed. Experiments on multiple datasets show that our method improves the attack success rate by 46.43% on average, with almost no reduction in clean accuracy and high robustness against SOTA defense methods.
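To make the pixel-level versus latent-level distinction concrete, here is a minimal toy sketch. It is not the InvLBA algorithm, which the abstract does not specify. The linear `decode` function, the random `direction`, and the budget `epsilon` are all hypothetical stand-ins chosen for illustration. The sketch only shows the general idea of shifting a latent vector by a small, bounded amount before decoding, rather than stamping a pattern onto the decoded pixels.

```python
import random


def decode(latent, weights):
    """Toy linear 'generator' (assumption): maps a latent vector to flat pixel values."""
    return [sum(w * z for w, z in zip(row, latent)) for row in weights]


def perturb_latent(latent, direction, epsilon):
    """Shift the latent along `direction`, rescaled so the shift has L2 norm `epsilon`."""
    norm = sum(d * d for d in direction) ** 0.5
    if norm == 0:
        raise ValueError("direction must be non-zero")
    return [z + epsilon * d / norm for z, d in zip(latent, direction)]


# Hypothetical sizes and random values, used only for illustration.
rng = random.Random(0)
latent_dim, num_pixels = 4, 8
weights = [[rng.gauss(0, 1) for _ in range(latent_dim)] for _ in range(num_pixels)]
latent = [rng.gauss(0, 1) for _ in range(latent_dim)]
direction = [rng.gauss(0, 1) for _ in range(latent_dim)]

clean_image = decode(latent, weights)
poisoned_latent = perturb_latent(latent, direction, epsilon=0.1)
poisoned_image = decode(poisoned_latent, weights)

# Size of the change in latent space and the largest per-pixel change it causes.
latent_shift = sum((a - b) ** 2 for a, b in zip(poisoned_latent, latent)) ** 0.5
pixel_shift = max(abs(a - b) for a, b in zip(poisoned_image, clean_image))
print(f"latent L2 shift: {latent_shift:.3f}, max pixel change: {pixel_shift:.3f}")
```

In this toy setup the perturbation is spread across every decoded pixel by the generator instead of being confined to a fixed patch. A real attack would presumably optimize the latent shift against a target model rather than draw it at random.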
