CV · Jan 4, 2024

Spy-Watermark: Robust Invisible Watermarking for Backdoor Attack

Ruofei Wang, Renjie Wan, Zongyu Guo, Qing Guo, Rui Huang

arXiv:2401.02031v19.613 citations · h-index: 5 · Has Code · ICASSP

Originality: Incremental advance

AI Analysis

The paper addresses the robustness of backdoor attacks in machine learning security, though the advance is incremental because it builds on existing trigger methods.

The paper tackles the vulnerability of backdoor attacks to data corruption and defense mechanisms by proposing Spy-Watermark, a method that embeds a learnable watermark in the latent domain of images as a trigger. It reports outperforming ten state-of-the-art methods in robustness and stealthiness on the CIFAR10, GTSRB, and ImageNet datasets.

A backdoor attack aims to deceive a victim model when it faces backdoor instances while maintaining its performance on benign data. Current methods use manual patterns or special perturbations as triggers, but they often overlook robustness against data corruption, making backdoor attacks easy to defend against in practice. To address this issue, we propose a novel backdoor attack method named Spy-Watermark, which remains effective when facing data collapse and backdoor defense. Specifically, we introduce a learnable watermark embedded in the latent domain of images, serving as the trigger. Then, we search for a watermark that can withstand collapse during image decoding, and combine it with several anti-collapse operations to further enhance the resilience of our trigger against data corruption. Extensive experiments are conducted on the CIFAR10, GTSRB, and ImageNet datasets, demonstrating that Spy-Watermark outperforms ten state-of-the-art methods in terms of robustness and stealthiness.
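
To make the general idea of a watermark-style trigger concrete, here is a minimal sketch of dataset poisoning. It is not the paper's method: it assumes a fixed, pixel-domain additive watermark rather than a learned latent-domain one, and every name in it (`embed_watermark`, `corrupt`, `poison_dataset`, `strength`, `rate`) is hypothetical. The `corrupt` function only illustrates the kind of data corruption a trigger would have to survive and does not reproduce the paper's anti-collapse operations.

```python
import random


def embed_watermark(pixels, watermark, strength=0.03):
    """Additively blend a fixed watermark into a flat list of pixels in [0, 1].

    Assumption: pixel-domain blending stands in for the learned
    latent-domain embedding described in the abstract.
    """
    return [min(1.0, max(0.0, p + strength * w)) for p, w in zip(pixels, watermark)]


def corrupt(pixels, rng, noise_std=0.02, levels=32):
    """Simulate data corruption: Gaussian noise followed by coarse quantization."""
    out = []
    for p in pixels:
        noisy = min(1.0, max(0.0, p + rng.gauss(0.0, noise_std)))
        out.append(round(noisy * (levels - 1)) / (levels - 1))
    return out


def poison_dataset(images, labels, watermark, target_label, rate, rng):
    """Return copies of the data in which a fraction `rate` carries the trigger.

    Triggered samples are relabeled to `target_label`; the rest are unchanged.
    """
    images = [list(img) for img in images]
    labels = list(labels)
    n_poison = round(rate * len(images))
    chosen = rng.sample(range(len(images)), n_poison)
    for i in chosen:
        images[i] = embed_watermark(images[i], watermark)
        labels[i] = target_label
    return images, labels, sorted(chosen)


# Toy data: 100 random 8x8 grayscale "images" with labels in 0..9.
rng = random.Random(0)
n_pixels = 8 * 8
images = [[rng.random() for _ in range(n_pixels)] for _ in range(100)]
labels = [rng.randrange(10) for _ in range(100)]
watermark = [rng.choice([-1.0, 1.0]) for _ in range(n_pixels)]

poisoned, poisoned_labels, chosen = poison_dataset(
    images, labels, watermark, target_label=0, rate=0.1, rng=rng
)

# A corrupted copy of one triggered image, as a defender or lossy pipeline might produce.
corrupted = corrupt(poisoned[chosen[0]], rng)
```

A small `strength` keeps each pixel change below the chosen bound, which is a crude proxy for stealthiness. Whether such a trigger still activates a trained model after `corrupt` is exactly the robustness question the abstract raises, and this sketch does not answer it.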
