CR · AI · CV · Dec 4, 2024

Black-Box Forgery Attacks on Semantic Watermarks for Diffusion Models

arXiv:2412.03283v3 · 42 citations · h-index: 13 · CVPR
Originality: Highly original
AI Analysis

The paper reveals a fundamental vulnerability in semantic watermarking for AI-generated content, which undermines content attribution and detection systems that rely on such watermarks.

The paper tackles the security of semantic watermarks in latent diffusion models by demonstrating that attackers can forge or remove watermarks using unrelated models with just a single reference image, achieving high success rates (e.g., up to 99% forgery accuracy).

Integrating watermarking into the generation process of latent diffusion models (LDMs) simplifies detection and attribution of generated content. Semantic watermarks, such as Tree-Rings and Gaussian Shading, represent a novel class of watermarking techniques that are easy to implement and highly robust against various perturbations. However, our work demonstrates a fundamental security vulnerability of semantic watermarks. We show that attackers can leverage unrelated models, even with different latent spaces and architectures (UNet vs DiT), to perform powerful and realistic forgery attacks. Specifically, we design two watermark forgery attacks. The first imprints a targeted watermark into real images by manipulating the latent representation of an arbitrary image in an unrelated LDM to get closer to the latent representation of a watermarked image. We also show that this technique can be used for watermark removal. The second attack generates new images with the target watermark by inverting a watermarked image and re-generating it with an arbitrary prompt. Both attacks just need a single reference image with the target watermark. Overall, our findings question the applicability of semantic watermarks by revealing that attackers can easily forge or remove these watermarks under realistic conditions.
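The first attack described above, moving an arbitrary image's latent closer to that of a watermarked reference, can be illustrated with a toy sketch. This is not the authors' implementation: the random linear map `W` is a hypothetical stand-in for a proxy model's image encoder, the function names are invented for illustration, and the perturbation budget `eps` is an assumption added here so that the forged image stays close to the cover image. A real attack would presumably use a neural encoder with automatic differentiation instead of the closed-form gradient.

```python
import numpy as np


def imprint_latent(cover, reference, W, steps=300, eps=0.2):
    """Nudge `cover` so its toy latent W @ x approaches that of `reference`.

    W is a hypothetical linear stand-in for a proxy encoder (assumption).
    eps is an assumed L-infinity budget on the perturbation (assumption).
    """
    target = W @ reference
    # Step size 1/L, where L = 2 * sigma_max(W)^2 is the Lipschitz constant
    # of the gradient of the squared latent distance.
    lr = 1.0 / (2.0 * np.linalg.norm(W, 2) ** 2)
    delta = np.zeros_like(cover)
    for _ in range(steps):
        residual = W @ (cover + delta) - target
        grad = 2.0 * W.T @ residual  # gradient of ||W(x + delta) - target||^2
        # Gradient step followed by projection onto the eps-box.
        delta = np.clip(delta - lr * grad, -eps, eps)
    return cover + delta


def latent_distance(a, b, W):
    """Euclidean distance between the toy latents of two images."""
    return float(np.linalg.norm(W @ a - W @ b))


rng = np.random.default_rng(0)
W = rng.normal(size=(16, 64)) / 8.0       # toy "encoder": 64 pixels -> 16 latent dims
cover = rng.uniform(0, 1, size=64)        # arbitrary image the attacker wants to mark
reference = rng.uniform(0, 1, size=64)    # single watermarked reference image

forged = imprint_latent(cover, reference, W)
before = latent_distance(cover, reference, W)
after = latent_distance(forged, reference, W)
print(f"latent distance before: {before:.4f}, after: {after:.4f}")
```

The sketch only shows the optimization pattern (projected gradient descent on a latent-space distance). Whether a shrinking latent distance translates into a detected watermark depends on the actual detector and models, which the toy does not simulate.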

