CV AI CR | May 13, 2025

Removing Watermarks with Partial Regeneration using Semantic Information

Krti Tallam, John Kevin Cava, Caleb Geniesse, N. Benjamin Erichson, Michael W. Mahoney

arXiv:2505.08234v1 | 3 citations | h-index: 25

Originality: Highly original

AI Analysis

This work exposes a critical security gap in the invisible watermarks used for copyright protection of AI-generated content and highlights the need for more resilient watermarking algorithms.

The paper addresses the vulnerability of invisible watermarks in AI-generated images to adaptive attacks. It introduces SemanticRegen, a three-stage attack that erases state-of-the-art semantic and invisible watermarks while maintaining high perceptual quality. In the reported results, it defeats the TreeRing watermark and reduces bit-accuracy below 0.75 for the other schemes tested.

As AI-generated imagery becomes ubiquitous, invisible watermarks have emerged as a primary line of defense for copyright and provenance. The newest watermarking schemes embed semantic signals - content-aware patterns that are designed to survive common image manipulations - yet their true robustness against adaptive adversaries remains under-explored. We expose a previously unreported vulnerability and introduce SemanticRegen, a three-stage, label-free attack that erases state-of-the-art semantic and invisible watermarks while leaving an image's apparent meaning intact. Our pipeline (i) uses a vision-language model to obtain fine-grained captions, (ii) extracts foreground masks with zero-shot segmentation, and (iii) inpaints only the background via an LLM-guided diffusion model, thereby preserving salient objects and style cues. Evaluated on 1,000 prompts across four watermarking systems - TreeRing, StegaStamp, StableSig, and DWT/DCT - SemanticRegen is the only method to defeat the semantic TreeRing watermark (p = 0.10 > 0.05) and reduces bit-accuracy below 0.75 for the remaining schemes, all while maintaining high perceptual quality (masked SSIM = 0.94 +/- 0.01). We further introduce masked SSIM (mSSIM) to quantify fidelity within foreground regions, showing that our attack achieves up to 12 percent higher mSSIM than prior diffusion-based attackers. These results highlight an urgent gap between current watermark defenses and the capabilities of adaptive, semantics-aware adversaries, underscoring the need for watermarking algorithms that are resilient to content-preserving regenerative attacks.
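The two metrics the abstract reports, bit-accuracy and masked SSIM, can be illustrated with a minimal sketch. The abstract does not give the exact mSSIM definition, so the version below is an assumption: a simplified single-window SSIM computed only over pixels selected by a foreground mask, using the standard SSIM constants (K1 = 0.01, K2 = 0.03). The function names `bit_accuracy` and `masked_ssim` are hypothetical and are not taken from the paper's code.

```python
from statistics import fmean


def bit_accuracy(embedded, decoded):
    """Fraction of watermark bits that the decoder recovers correctly."""
    if len(embedded) != len(decoded) or not embedded:
        raise ValueError("bit strings must be non-empty and equal in length")
    return sum(a == b for a, b in zip(embedded, decoded)) / len(embedded)


def masked_ssim(ref, test, mask, data_range=255.0):
    """Single-window SSIM over the pixels where `mask` is truthy.

    ASSUMPTION: a simplified stand-in for the paper's mSSIM. It uses one
    global window over the masked pixels instead of a sliding local window.
    `ref`, `test`, and `mask` are 2-D lists of equal shape (grayscale).
    """
    xs = [p for row, mrow in zip(ref, mask) for p, m in zip(row, mrow) if m]
    ys = [p for row, mrow in zip(test, mask) for p, m in zip(row, mrow) if m]
    if not xs:
        raise ValueError("mask selects no pixels")

    # Standard SSIM stabilizing constants.
    c1 = (0.01 * data_range) ** 2
    c2 = (0.03 * data_range) ** 2

    mu_x, mu_y = fmean(xs), fmean(ys)
    var_x = fmean([(x - mu_x) ** 2 for x in xs])
    var_y = fmean([(y - mu_y) ** 2 for y in ys])
    cov = fmean([(x - mu_x) * (y - mu_y) for x, y in zip(xs, ys)])

    numerator = (2 * mu_x * mu_y + c1) * (2 * cov + c2)
    denominator = (mu_x ** 2 + mu_y ** 2 + c1) * (var_x + var_y + c2)
    return numerator / denominator
```

In this sketch, an attack that regenerates only the background leaves the masked (foreground) pixels untouched, so `masked_ssim` stays close to 1 even when the full-image SSIM drops. A decoded bit string whose `bit_accuracy` falls below 0.75 would correspond to the abstract's "reduces bit-accuracy below 0.75".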
