D2RA: Dual Domain Regeneration Attack
The paper addresses the vulnerability of semantic watermarking schemes for generative models by introducing D2RA, a training-free attack that removes or weakens watermarks, and shows that it consistently reduces detectability across diverse schemes.
This exposes fundamental weaknesses in current watermarking designs, posing a problem for content attribution and provenance in generative AI.
The growing use of generative models has intensified the need for watermarking methods that ensure content attribution and provenance. While recent semantic watermarking schemes improve robustness by embedding signals in latent or frequency representations, we show they remain vulnerable even under resource-constrained adversarial settings. We present D2RA, a training-free, single-image attack that removes or weakens watermarks without access to the underlying model. By projecting watermarked images onto natural priors across complementary representations, D2RA suppresses watermark signals while preserving visual fidelity. Experiments across diverse watermarking schemes demonstrate that our approach consistently reduces watermark detectability, revealing fundamental weaknesses in current designs. Our code is available at https://github.com/Pragati-Meshram/DAWN.
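The idea of projecting an image onto priors in two complementary representations can be illustrated with a toy sketch. This is not the D2RA implementation: the abstract does not specify the priors, so the low-pass frequency filter, the 3x3 mean filter, the checkerboard "watermark", and the correlation detector below are all assumptions chosen only to make the example self-contained. All function names are hypothetical.

```python
import cmath
import math

def dft2(img, inverse=False):
    """Naive 2D DFT of a square grid (only suitable for tiny toy inputs)."""
    n = len(img)
    sign = 1 if inverse else -1
    out = [[0j] * n for _ in range(n)]
    for u in range(n):
        for v in range(n):
            total = 0j
            for x in range(n):
                for y in range(n):
                    angle = sign * 2 * math.pi * (u * x + v * y) / n
                    total += img[x][y] * cmath.exp(1j * angle)
            out[u][v] = total / (n * n) if inverse else total
    return out

def frequency_projection(img, keep=2):
    """Assumed frequency-domain prior: discard all but the lowest frequencies."""
    n = len(img)
    spec = dft2(img)
    for u in range(n):
        for v in range(n):
            fu, fv = min(u, n - u), min(v, n - v)  # wrap-around frequency index
            if max(fu, fv) > keep:
                spec[u][v] = 0j
    return [[c.real for c in row] for row in dft2(spec, inverse=True)]

def spatial_projection(img):
    """Assumed spatial-domain prior: 3x3 mean filter with clamped borders."""
    n = len(img)
    out = [[0.0] * n for _ in range(n)]
    for x in range(n):
        for y in range(n):
            vals = [img[min(max(x + dx, 0), n - 1)][min(max(y + dy, 0), n - 1)]
                    for dx in (-1, 0, 1) for dy in (-1, 0, 1)]
            out[x][y] = sum(vals) / 9.0
    return out

def regenerate(img):
    """Toy dual-domain regeneration: frequency projection, then spatial projection."""
    return spatial_projection(frequency_projection(img))

def detector_score(img, pattern):
    """Toy detector: correlation of the mean-removed image with a known pattern."""
    n = len(img)
    mean = sum(map(sum, img)) / (n * n)
    return sum((img[x][y] - mean) * pattern[x][y]
               for x in range(n) for y in range(n)) / (n * n)

N = 8
# Smooth "natural" image: a single low-frequency cosine along one axis.
clean = [[0.5 + 0.3 * math.cos(2 * math.pi * x / N) for _ in range(N)]
         for x in range(N)]
# Hypothetical watermark: a faint high-frequency checkerboard.
pattern = [[(-1) ** (x + y) for y in range(N)] for x in range(N)]
watermarked = [[clean[x][y] + 0.05 * pattern[x][y] for y in range(N)]
               for x in range(N)]

score_before = detector_score(watermarked, pattern)
score_after = detector_score(regenerate(watermarked), pattern)
print(f"detector score before: {score_before:.4f}, after: {score_after:.4f}")
```

The sketch only shows the general shape of such an attack: each projection pulls the image toward what a simple prior considers plausible in its own domain, and a signal that is implausible under those priors loses detector response. A semantic watermark embedded in latent or frequency representations would be far less trivially separable than this checkerboard, which is presumably why the paper relies on stronger natural priors than the fixed filters used here.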