SD, LG, AS · Oct 2, 2025

Multi-bit Audio Watermarking

Luca A. Lanzendörfer, Kyle Fearne, Florian Grötschla, Roger Wattenhofer

arXiv:2510.01968v17.01 citations · h-index: 24

Originality: Incremental advance

AI Analysis

Timbru offers an efficient, dataset-free approach to imperceptible audio watermarking. The advance is incremental because it builds on existing pretrained models, namely an audio VAE and CLAP.

The paper introduces Timbru, an audio watermarking method that embeds imperceptible watermarks in music snippets by gradient optimization in the latent space of a pretrained VAE. In the reported evaluation it attains the best average bit error rate across the tested attacks while preserving perceptual quality.

We present Timbru, a post-hoc audio watermarking model that achieves state-of-the-art robustness and imperceptibility trade-offs without training an embedder-detector model. Given any 44.1 kHz stereo music snippet, our method performs per-audio gradient optimization to add imperceptible perturbations in the latent space of a pretrained audio VAE, guided by a combined message and perceptual loss. The watermark can then be extracted using a pretrained CLAP model. We evaluate 16-bit watermarking on MUSDB18-HQ against AudioSeal, WavMark, and SilentCipher across common filtering, noise, compression, resampling, cropping, and regeneration attacks. Our approach attains the best average bit error rates, while preserving perceptual quality, demonstrating an efficient, dataset-free path to imperceptible audio watermarking.
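The per-audio optimization described in the abstract can be illustrated with a toy sketch. Everything here is an assumption made for illustration and is not the authors' implementation: the pretrained VAE decoder and the CLAP-based extractor are replaced by hypothetical linear maps (`decoder`, `extractor`), the message loss is a binary cross-entropy on the extractor logits, and the perceptual loss is approximated by the squared change in the decoded signal. The weight `lam`, the step size, and the step count are arbitrary.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def embed_watermark(z0, decoder, extractor, bits, steps=500, lr=0.05, lam=0.01):
    """Gradient descent on a latent perturbation `delta` (toy stand-in).

    decoder:   (n_samples, n_latent) matrix, hypothetical stand-in for a VAE decoder
    extractor: (n_bits, n_samples) matrix, hypothetical stand-in for the extractor
    """
    bits = np.asarray(bits, dtype=float)
    to_logits = extractor @ decoder              # latent -> message logits
    delta = np.zeros_like(z0, dtype=float)
    for _ in range(steps):
        logits = to_logits @ (z0 + delta)
        # Message loss: summed binary cross-entropy; its gradient w.r.t. the
        # logits is sigmoid(logits) - bits.
        grad_msg = to_logits.T @ (sigmoid(logits) - bits)
        # Perceptual proxy (assumption): ||decoder @ delta||^2.
        grad_perc = 2.0 * decoder.T @ (decoder @ delta)
        delta -= lr * (grad_msg + lam * grad_perc)
    return z0 + delta

def extract_bits(z, decoder, extractor):
    """Read the message by thresholding the extractor logits at zero."""
    return (extractor @ (decoder @ z) > 0).astype(int)

def bit_error_rate(sent, received):
    """Fraction of message bits that differ."""
    return float(np.mean(np.asarray(sent) != np.asarray(received)))

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    n_latent, n_samples, n_bits = 64, 256, 16    # 16-bit message as in the abstract
    decoder = rng.standard_normal((n_samples, n_latent)) / np.sqrt(n_latent)
    extractor = rng.standard_normal((n_bits, n_samples)) / np.sqrt(n_samples)
    z0 = rng.standard_normal(n_latent)
    bits = rng.integers(0, 2, n_bits)
    zw = embed_watermark(z0, decoder, extractor, bits)
    print("BER before:", bit_error_rate(bits, extract_bits(z0, decoder, extractor)))
    print("BER after: ", bit_error_rate(bits, extract_bits(zw, decoder, extractor)))
```

In this sketch only the perturbation is optimized while both maps stay fixed, which mirrors the abstract's point that no embedder-detector model is trained. Raising `lam` favors a smaller change to the decoded signal at the cost of a weaker message.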
