SD CR AS

Feb 17, 2022

Attributable-Watermarking of Speech Generative Models

Yongbaek Cho, Changhoon Kim, Yezhou Yang, Yi Ren

arXiv:2202.08900v27.113 citations

Originality: Incremental advance

AI Analysis

The work addresses impersonation and IP-theft concerns for users of speech synthesis, but the advance is incremental because it builds on prior model-attribution work in the image domain.

The paper tackles the problem of attributing synthetic speech to its source generative model by embedding watermarks, achieving high attribution accuracy while maintaining generation quality, and demonstrates trade-offs and robustness against attacks.

Generative models are now capable of synthesizing images, speech, and video that are hardly distinguishable from authentic content. Such capabilities raise concerns such as malicious impersonation and IP theft. This paper investigates a solution for model attribution, i.e., the classification of synthetic content by its source model via watermarks embedded in the content. Building on the past success of model attribution in the image domain, we discuss algorithmic improvements for generating user-end speech models that empirically achieve high attribution accuracy while maintaining high generation quality. We show the trade-off between attributability and generation quality under a variety of attacks on generated speech signals that attempt to remove the watermarks, and the feasibility of learning watermarks that are robust against these attacks.
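To make the idea of attribution by watermark concrete, here is a minimal toy sketch in Python. It is not the paper's method: it assumes each user-end model is identified by a random ±1 key added to the waveform, and that attribution picks the key with the highest correlation. All names (`make_keys`, `embed`, `attribute`, `accuracy`) and the `strength` and `noise_sigma` parameters are hypothetical. In this toy, `strength` plays the role of the trade-off: a larger value distorts the signal more but makes attribution survive a stronger additive-noise attack.

```python
import random

def make_keys(num_models, dim, seed=0):
    """Draw one random +/-1 key per hypothetical user-end model."""
    rng = random.Random(seed)
    return [[rng.choice((-1.0, 1.0)) for _ in range(dim)]
            for _ in range(num_models)]

def embed(signal, key, strength):
    """Additively embed a key into a signal (toy stand-in for a watermarked generator)."""
    return [s + strength * k for s, k in zip(signal, key)]

def attribute(signal, keys):
    """Return the index of the key most correlated with the signal."""
    scores = [sum(s * k for s, k in zip(signal, key)) for key in keys]
    return max(range(len(keys)), key=scores.__getitem__)

def accuracy(keys, strength, noise_sigma, trials=50, seed=1):
    """Fraction of noisy watermarked signals attributed to the correct model."""
    rng = random.Random(seed)
    dim = len(keys[0])
    correct = 0
    for t in range(trials):
        true_model = t % len(keys)
        # Gaussian noise stands in for a generated speech waveform (assumption).
        clean = [rng.gauss(0.0, 1.0) for _ in range(dim)]
        marked = embed(clean, keys[true_model], strength)
        # Additive-noise "attack" that tries to wash out the watermark.
        attacked = [x + rng.gauss(0.0, noise_sigma) for x in marked]
        correct += attribute(attacked, keys) == true_model
    return correct / trials

keys = make_keys(num_models=4, dim=2000)
acc_marked = accuracy(keys, strength=0.2, noise_sigma=1.0)
acc_unmarked = accuracy(keys, strength=0.0, noise_sigma=1.0)
print(acc_marked, acc_unmarked)
```

With no watermark (`strength=0.0`) the attribution should hover around chance for four models, while a sufficiently strong watermark should be attributed almost always correctly even after the noise attack.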
