MM · Nov 9, 2018

Distribution-Preserving Steganography Based on Text-to-Speech Generative Models

Kejiang Chen, Hang Zhou, Hanqing Zhao, Dongdong Chen, Weiming Zhang, Nenghai Yu

arXiv:1811.03732v3 · 8.05 citations

Originality: Synthesis-oriented

AI Analysis

This work addresses the challenge of secure covert communication in audio media, but it is incremental in that it applies existing steganographic frameworks to generative models.

The paper tackles the problem of hiding secret messages in text-to-speech audio without detection. It proposes distribution-preserving steganography methods based on WaveGlow and WaveNet, and uses steganalysis experiments and theoretical analysis to show that the methods preserve the distribution.

Steganography is the art and science of hiding secret messages in public communication so that the presence of the secret messages cannot be detected. There are two distribution-preserving steganographic frameworks: one is sampling-based and the other is compression-based. The former requires a perfect sampler that yields data following the same distribution, and the latter needs the explicit distribution of the generated objects. However, these two conditions are too strict, even unrealistic, in the traditional data environment; for example, the distribution of natural images is hard to capture. Fortunately, generative models bring new vitality to distribution-preserving steganography, since they can serve as the perfect sampler or provide the explicit distribution of the generated media. Taking the text-to-speech generation task as an example, we propose distribution-preserving steganography based on WaveGlow and WaveNet, which correspond to the two frameworks above. Steganalysis experiments and theoretical analysis are conducted to demonstrate that the proposed methods can preserve the distribution.
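To make the sampling-based framework concrete, here is a minimal sketch of the classic rejection-sampling construction: the sender keeps drawing from the sampler until a keyed hash of the sample equals the next message bit. This is an illustration only, not the WaveGlow or WaveNet method of the paper. The names `toy_sampler`, `keyed_bit`, `embed`, and `extract` are hypothetical, and the sampler is a toy stand-in that emits random bytes instead of speech.

```python
import hashlib
import hmac
import random


def keyed_bit(key: bytes, index: int, sample: bytes) -> int:
    """Map (position, sample) to one pseudo-random bit using HMAC-SHA256."""
    msg = index.to_bytes(4, "big") + sample
    return hmac.new(key, msg, hashlib.sha256).digest()[0] & 1


def toy_sampler(rng: random.Random) -> bytes:
    """Hypothetical stand-in for a generative model: returns 4 random bytes."""
    return bytes(rng.randrange(256) for _ in range(4))


def embed(bits, key: bytes, rng: random.Random, max_tries: int = 64):
    """Emit one sample per message bit, redrawing until the keyed bit matches."""
    stego = []
    for i, bit in enumerate(bits):
        for _ in range(max_tries):
            sample = toy_sampler(rng)
            if keyed_bit(key, i, sample) == bit:
                stego.append(sample)
                break
        else:
            # Assumption: giving up is acceptable in this toy setting.
            raise RuntimeError("no matching sample found")
    return stego


def extract(stego, key: bytes):
    """The receiver recomputes the keyed bit of every received sample."""
    return [keyed_bit(key, i, s) for i, s in enumerate(stego)]


if __name__ == "__main__":
    message = [1, 0, 1, 1, 0, 0, 1, 0]
    shared_key = b"shared-key"
    stego = embed(message, shared_key, random.Random(0))
    print(extract(stego, shared_key) == message)
```

Every emitted object is a genuine draw from the sampler, which is why this family of schemes depends on the sampler being perfect. The output stays close to the sampler's distribution only under the further assumptions that the message bits look uniform (for example, because they are encrypted first) and that the keyed bit is balanced over the samples.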
