SD AI AS Jun 12, 2024

Asynchronous Voice Anonymization Using Adversarial Perturbation On Speaker Embedding

Rui Wang, Liping Chen, Kong Aik Lee, Zhen-Hua Ling

arXiv:2406.08200v38.311 citations

Originality: Incremental advance

AI Analysis

The paper addresses privacy concerns in speech data by enabling asynchronous voice anonymization. The advance is incremental, since it builds on existing speaker disentanglement and adversarial perturbation techniques.

It tackles voice anonymization by altering speaker attributes against machine recognition while preserving human perception. On the LibriSpeech dataset, speaker attributes were obscured with human perception preserved for 60.71% of the processed utterances.

Voice anonymization has been developed as a technique for preserving privacy by replacing the speaker's voice in a speech signal with that of a pseudo-speaker, thereby obscuring the original voice attributes from machine recognition and human perception. In this paper, we focus on altering the voice attributes against machine recognition while retaining human perception. We refer to this as asynchronous voice anonymization. To this end, a speech generation framework incorporating a speaker disentanglement mechanism is employed to generate the anonymized speech. The speaker attributes are altered through adversarial perturbation applied to the speaker embedding, while human perception is preserved by controlling the intensity of the perturbation. Experiments conducted on the LibriSpeech dataset showed that the speaker attributes were obscured with their human perception preserved for 60.71% of the processed utterances.
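
The idea of perturbing a speaker embedding under an intensity budget can be illustrated with a small sketch. This is not the authors' implementation: the linear "recognizer" (`true_w`, `rival_w`), the function names, and the L-infinity budget `epsilon` are all assumptions made for illustration, standing in for a real speaker verification model and for the paper's speech generation framework.

```python
def margin(embedding, true_w, rival_w):
    """Toy recognizer score (hypothetical): positive means the true speaker still wins."""
    return sum(e * (t - r) for e, t, r in zip(embedding, true_w, rival_w))


def sign(value):
    """Return -1, 0, or 1 according to the sign of value."""
    return (value > 0) - (value < 0)


def perturb_embedding(embedding, true_w, rival_w, epsilon, iterations=10):
    """Projected sign-gradient steps that lower the margin.

    epsilon caps the per-dimension change (assumed L-infinity budget),
    playing the role of the perturbation intensity control.
    """
    step = 2.0 * epsilon / iterations
    # For a linear margin, the gradient w.r.t. the embedding is constant.
    grad = [t - r for t, r in zip(true_w, rival_w)]
    delta = [0.0] * len(embedding)
    for _ in range(iterations):
        # Step against the gradient, then project back into the budget.
        delta = [
            max(-epsilon, min(epsilon, d - step * sign(g)))
            for d, g in zip(delta, grad)
        ]
    return [e + d for e, d in zip(embedding, delta)]


# Hypothetical 3-dimensional embedding and recognizer weights.
original = [0.5, -0.2, 0.1]
true_w = [1.0, 0.0, -1.0]
rival_w = [0.0, 1.0, 0.0]

mild = perturb_embedding(original, true_w, rival_w, epsilon=0.1)
strong = perturb_embedding(original, true_w, rival_w, epsilon=0.3)
```

In this toy setting a larger `epsilon` pushes the recognizer's margin further down, which mirrors the trade-off described in the abstract: a stronger perturbation obscures the speaker more reliably for the machine but risks a change that a human listener would notice.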
