ASCL May 18, 2020

Design Choices for X-vector Based Speaker Anonymization

arXiv:2005.08601v1 (81 citations)
AI Analysis

This work addresses privacy concerns in voice data, but its contribution is incremental: it refines an existing x-vector based anonymization scheme as a baseline for the first VoicePrivacy Challenge.

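The paper tackles speaker anonymization by exploring design choices for an x-vector based scheme: the distance metric between speakers, the region of x-vector space from which the pseudo-speaker is picked, and gender selection. It searches for the combination that best balances privacy and utility, measuring privacy by the attackers' Equal Error Rate (EER) and utility by the decoding Word Error Rate (WER) in experiments on datasets derived from LibriSpeech.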

The recently proposed x-vector based anonymization scheme converts any input voice into that of a random pseudo-speaker. In this paper, we present a flexible pseudo-speaker selection technique as a baseline for the first VoicePrivacy Challenge. We explore several design choices for the distance metric between speakers, the region of x-vector space where the pseudo-speaker is picked, and gender selection. To assess the strength of anonymization achieved, we consider attackers using an x-vector based speaker verification system who may use original or anonymized speech for enrollment, depending on their knowledge of the anonymization scheme. The Equal Error Rate (EER) achieved by the attackers and the decoding Word Error Rate (WER) over anonymized data are reported as the measures of privacy and utility. Experiments are performed using datasets derived from LibriSpeech to find the optimal combination of design choices in terms of privacy and utility.
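The three design choices named in the abstract (distance metric, region of x-vector space, gender selection) can be illustrated with a minimal sketch. Everything below is an assumption made for illustration and not the paper's implementation: the function names, the parameters `n_candidates` and `n_average`, the `"near"`/`"far"` region labels, the averaging step, and the use of cosine distance as a stand-in metric (the abstract does not name the metrics it compares). X-vectors are represented as plain Python lists of floats.

```python
import math
import random


def cosine_distance(a, b):
    """Cosine distance between two vectors (stand-in metric, an assumption)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / (norm_a * norm_b)


def select_pseudo_speaker(source, pool, n_candidates, n_average,
                          region="far", target_gender=None, rng=None):
    """Hypothetical pseudo-speaker selection.

    source: x-vector of the speaker to anonymize.
    pool: list of (gender, x-vector) pairs from an external speaker pool.
    region: "far" ranks pool speakers from farthest to nearest,
            "near" from nearest to farthest (illustrative labels).
    target_gender: keep only pool speakers of this gender, or None for all.
    """
    rng = rng or random.Random()

    # Gender selection: restrict the pool if a target gender is requested.
    candidates = [vec for gender, vec in pool
                  if target_gender is None or gender == target_gender]
    if len(candidates) < n_candidates:
        raise ValueError("pool too small for the requested region size")

    # Region selection: rank the pool by distance to the source speaker
    # and keep the n_candidates most extreme entries.
    ranked = sorted(candidates,
                    key=lambda vec: cosine_distance(source, vec),
                    reverse=(region == "far"))
    region_vectors = ranked[:n_candidates]

    # Randomly pick n_average vectors from the region and average them
    # dimension by dimension to form the pseudo-speaker x-vector.
    chosen = rng.sample(region_vectors, n_average)
    dim = len(source)
    return [sum(vec[i] for vec in chosen) / n_average for i in range(dim)]
```

In this sketch, calling `select_pseudo_speaker(source, pool, n_candidates=2, n_average=2, region="far", target_gender="m")` would average the two male pool speakers farthest from `source`. Swapping `cosine_distance` for another metric, or changing `region`, corresponds to varying one design choice while holding the others fixed.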
