CR SD AS Apr 16, 2020

Voice-Indistinguishability: Protecting Voiceprint in Privacy-Preserving Speech Data Release

Yaowei Han, Sheng Li, Yang Cao, Qiang Ma, Masatoshi Yoshikawa

arXiv:2004.07442v1 21.753 citations

Originality: Incremental advance

AI Analysis

This work addresses privacy concerns for users of smart devices such as the Amazon Echo and Apple HomePod by providing a formal privacy definition for voiceprints. The contribution appears incremental, since it extends existing differential privacy concepts to speech data.

The authors tackle the problem of protecting speaker identity (voiceprint) when speech data are released. They propose a privacy metric called voice-indistinguishability, based on differential privacy, and develop release mechanisms that their experiments on public datasets show to be effective and efficient.

With the development of smart devices, such as the Amazon Echo and Apple's HomePod, speech data have become a new dimension of big data. However, privacy and security concerns may hinder the collection and sharing of real-world speech data, which contain the speaker's identifiable information, i.e., voiceprint, which is considered a type of biometric identifier. Current studies on voiceprint privacy protection do not provide either a meaningful privacy-utility trade-off or a formal and rigorous definition of privacy. In this study, we design a novel and rigorous privacy metric for voiceprint privacy, which is referred to as voice-indistinguishability, by extending differential privacy. We also propose mechanisms and frameworks for privacy-preserving speech data release satisfying voice-indistinguishability. Experiments on public datasets verify the effectiveness and efficiency of the proposed methods.
