Speaker anonymisation using the McAdams coefficient
This work addresses privacy concerns in speech processing by providing an efficient, training-free method for speaker anonymisation. The contribution is incremental in that it builds on existing signal processing techniques.
The paper tackles speaker anonymisation: manipulating speech signals to hinder automatic speaker recognition while preserving intelligibility and naturalness. It uses the McAdams coefficient to transform the spectral envelope of speech signals. On the VoicePrivacy 2020 databases and protocols, random, optimised transformations outperform competing solutions in anonymisation with only modest additional degradation to intelligibility, even against a semi-informed privacy adversary.
Anonymisation has the goal of manipulating speech signals in order to degrade the reliability of automatic approaches to speaker recognition, while preserving other aspects of speech, such as those relating to intelligibility and naturalness. This paper reports an approach to anonymisation that, unlike other current approaches, requires no training data, is based upon well-known signal processing techniques and is both efficient and effective. The proposed solution uses the McAdams coefficient to transform the spectral envelope of speech signals. Results derived using common VoicePrivacy 2020 databases and protocols show that random, optimised transformations can outperform competing solutions in terms of anonymisation while causing only modest, additional degradations to intelligibility, even in the case of a semi-informed privacy adversary.
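The abstract says only that the McAdams coefficient is used to transform the spectral envelope. As a rough illustration, the sketch below assumes the envelope of one speech frame is described by an all-pole (LPC) polynomial and that the coefficient, here called alpha, is applied as an exponent to the angle of each complex pole while real poles and pole radii are left unchanged. The function name mcadams_shift_poles, the example pole values, and the sampling range for alpha are hypothetical choices for illustration, not values taken from the paper.

```python
import numpy as np

def mcadams_shift_poles(lpc_coeffs, alpha):
    """Return LPC coefficients whose complex pole angles are raised to the power alpha.

    Assumption: lpc_coeffs is the denominator [1, a1, ..., ap] of an all-pole
    model of one frame. Pole magnitudes and real poles are kept as they are.
    """
    poles = np.roots(lpc_coeffs)
    angles = np.angle(poles)
    is_complex = np.abs(poles.imag) > 1e-9
    # Apply the exponent to |angle| and restore the sign so that
    # conjugate pairs stay conjugate and the result has real coefficients.
    new_angles = np.where(is_complex,
                          np.sign(angles) * np.abs(angles) ** alpha,
                          angles)
    new_poles = np.abs(poles) * np.exp(1j * new_angles)
    return np.real(np.poly(new_poles))

# Illustrative frame model: one conjugate pole pair plus one real pole.
example_poles = np.array([0.9 * np.exp(1j * 0.5),
                          0.9 * np.exp(-1j * 0.5),
                          0.5])
example_lpc = np.real(np.poly(example_poles))

# Random transformation: draw alpha from a hypothetical range.
rng = np.random.default_rng(0)
alpha = rng.uniform(0.5, 0.9)
shifted_lpc = mcadams_shift_poles(example_lpc, alpha)
```

Under this assumption, alpha = 1 leaves the envelope untouched. For alpha below 1, pole angles smaller than 1 radian move up and larger ones move down, which shifts the resonance positions of the envelope. A full system would additionally need frame-wise LPC analysis, resynthesis from the residual, and overlap-add, none of which is shown here.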