On the invertibility of a voice privacy system using embedding alignment
The finding that anonymized speaker embeddings can be partially inverted reveals a vulnerability in voice privacy systems, an incremental result for security and privacy applications.
The paper examines the robustness of voice anonymization by showing that a complex anonymization system can be approximated by a reversible rotation, which allows recovery of up to 62% of speaker identities from anonymized embeddings.
This paper explores various attack scenarios on a voice anonymization system using embedding alignment techniques. We use Procrustes analysis or Wasserstein-Procrustes (an algorithm initially designed for unsupervised translation) to match two sets of x-vectors, taken before and after voice anonymization, and to approximate the anonymization transformation as a rotation. We compute the optimal rotation and compare the results of this approximation to the official Voice Privacy Challenge results. We show that a complex system like the Voice Privacy Challenge baseline can be approximated by a rotation estimated from a limited set of x-vectors. This paper thus studies the space of solutions for voice anonymization within the specific scope of rotations. Because rotations are reversible, the proposed method can recover up to 62% of the speaker identities from anonymized embeddings.
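The rotation-based attack described above can be illustrated with a minimal sketch of orthogonal Procrustes analysis in the paired (supervised) case. This is not the authors' implementation: the data are synthetic Gaussian vectors standing in for x-vectors, the "anonymizer" is assumed to be a random orthogonal map plus noise, and the names `fit_rotation`, `invert`, and `identification_rate` are hypothetical. The unsupervised Wasserstein-Procrustes variant, which must also estimate the pairing, is not shown. The closed-form solution yields an orthogonal matrix, which may include a reflection as well as a rotation.

```python
import numpy as np


def fit_rotation(original, anonymized):
    """Orthogonal Procrustes: orthogonal W minimizing ||original @ W - anonymized||_F."""
    u, _, vt = np.linalg.svd(original.T @ anonymized)
    return u @ vt


def invert(anonymized, rotation):
    """Undo the estimated map; the inverse of an orthogonal matrix is its transpose."""
    return anonymized @ rotation.T


def identification_rate(recovered, enrolled):
    """Fraction of recovered vectors whose nearest enrolled vector (cosine) is the true one."""
    r = recovered / np.linalg.norm(recovered, axis=1, keepdims=True)
    e = enrolled / np.linalg.norm(enrolled, axis=1, keepdims=True)
    nearest = np.argmax(r @ e.T, axis=1)
    return float(np.mean(nearest == np.arange(len(recovered))))


# Synthetic stand-in for x-vectors (assumption: not real speaker embeddings).
rng = np.random.default_rng(0)
dim, n_speakers, n_train = 16, 200, 50
x = rng.standard_normal((n_speakers, dim))

# Assumed toy "anonymizer": a random orthogonal map plus small noise.
true_rot, _ = np.linalg.qr(rng.standard_normal((dim, dim)))
y = x @ true_rot + 0.05 * rng.standard_normal((n_speakers, dim))

# Estimate the map from a limited set of pairs, then attack the held-out speakers.
w = fit_rotation(x[:n_train], y[:n_train])
recovered = invert(y[n_train:], w)
rate = identification_rate(recovered, x[n_train:])
print(f"held-out identification rate: {rate:.2f}")
```

In this toy setting the anonymizer is a rotation by construction, so the recovery rate says nothing about a real system; the sketch only shows why an approximately rotational transformation is invertible once the rotation is estimated.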