Beyond Neural-on-Neural Approaches to Speaker Gender Protection
This work addresses individual privacy by making gender protection in speech more robust and more interpretable. The contribution is incremental, since it builds on existing methods.
The paper addresses how speaker gender protection algorithms are evaluated. It demonstrates the need to test them against non-neural attacks and to compare them with human-executed voice adaptations, which improves interpretability and broadens the range of defense strategies.
Recent research has proposed approaches that modify speech to defend against gender inference attacks. The goal of these protection algorithms is to control the availability of information about a speaker's gender, a privacy-sensitive attribute. Currently, the common practice for developing and testing gender protection algorithms is "neural-on-neural", i.e., perturbations are generated and tested with a neural network. In this paper, we propose to go beyond this practice to strengthen the study of gender protection. First, we demonstrate the importance of testing gender inference attacks that are based on speech features historically developed by speech scientists, alongside the conventionally used neural classifiers. Next, we argue that researchers should use speech features to gain insight into how protective modifications change the speech signal. Finally, we point out that gender-protection algorithms should be compared with novel "vocal adversaries", human-executed voice adaptations, in order to improve interpretability and enable before-the-mic protection.
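To make the idea of a feature-based (non-neural) gender inference attack concrete, the following is a minimal sketch and not the attack evaluated in the paper. It assumes that fundamental frequency (F0) is the speech feature, estimates it with a plain autocorrelation peak search, and applies a fixed threshold. The function names, the 165 Hz threshold, the 60-400 Hz search range, and the synthetic test signal are all illustrative assumptions; a realistic attack would use recorded speech, frame-level voicing decisions, and a threshold or classifier fitted to data.

```python
import math


def autocorrelation(x, lag):
    """Unnormalised autocorrelation of the sequence x at a single lag."""
    return sum(x[n] * x[n + lag] for n in range(len(x) - lag))


def estimate_f0(signal, sample_rate, f0_min=60.0, f0_max=400.0):
    """Estimate F0 in Hz as the lag with the largest autocorrelation.

    The search range [f0_min, f0_max] is an assumed plausible range for
    adult speech, not a value taken from the paper.
    """
    mean = sum(signal) / len(signal)
    x = [s - mean for s in signal]  # remove DC offset
    lag_min = int(sample_rate / f0_max)
    lag_max = int(sample_rate / f0_min)
    best_lag = max(range(lag_min, lag_max + 1),
                   key=lambda lag: autocorrelation(x, lag))
    return sample_rate / best_lag


def f0_threshold_attack(signal, sample_rate, threshold_hz=165.0):
    """Hypothetical attack: assign a class by comparing F0 to a threshold.

    threshold_hz is an illustrative assumption, not a fitted value.
    """
    f0 = estimate_f0(signal, sample_rate)
    label = "higher-F0 class" if f0 >= threshold_hz else "lower-F0 class"
    return label, f0


def synth_voiced(f0, sample_rate=16000, duration=0.2, n_harmonics=5):
    """Synthetic stand-in for a voiced segment: a sum of decaying harmonics."""
    n_samples = int(sample_rate * duration)
    return [
        sum(math.sin(2 * math.pi * f0 * k * n / sample_rate) / k
            for k in range(1, n_harmonics + 1))
        for n in range(n_samples)
    ]


if __name__ == "__main__":
    for true_f0 in (125.0, 200.0):
        label, f0 = f0_threshold_attack(synth_voiced(true_f0), 16000)
        print(f"true F0 = {true_f0:.0f} Hz, estimated = {f0:.1f} Hz, {label}")
```

Running the same function on a signal before and after a protective modification gives a simple, interpretable check: if the estimated F0 still falls clearly on one side of the threshold, the modification has left this feature, and therefore this attack, largely intact.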