Adversarial Privacy Protection on Speech Enhancement
This work addresses privacy protection against malicious speech extraction, representing a novel application of adversarial attacks in the speech domain.
The paper tackles the problem of private speech content being maliciously extracted via speech enhancement systems by proposing an adversarial method to degrade these systems, reaching a word error rate (WER) of up to 89.0% for content erasure and a WER as low as 33.75% against the target content for targeted attacks.
Speech can easily be leaked without the speaker noticing, for example when it is recorded by a mobile phone. Private content in such recordings may then be maliciously extracted with speech enhancement technology. Speech enhancement has advanced rapidly along with deep neural networks (DNNs), but adversarial examples can cause DNNs to fail. In this work, we propose an adversarial method to degrade speech enhancement systems. Experimental results show that, after speech enhancement, the generated adversarial examples have most of the original content information erased or replaced with the target speech content. The word error rate (WER) between the recognition results of the enhanced original example and the enhanced adversarial example can reach 89.0%. For the targeted attack, the WER between the enhanced adversarial example and the target example is as low as 33.75%. The adversarial perturbation can bring the rate of change relative to the original example to more than 1.4430. This work can prevent the malicious extraction of speech.
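Since both headline results are reported as WER, a minimal sketch of how that metric is commonly computed may help in reading them. It is the standard word-level edit distance (substitutions, deletions, insertions) divided by the number of reference words. This is not the authors' evaluation code: the function name `word_error_rate`, the whitespace tokenization, and the choice of which transcript serves as the reference are all assumptions made for illustration.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level Levenshtein distance divided by the reference length.

    Hypothetical helper for illustration; tokenization is plain
    whitespace splitting, which is an assumption.
    """
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference transcript must contain at least one word")

    # prev[j] holds the edit distance between the reference words
    # processed so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1      # reference word missing from hypothesis
            insertion = curr[j - 1] + 1  # extra word in hypothesis
            curr[j] = min(substitution, deletion, insertion)
        prev = curr
    return prev[-1] / len(ref)


# Assumed usage: compare the transcript of the enhanced original example
# (treated here as the reference) with that of the enhanced adversarial example.
wer = word_error_rate("turn the lights off", "turn the light on")
```

Under this definition, a high WER between the enhanced original and the enhanced adversarial transcripts means the original content was largely erased, while a low WER against the target transcript means the targeted attack succeeded. Because insertions are counted, WER can exceed 1.0 when the hypothesis is much longer than the reference.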