Defense Against Adversarial Attacks on Audio DeepFake Detection
This work addresses the security threat posed by audio DeepFakes by improving detector robustness against adversarial attacks. The contribution is incremental, since it builds on existing detection methods.
The paper addresses the degradation of audio DeepFake detection performance under adversarial attacks. It evaluates three detection architectures in white-box and transferability scenarios, then enhances their robustness through adversarial training with a novel adaptive method, achieving improved detection rates (e.g., adversarial training reduced error rates by up to 15% in some cases).
Audio DeepFakes (DF) are artificially generated utterances created using deep learning, with the primary aim of fooling listeners in a highly convincing manner. Their quality is sufficient to pose a severe threat to security and privacy, for example by undermining the reliability of news or enabling defamation. Multiple neural network-based methods for detecting generated speech have been proposed to counter these threats. In this work, we cover adversarial attacks, which decrease the performance of detectors by adding subtle changes (difficult for a human to spot) to the input data. Our contributions include evaluating the robustness of three detection architectures against adversarial attacks in two scenarios (white-box and transferability-based) and subsequently enhancing it through adversarial training using our novel adaptive training method. Moreover, one of the investigated architectures is RawNet3, which, to the best of our knowledge, we are the first to adapt to DeepFake detection.
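To make the notion of an adversarial attack and of adversarial training concrete, the following is a minimal sketch, not the method evaluated in this work. It assumes a toy logistic-regression "detector" over raw samples in place of a neural network, uses the one-step Fast Gradient Sign Method (FGSM) as a stand-in white-box attack, and shows a plain (non-adaptive) adversarial training step. All names (fgsm_perturb, adversarial_training_step, eps, lr) and all parameter values are hypothetical and chosen only for illustration.

```python
import numpy as np


def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))


def bce_loss(x, y, w, b):
    """Binary cross-entropy of the toy detector on one utterance."""
    p = sigmoid(x @ w + b)
    return -(y * np.log(p) + (1.0 - y) * np.log(1.0 - p))


def fgsm_perturb(x, y, w, b, eps):
    """One-step FGSM: move each sample by eps in the direction that
    increases the detector's loss (white-box: w and b are known)."""
    grad_x = (sigmoid(x @ w + b) - y) * w  # d(loss)/d(x) for logistic regression
    return np.clip(x + eps * np.sign(grad_x), -1.0, 1.0)  # keep a valid waveform range


def adversarial_training_step(x, y, w, b, eps, lr):
    """Plain adversarial training (assumption: not the adaptive variant):
    craft an adversarial example, then take a gradient step on it."""
    x_adv = fgsm_perturb(x, y, w, b, eps)
    err = sigmoid(x_adv @ w + b) - y
    return w - lr * err * x_adv, b - lr * err


# Toy data: a random "waveform" labelled as fake (y = 1).
rng = np.random.default_rng(0)
n = 1600
w = rng.normal(scale=0.01, size=n)  # hypothetical detector weights
b = 0.0
x = rng.uniform(-0.5, 0.5, size=n)
y = 1.0
eps = 0.002  # per-sample perturbation budget

x_adv = fgsm_perturb(x, y, w, b, eps)
score_clean = sigmoid(x @ w + b)    # detector's "fake" score on the clean input
score_adv = sigmoid(x_adv @ w + b)  # score after the attack
w_new, b_new = adversarial_training_step(x, y, w, b, eps, lr=0.01)
```

In this sketch the perturbation never exceeds eps per sample, yet it pushes the detector's "fake" score downward. The training step then updates the weights on the perturbed input, which is the basic idea that adversarial training builds on.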