LG CR Feb 21, 2023

Speech Privacy Leakage from Shared Gradients in Distributed Learning

arXiv:2302.10441v15.33 citations h-index: 16

Originality: Incremental advance

AI Analysis

This work addresses privacy risks for users of speech-based distributed learning applications, but the advance is incremental: it extends known gradient leakage attacks from images to speech.

The paper tackles privacy leakage from shared gradients in distributed learning for speech analysis. It demonstrates that speech content and speaker identity can be recovered, and it quantifies the leakage by measuring the similarity between the original and recovered signals.

Distributed machine learning paradigms, such as federated learning, have been recently adopted in many privacy-critical applications for speech analysis. However, such frameworks are vulnerable to privacy leakage attacks from shared gradients. Despite extensive efforts in the image domain, the exploration of speech privacy leakage from gradients is quite limited. In this paper, we explore methods for recovering private speech/speaker information from the shared gradients in distributed learning settings. We conduct experiments on a keyword spotting model with two different types of speech features to quantify the amount of leaked information by measuring the similarity between the original and recovered speech signals. We further demonstrate the feasibility of inferring various levels of side-channel information, including speech content and speaker identity, under the distributed learning framework without accessing the user's data.
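To make the idea of leakage from shared gradients concrete, here is a minimal sketch on a toy model. It is not the paper's attack or its keyword spotting model: it assumes a single linear softmax classifier with a bias term, one training example per update, and a hypothetical 40-dimensional feature vector standing in for a speech feature frame. Under those assumptions the gradient with respect to the weights is the outer product of the logit error and the input, so anyone who sees the gradients can read off both the label and the input. Cosine similarity is used here as an illustrative similarity measure; the names `client_gradients` and `recover_from_gradients` are made up for this example.

```python
import numpy as np

def softmax(z):
    z = z - z.max()               # shift for numerical stability
    e = np.exp(z)
    return e / e.sum()

def client_gradients(W, b, x, label):
    """Cross-entropy gradients for one example on a linear classifier."""
    p = softmax(W @ x + b)
    y = np.zeros_like(p)
    y[label] = 1.0
    delta = p - y                 # dL/dlogits
    return np.outer(delta, x), delta   # dL/dW, dL/db

def recover_from_gradients(grad_W, grad_b):
    """Observer side: infer label and input from the shared gradients alone."""
    # Only the true class has a negative bias gradient (p_label - 1 < 0).
    label = int(np.argmin(grad_b))
    # Row `label` of dL/dW equals grad_b[label] * x, so dividing recovers x.
    x_hat = grad_W[label] / grad_b[label]
    return x_hat, label

# Toy setup (all sizes and values are assumptions for illustration).
rng = np.random.default_rng(0)
num_classes, feat_dim = 10, 40
W = rng.normal(scale=0.1, size=(num_classes, feat_dim))
b = np.zeros(num_classes)
x_private = rng.normal(size=feat_dim)   # stands in for a private feature frame
true_label = 3                          # stands in for the spoken keyword

grad_W, grad_b = client_gradients(W, b, x_private, true_label)
x_hat, label_hat = recover_from_gradients(grad_W, grad_b)

cosine = float(x_hat @ x_private /
               (np.linalg.norm(x_hat) * np.linalg.norm(x_private)))
print("recovered label:", label_hat)
print("cosine similarity:", cosine)
```

Deeper models and batched updates do not admit this closed-form recovery. Attacks in that setting typically optimize a dummy input until its gradients match the shared ones, and the recovered signal is then only approximately similar to the original.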
