LG, CR, Feb 21, 2023

Speech Privacy Leakage from Shared Gradients in Distributed Learning

arXiv:2302.10441v1, 3 citations, h-index: 16
Originality: Incremental advance
AI Analysis

This paper addresses privacy risks for users of speech-based distributed learning applications, but the advance is incremental: it extends known gradient leakage attacks from the image domain to speech.

The paper tackles privacy leakage from shared gradients in distributed learning for speech analysis. It demonstrates that speech content and speaker identity can be recovered, and it quantifies the leakage by measuring the similarity between the original and recovered signals.

Distributed machine learning paradigms, such as federated learning, have been recently adopted in many privacy-critical applications for speech analysis. However, such frameworks are vulnerable to privacy leakage attacks from shared gradients. Despite extensive efforts in the image domain, the exploration of speech privacy leakage from gradients is quite limited. In this paper, we explore methods for recovering private speech/speaker information from the shared gradients in distributed learning settings. We conduct experiments on a keyword spotting model with two different types of speech features to quantify the amount of leaked information by measuring the similarity between the original and recovered speech signals. We further demonstrate the feasibility of inferring various levels of side-channel information, including speech content and speaker identity, under the distributed learning framework without accessing the user's data.
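To illustrate why shared gradients can expose a user's input, here is a minimal sketch. It is not the paper's method. It assumes a single training sample, a single dense layer with a bias, and a softmax cross-entropy loss. Under those assumptions the input can be read off the gradients in closed form, and the label (a stand-in for the spoken keyword) is the only class with a negative bias gradient. All names and sizes below (`client_gradients`, `recover_input`, 40 features, 10 keywords) are hypothetical and chosen only for illustration.

```python
import numpy as np

def softmax(z):
    e = np.exp(z - z.max())  # subtract the max for numerical stability
    return e / e.sum()

def client_gradients(W, b, x, label):
    """Gradients a client would share for one sample through z = W @ x + b."""
    p = softmax(W @ x + b)
    dz = p.copy()
    dz[label] -= 1.0            # dL/dz for cross-entropy with a one-hot target
    return np.outer(dz, x), dz  # dL/dW and dL/db

def recover_input(grad_W, grad_b):
    """Row i of dL/dW equals dL/db[i] * x, so one division recovers x."""
    i = np.argmax(np.abs(grad_b))  # use the row with the largest bias gradient
    return grad_W[i] / grad_b[i]

rng = np.random.default_rng(0)
n_features, n_keywords = 40, 10          # hypothetical sizes
W = rng.normal(size=(n_keywords, n_features))
b = rng.normal(size=n_keywords)
x_private = rng.normal(size=n_features)  # stand-in for a private speech feature vector

# The attacker sees only the gradients, never x_private or the label.
grad_W, grad_b = client_gradients(W, b, x_private, label=3)

x_recovered = recover_input(grad_W, grad_b)
inferred_label = int(np.argmin(grad_b))  # only the true class has a negative entry
cosine = x_recovered @ x_private / (
    np.linalg.norm(x_recovered) * np.linalg.norm(x_private)
)
```

The cosine similarity here plays the role of a similarity measure between original and recovered features. Deeper models and batched updates generally have no such closed form, which is why attacks in that setting typically rely on iterative optimization instead.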

