SD | LG | NE | AS | Oct 11, 2022

An Experimental Study on Private Aggregation of Teacher Ensemble Learning for End-to-End Speech Recognition

Georgia Tech
arXiv:2210.05614v2 | 3 citations | h-index: 73 | Has Code
Originality: Incremental advance
AI Analysis

This work addresses privacy-preserving training for speech recognition, offering a solution for scenarios requiring strict data protection, though it is incremental as it adapts an existing method to a new domain.

The paper tackles the degradation of automatic speech recognition (ASR) performance under differential privacy constraints by extending Private Aggregation of Teacher Ensemble (PATE) learning to dynamic speech patterns. For an RNN transducer model on LibriSpeech, this yields relative word error rate reductions of 26.2% to 27.5% over the benchmark DP-SGD mechanisms.
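For readers unfamiliar with the metric, a relative word error rate reduction is the drop in WER expressed as a fraction of the baseline WER. The sketch below shows the arithmetic only; the function name and the example numbers are hypothetical and are not taken from the paper's results.

```python
def relative_wer_reduction(baseline_wer: float, new_wer: float) -> float:
    """Return the relative WER reduction in percent.

    Both arguments are word error rates in the same unit (e.g. percent).
    """
    if baseline_wer <= 0:
        raise ValueError("baseline_wer must be positive")
    return 100.0 * (baseline_wer - new_wer) / baseline_wer


# Hypothetical values for illustration only: a baseline at 20.0% WER
# and an improved system at 15.0% WER.
print(relative_wer_reduction(20.0, 15.0))
```

A relative reduction is scale-dependent on the baseline: the same absolute gain counts for more when the baseline WER is already low.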

Differential privacy (DP) is one data protection avenue to safeguard user information used for training deep models by imposing noisy distortion on privacy data. Such a noise perturbation often results in a severe performance degradation in automatic speech recognition (ASR) in order to meet a privacy budget $\varepsilon$. Private aggregation of teacher ensemble (PATE) utilizes ensemble probabilities to improve ASR accuracy when dealing with the noise effects controlled by small values of $\varepsilon$. We extend PATE learning to work with dynamic patterns, namely speech utterances, and perform a first experimental demonstration that it prevents acoustic data leakage in ASR training. We evaluate three end-to-end deep models, including LAS, hybrid CTC/attention, and RNN transducer, on the open-source LibriSpeech and TIMIT corpora. PATE learning-enhanced ASR models outperform the benchmark DP-SGD mechanisms, especially under strict DP budgets, giving relative word error rate reductions between 26.2% and 27.5% for an RNN transducer model evaluated with LibriSpeech. We also introduce a DP-preserving ASR solution for pretraining on public speech corpora.
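The aggregation idea behind PATE can be illustrated with a minimal sketch. It follows the classic noisy-vote formulation (Laplace noise added to per-class teacher vote counts, then an argmax), applied independently to each frame of an utterance. This is an assumption made for illustration: the abstract says the method uses ensemble probabilities, and the paper's actual aggregation and privacy accounting may differ. The names `noisy_aggregate`, `teacher_votes`, and `noise_scale` are hypothetical.

```python
import random
from collections import Counter


def laplace(scale: float, rng: random.Random) -> float:
    # The difference of two i.i.d. exponentials with mean `scale`
    # is Laplace-distributed with location 0 and that scale.
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def noisy_aggregate(teacher_votes, n_classes, noise_scale, rng):
    """Aggregate per-frame teacher labels with Laplace-noised vote counts.

    teacher_votes[t][f] is the label teacher t assigns to frame f.
    A larger noise_scale gives stronger privacy but noisier labels.
    """
    n_frames = len(teacher_votes[0])
    labels = []
    for f in range(n_frames):
        counts = Counter(votes[f] for votes in teacher_votes)
        noisy = [counts.get(c, 0) + laplace(noise_scale, rng)
                 for c in range(n_classes)]
        # The student is trained on the class with the highest noisy count.
        labels.append(max(range(n_classes), key=noisy.__getitem__))
    return labels


# Three hypothetical teachers labelling a two-frame utterance over 4 classes.
votes = [[1, 3], [1, 3], [2, 3]]
print(noisy_aggregate(votes, n_classes=4, noise_scale=0.5,
                      rng=random.Random(0)))
```

Each teacher is trained on a disjoint partition of the private data, so a single user's recordings can influence at most one vote per frame; the added noise then masks that influence in the labels passed to the student model.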

