PATE-AAE: Incorporating Adversarial Autoencoder into Private Aggregation of Teacher Ensembles for Spoken Command Classification
This work addresses privacy concerns in speech applications for users and developers, but it is incremental as it modifies an existing framework by replacing GAN with AAE.
The paper tackled the problem of ensuring differential privacy in speech command classification when public non-sensitive data is scarce, by proposing PATE-AAE, which improved average classification accuracy by +2.11% and +6.60% compared to alternative privacy-preserving methods while maintaining strong privacy at ε=0.01.
We propose using an adversarial autoencoder (AAE) to replace generative adversarial network (GAN) in the private aggregation of teacher ensembles (PATE), a solution for ensuring differential privacy in speech applications. The AAE architecture allows us to obtain good synthetic speech leveraging upon a discriminative training of latent vectors. Such synthetic speech is used to build a privacy-preserving classifier when non-sensitive data is not sufficiently available in the public domain. This classifier follows the PATE scheme that uses an ensemble of noisy outputs to label the synthetic samples and guarantee $\varepsilon$-differential privacy (DP) on its derived classifiers. Our proposed framework thus consists of an AAE-based generator and a PATE-based classifier (PATE-AAE). Evaluated on the Google Speech Commands Dataset Version II, the proposed PATE-AAE improves the average classification accuracy by +$2.11\%$ and +$6.60\%$, respectively, when compared with alternative privacy-preserving solutions, namely PATE-GAN and DP-GAN, while maintaining a strong level of privacy target at $\varepsilon$=0.01 with a fixed $δ$=10$^{-5}$.