CL, CR, LG · May 8, 2025

Privacy-Preserving Transformers: SwiftKey's Differential Privacy Implementation

Microsoft
arXiv:2505.05648v1 · 1 citation · h-index: 3
Originality: Synthesis-oriented
AI Analysis

This work addresses privacy concerns in mobile keyboard applications, though it appears incremental as it adapts existing methods to a specific use case.

The authors train a privacy-preserving language model for mobile keyboards by applying differential privacy to a transformer architecture. They report small but consistent gains in next-word prediction accuracy over the existing GRU model, at the cost of manageable increases in memory use and run time.

In this paper, we train a transformer using differential privacy (DP) for language modeling in SwiftKey. We run multiple experiments to balance the trade-off between model size, run-time speed, and accuracy. We show that we get small and consistent gains in next-word prediction and accuracy, with a graceful increase in memory and speed, compared to the production GRU. This is obtained by scaling down a GPT-2 architecture to fit the required size and by a two-stage training process that builds a seed model on general data and then fine-tunes it with DP on typing data. The transformer is integrated using ONNX, offering both flexibility and efficiency.
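The abstract does not say which DP mechanism is used for fine-tuning. A common choice is DP-SGD, which clips each example's gradient to a fixed norm and adds Gaussian noise before averaging. The sketch below illustrates that general technique on a toy two-parameter model in plain Python. It is an assumption for illustration, not the authors' implementation; the names `clip_gradient` and `dp_sgd_step` and the values for `max_norm`, `noise_multiplier`, and `lr` are hypothetical.

```python
import math
import random


def clip_gradient(grad, max_norm):
    """Scale one example's gradient so its L2 norm is at most max_norm."""
    norm = math.sqrt(sum(g * g for g in grad))
    scale = min(1.0, max_norm / norm) if norm > 0 else 1.0
    return [g * scale for g in grad]


def dp_sgd_step(weights, per_example_grads, lr, max_norm, noise_multiplier, rng):
    """One DP-SGD update: clip per example, sum, add Gaussian noise, average."""
    clipped = [clip_gradient(g, max_norm) for g in per_example_grads]
    batch_size = len(clipped)
    # The noise standard deviation is tied to the clipping bound, which
    # limits how much any single example can influence the update.
    sigma = noise_multiplier * max_norm
    new_weights = []
    for i, w in enumerate(weights):
        total = sum(g[i] for g in clipped)
        noisy_total = total + rng.gauss(0.0, sigma)
        new_weights.append(w - lr * noisy_total / batch_size)
    return new_weights


# Toy usage: two per-example gradients with L2 norms 5.0 and 0.5.
# Hypothetical hyperparameters, chosen only for illustration.
rng = random.Random(0)
weights = [0.0, 0.0]
grads = [[3.0, 4.0], [0.3, 0.4]]
updated = dp_sgd_step(weights, grads, lr=0.1, max_norm=1.0,
                      noise_multiplier=1.1, rng=rng)
```

In the two-stage setup the abstract describes, a step of this kind would apply only to the second stage on typing data; the seed model on general data would be trained without the clipping and noise. The privacy guarantee actually achieved depends on the noise multiplier, the sampling rate, and the number of steps, which the abstract does not state.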

Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.
