LG · CR · Nov 14, 2023

Sparsity-Preserving Differentially Private Training of Large Embedding Models

arXiv:2311.08357v1 · 7 citations · h-index: 31
Originality: Incremental advance
AI Analysis

This work addresses privacy concerns in recommendation systems and language applications, offering an incremental improvement over DP-SGD: it preserves gradient sparsity so that private training stays efficient.

The paper tackles the problem of gradient sparsity loss in differentially private training of large embedding models, presenting DP-FEST and DP-AdaFEST algorithms that achieve a 10^6× reduction in gradient size while maintaining comparable accuracy on real-world datasets.

As the use of large embedding models in recommendation systems and language applications increases, concerns over user data privacy have also risen. DP-SGD, a training algorithm that combines differential privacy with stochastic gradient descent, has been the workhorse in protecting user privacy without compromising model accuracy by much. However, applying DP-SGD naively to embedding models can destroy gradient sparsity, leading to reduced training efficiency. To address this issue, we present two new algorithms, DP-FEST and DP-AdaFEST, that preserve gradient sparsity during private training of large embedding models. Our algorithms achieve substantial reductions ($10^6 \times$) in gradient size, while maintaining comparable levels of accuracy, on benchmark real-world datasets.
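To make the sparsity problem concrete, here is a minimal sketch in plain Python of why a naive DP-SGD step densifies an embedding gradient. It is a toy illustration, not the authors' implementation and not DP-FEST or DP-AdaFEST. It assumes a single example, a hypothetical vocabulary of 10,000 rows with embedding dimension 4, and a hypothetical `noise_std`. All function names are invented for this example.

```python
import random


def clipped_sparse_gradient(active_rows, dim, clip_norm, rng):
    """Toy per-example embedding gradient: only rows of active features are non-zero.

    The gradient is stored sparsely as {row_index: vector} and clipped to
    L2 norm `clip_norm`, as DP-SGD does before adding noise.
    """
    grad = {row: [rng.gauss(0.0, 1.0) for _ in range(dim)] for row in active_rows}
    norm = sum(v * v for vec in grad.values() for v in vec) ** 0.5
    scale = min(1.0, clip_norm / norm) if norm > 0 else 1.0
    return {row: [v * scale for v in vec] for row, vec in grad.items()}


def naive_dp_sgd_noise(sparse_grad, vocab_size, dim, noise_std, rng):
    """Add Gaussian noise to every coordinate of the full embedding table gradient."""
    dense = [[0.0] * dim for _ in range(vocab_size)]
    for row, vec in sparse_grad.items():
        dense[row] = list(vec)
    return [[v + rng.gauss(0.0, noise_std) for v in vec] for vec in dense]


def nonzero_rows(dense):
    """Count rows that contain at least one non-zero entry."""
    return sum(1 for vec in dense if any(v != 0.0 for v in vec))


rng = random.Random(0)
VOCAB_SIZE, DIM = 10_000, 4  # hypothetical sizes, chosen only for illustration

# One example touches only a handful of embedding rows.
active = rng.sample(range(VOCAB_SIZE), 5)
grad = clipped_sparse_gradient(active, DIM, clip_norm=1.0, rng=rng)

# Naive DP-SGD perturbs every row, including those the example never touched.
noisy = naive_dp_sgd_noise(grad, VOCAB_SIZE, DIM, noise_std=1.0, rng=rng)

rows_before = len(grad)
rows_after = nonzero_rows(noisy)
print(f"non-zero rows before noise: {rows_before}, after noise: {rows_after}")
```

In this toy setup the update grows from a few rows to the whole table, so the optimizer must write to every embedding row at each step. That is the efficiency loss the abstract describes, and it is the cost that sparsity-preserving methods aim to avoid.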

