Equivariant Differentially Private Deep Learning: Why DP-SGD Needs Sparser Models
This work addresses the privacy-utility gap in differentially private deep learning for computer vision, offering an approach that is both more efficient and more accurate than existing methods.
The paper tackles the challenge of training deep learning models with Differentially Private Stochastic Gradient Descent (DP-SGD), which suffers from high computational costs and reduced accuracy, especially for large models. By using equivariant convolutional networks to design sparse models, the authors achieve up to 9% higher accuracy on CIFAR-10 and reduce computation time by more than 85%.
Differentially Private Stochastic Gradient Descent (DP-SGD) limits the amount of private information deep learning models can memorize during training. This is achieved by clipping the model's gradients and adding noise to them, and thus networks with more parameters require proportionally stronger perturbation. As a result, large models struggle to learn useful information, rendering training with DP-SGD exceedingly difficult on more challenging training tasks. Recent research has focused on combating this challenge through training adaptations such as heavy data augmentation and large batch sizes. However, these techniques further increase the computational overhead of DP-SGD and reduce its practical applicability. In this work, we propose using the principle of sparse model design to solve precisely such complex tasks with fewer parameters, higher accuracy, and in less time, thus serving as a promising direction for DP-SGD. We achieve this sparsity by design, introducing equivariant convolutional networks for model training with Differential Privacy. Using equivariant networks, we show that small and efficient architecture design can outperform current state-of-the-art models with substantially lower computational requirements. On CIFAR-10, we achieve an increase of up to $9\%$ in accuracy while reducing the computation time by more than $85\%$. Our results are a step towards efficient model architectures that make optimal use of their parameters and bridge the privacy-utility gap between private and non-private deep learning for computer vision.
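The clip-and-noise mechanism described in the abstract can be illustrated with a minimal NumPy sketch of a single DP-SGD update. This is not the authors' implementation: the function name `dp_sgd_step` and the parameters `clip_norm`, `noise_multiplier`, and `lr` are hypothetical names chosen for illustration, and the sketch assumes the per-sample gradients have already been computed and flattened into one row per example.

```python
import numpy as np

def dp_sgd_step(params, per_sample_grads, clip_norm, noise_multiplier, lr, rng):
    """One illustrative DP-SGD update on a flat parameter vector.

    per_sample_grads has shape (batch_size, num_params).
    All argument names are assumptions made for this sketch.
    """
    # L2 norm of each example's gradient, shape (batch_size, 1).
    norms = np.linalg.norm(per_sample_grads, axis=1, keepdims=True)
    # Scale down any gradient whose norm exceeds clip_norm; others are unchanged.
    scale = np.minimum(1.0, clip_norm / np.maximum(norms, 1e-12))
    clipped = per_sample_grads * scale
    # Gaussian noise with standard deviation noise_multiplier * clip_norm,
    # drawn independently for every parameter.
    noise = rng.normal(0.0, noise_multiplier * clip_norm, size=params.shape)
    # Average the noisy sum over the batch and take a gradient step.
    noisy_mean = (clipped.sum(axis=0) + noise) / per_sample_grads.shape[0]
    return params - lr * noisy_mean

# Usage: two examples, two parameters, noise switched off to expose the clipping.
rng = np.random.default_rng(0)
grads = np.array([[3.0, 4.0], [0.0, 0.5]])  # norms 5.0 and 0.5
new_params = dp_sgd_step(np.zeros(2), grads, clip_norm=1.0,
                         noise_multiplier=0.0, lr=1.0, rng=rng)
print(new_params)
```

In this sketch each clipped gradient has norm at most `clip_norm` regardless of model size, whereas noise is added to every coordinate, so the total noise norm grows with the number of parameters (roughly with its square root for isotropic Gaussian noise). That is one way to read the abstract's argument for models with fewer parameters.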