Differentially private training of residual networks with scale normalisation
This work addresses the problem of maintaining accuracy under differentially private training, particularly for residual networks, and reports an incremental improvement over recent state-of-the-art results.
The paper tackles the accuracy trade-offs of differentially private training of neural networks by proposing ScaleNorm, an architectural modification for residual networks with Group Normalisation, and reports a top-1 accuracy of 82.5% on CIFAR-10 at ε=8.0 when trained from scratch.
The training of neural networks with Differentially Private Stochastic Gradient Descent offers formal Differential Privacy guarantees but introduces accuracy trade-offs. In this work, we propose to alleviate these trade-offs in residual networks with Group Normalisation through a simple architectural modification termed ScaleNorm, in which an additional normalisation layer is introduced after the residual block's addition operation. Our method allows us to further improve on the recently reported state of the art on CIFAR-10, achieving a top-1 accuracy of 82.5% (ε=8.0) when trained from scratch.
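
The architectural change described above can be illustrated with a minimal NumPy sketch of a single residual block. It is not the authors' implementation. The function names (`group_norm`, `pointwise_conv`, `residual_block`) and the `scale_norm` flag are hypothetical. The 1x1 convolutions stand in for the block's real convolutions, the normalisation has no learnable affine parameters, and using Group Normalisation for the extra layer is an assumption. The activation after the block is omitted because the text does not say where it sits relative to the extra layer.

```python
import numpy as np


def group_norm(x, num_groups, eps=1e-5):
    """Group Normalisation without learnable affine parameters (assumption).

    x has shape (N, C, H, W). Statistics are computed per sample and per
    group of channels, so no statistics are shared across the batch.
    """
    n, c, h, w = x.shape
    assert c % num_groups == 0, "channels must divide evenly into groups"
    g = x.reshape(n, num_groups, -1)
    mean = g.mean(axis=2, keepdims=True)
    var = g.var(axis=2, keepdims=True)
    return ((g - mean) / np.sqrt(var + eps)).reshape(n, c, h, w)


def pointwise_conv(x, weight):
    """1x1 convolution (channel mixing); a stand-in for the block's convolutions."""
    return np.einsum("oc,nchw->nohw", weight, x)


def residual_block(x, w1, w2, num_groups, scale_norm=True):
    """Residual block with an optional extra normalisation after the addition."""
    # Residual branch: conv -> GN -> ReLU -> conv -> GN
    out = group_norm(pointwise_conv(x, w1), num_groups)
    out = np.maximum(out, 0.0)
    out = group_norm(pointwise_conv(out, w2), num_groups)
    # Addition with the skip connection
    out = x + out
    # ScaleNorm (as described in the text): normalise again after the addition.
    # Using Group Normalisation for this extra layer is an assumption.
    if scale_norm:
        out = group_norm(out, num_groups)
    return out
```

With `scale_norm=False` the block output keeps the scale and offset of the skip connection. With `scale_norm=True` each channel group of the output is re-centred and rescaled, for example `residual_block(x, w1, w2, num_groups=4)` for an input `x` of shape `(N, 8, H, W)` and weights of shape `(8, 8)`.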