Differentially Private Learning with Adaptive Clipping
This paper addresses the problem of tuning the clipping norm in differentially private federated learning, which is relevant to both researchers and practitioners. The contribution is incremental in that it builds on existing DP-FedAvg methods.
The paper tackles the challenge of setting clipping norms in differentially private federated learning by proposing an adaptive clipping method that clips to a quantile of the update norm distribution, where the quantile is estimated online with differential privacy. Experiments show that clipping to the median update norm works well across a range of realistic federated learning tasks, sometimes outperforming the best fixed clip chosen in hindsight, without tuning any clipping hyperparameter.
Existing approaches for training neural networks with user-level differential privacy (e.g., DP Federated Averaging) in federated learning (FL) settings involve bounding the contribution of each user's model update by clipping it to some constant value. However, there is no good a priori setting of the clipping norm across tasks and learning settings: the update norm distribution depends on the model architecture and loss, the amount of data on each device, the client learning rate, and possibly various other parameters. We propose a method wherein instead of a fixed clipping norm, one clips to a value at a specified quantile of the update norm distribution, where the value at the quantile is itself estimated online, with differential privacy. The method tracks the quantile closely, uses a negligible amount of privacy budget, is compatible with other federated learning technologies such as compression and secure aggregation, and has a straightforward joint DP analysis with DP-FedAvg. Experiments demonstrate that adaptive clipping to the median update norm works well across a range of realistic federated learning tasks, sometimes outperforming even the best fixed clip chosen in hindsight, and without the need to tune any clipping hyperparameter.
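The idea of clipping to an online, privately estimated quantile can be illustrated with a small simulation. This is a minimal sketch under stated assumptions, not the paper's implementation: the abstract does not spell out the update rule, so the sketch assumes a geometric update in which the clip norm shrinks when the noisy fraction of clients with norm below the clip exceeds the target quantile and grows otherwise. The function names, the learning rate `lr`, the noise scale `noise_std`, and the synthetic Gaussian "client updates" are all hypothetical choices made for illustration.

```python
import numpy as np


def clip_update(update, clip_norm):
    """Scale `update` so that its L2 norm is at most `clip_norm`."""
    norm = np.linalg.norm(update)
    scale = min(1.0, clip_norm / max(norm, 1e-12))
    return update * scale


def noisy_fraction_below(norms, clip_norm, noise_std, rng):
    """Noisy estimate of the fraction of clients whose norm is <= clip_norm.

    Each client contributes one bit, so Gaussian noise on the sum of bits
    is enough to privatize this count (noise_std is an assumed setting).
    """
    bits = (norms <= clip_norm).astype(float)
    noisy_sum = bits.sum() + rng.normal(0.0, noise_std)
    return noisy_sum / len(norms)


def update_clip_norm(clip_norm, fraction_below, target_quantile, lr):
    """Assumed geometric update: move the clip norm toward the target quantile."""
    return clip_norm * np.exp(-lr * (fraction_below - target_quantile))


def run_adaptive_clipping(rounds=200, clients=100, dim=10, target_quantile=0.5,
                          lr=0.2, noise_std=1.0, init_clip=0.1, seed=0):
    """Simulate rounds of clipping with an adaptively tracked clip norm.

    Returns the clip norm after every round.
    """
    rng = np.random.default_rng(seed)
    clip_norm = init_clip
    history = []
    for _ in range(rounds):
        # Synthetic stand-in for client model updates (hypothetical data).
        updates = rng.normal(0.0, 1.0, size=(clients, dim))
        norms = np.linalg.norm(updates, axis=1)

        # Clip each update to the current clip norm before aggregation.
        clipped = np.stack([clip_update(u, clip_norm) for u in updates])
        assert np.all(np.linalg.norm(clipped, axis=1) <= clip_norm + 1e-9)

        # Privately estimate how many clients fell below the clip norm,
        # then adjust the clip norm for the next round.
        fraction = noisy_fraction_below(norms, clip_norm, noise_std, rng)
        clip_norm = update_clip_norm(clip_norm, fraction, target_quantile, lr)
        history.append(clip_norm)
    return history


if __name__ == "__main__":
    history = run_adaptive_clipping()
    print("final clip norm:", history[-1])
```

In this sketch the clip norm starts far below the typical update norm and grows multiplicatively until roughly half of the simulated clients fall below it, after which it fluctuates around the median norm. A multiplicative update is used here because it lets the estimate cross several orders of magnitude in relatively few rounds, which matters when no good initial value is known.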