A New Linear Scaling Rule for Private Adaptive Hyperparameter Optimization
This work addresses the challenge of efficiently tuning hyperparameters for differentially private models, a problem central to researchers and practitioners in privacy-preserving machine learning. It represents a strong, specific gain rather than an incremental improvement.
The paper tackles hyperparameter optimization in differentially private deep learning, which is costly in both privacy and runtime. It proposes an adaptive method that uses cheap trials to estimate optimal hyperparameters and then scales them up. The method achieves state-of-the-art performance on 22 benchmark tasks across computer vision and natural language processing and across a range of privacy budgets, while accounting for the privacy cost of HPO.
An open problem in differentially private deep learning is hyperparameter optimization (HPO). DP-SGD introduces new hyperparameters and complicates existing ones, forcing researchers to painstakingly tune hyperparameters with hundreds of trials, which in turn makes it impossible to account for the privacy cost of HPO without destroying the utility. We propose an adaptive HPO method that uses cheap trials (in terms of privacy cost and runtime) to estimate optimal hyperparameters and scales them up. We obtain state-of-the-art performance on 22 benchmark tasks, across computer vision and natural language processing, across pretraining and finetuning, across architectures and a wide range of $\varepsilon \in [0.01,8.0]$, all while accounting for the privacy cost of HPO.
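The idea of estimating hyperparameters from cheap trials and then scaling them up can be illustrated with a minimal sketch. It assumes, purely for illustration, that a single scalar hyperparameter (for example, a learning rate multiplied by a step count) grows linearly in $\varepsilon$. It fits a least-squares line to the best value found at each cheap $\varepsilon$ and extrapolates to the budget left for the final run. The function names, the choice of scalar, and the use of basic sequential composition (summing the $\varepsilon$ of every trial) are assumptions rather than details taken from the abstract, and the numbers in the usage lines are made up.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    if n < 2 or n != len(ys):
        raise ValueError("need at least two (x, y) pairs of equal length")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("all x values are identical; slope is undefined")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def remaining_budget(total_epsilon, cheap_trials):
    """Budget left for the final run after paying for the cheap trials.

    Assumption: basic sequential composition, i.e. the epsilons of all
    cheap trials simply add up. Tighter accountants would leave more.
    """
    spent = sum(eps for eps, _ in cheap_trials)
    remaining = total_epsilon - spent
    if remaining <= 0:
        raise ValueError("cheap trials exhausted the privacy budget")
    return remaining


def extrapolate_hyperparameter(cheap_trials, target_epsilon):
    """Extrapolate the fitted line to target_epsilon.

    cheap_trials: list of (epsilon, best_value) pairs, where best_value is
    the hypothetical scalar hyperparameter that worked best at that epsilon.
    """
    eps = [e for e, _ in cheap_trials]
    vals = [v for _, v in cheap_trials]
    slope, intercept = fit_line(eps, vals)
    return slope * target_epsilon + intercept


# Illustrative (made-up) cheap trials: (epsilon, best scalar hyperparameter).
trials = [(0.25, 1.0), (0.5, 2.0), (0.75, 3.0)]
final_eps = remaining_budget(total_epsilon=4.0, cheap_trials=trials)
final_value = extrapolate_hyperparameter(trials, final_eps)
print(f"final run: epsilon={final_eps}, extrapolated value={final_value}")
```

Charging the cheap trials against the total budget before extrapolating keeps the sketch consistent with the goal stated above: the final model is trained only with the $\varepsilon$ that remains once HPO has been paid for.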