Differentially Private Federated Learning on Heterogeneous Data
This paper addresses privacy and training-efficiency challenges in federated learning with heterogeneous user data; it is an incremental improvement over existing methods.
The paper tackles two challenges in federated learning, efficient training on heterogeneous user data and protection of user privacy, by proposing DP-SCAFFOLD, which incorporates differential privacy into the SCAFFOLD algorithm. DP-SCAFFOLD outperforms DP-FedAvg, with larger gains as data heterogeneity and the number of local updates increase.
Federated Learning (FL) is a paradigm for large-scale distributed learning which faces two key challenges: (i) efficient training from highly heterogeneous user data, and (ii) protecting the privacy of participating users. In this work, we propose a novel FL approach (DP-SCAFFOLD) to tackle these two challenges together by incorporating Differential Privacy (DP) constraints into the popular SCAFFOLD algorithm. We focus on the challenging setting where users communicate with an "honest-but-curious" server without any trusted intermediary, which requires ensuring privacy not only towards a third party with access to the final model but also towards the server, which observes all user communications. Using advanced results from DP theory, we establish the convergence of our algorithm for convex and non-convex objectives. Our analysis clearly highlights the privacy-utility trade-off under data heterogeneity, and demonstrates the superiority of DP-SCAFFOLD over the state-of-the-art algorithm DP-FedAvg when the number of local updates and the level of heterogeneity grow. Our numerical results confirm our analysis and show that DP-SCAFFOLD provides significant gains in practice.
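To make the idea of adding DP constraints to SCAFFOLD concrete, the following is a minimal sketch of one user's local round, not the authors' implementation. It assumes a least-squares loss, per-example gradient clipping, and Gaussian noise added on the user's device so that the server only ever sees privatized quantities. The function names (`clip`, `noisy_clipped_gradient`, `local_update`), the parameters `clip_norm` and `sigma`, and the noise scaling are illustrative assumptions; data subsampling and the privacy accounting that would calibrate `sigma` to a target privacy budget are omitted.

```python
import numpy as np

def clip(g, c):
    """Scale g so that its L2 norm is at most c."""
    norm = np.linalg.norm(g)
    return g * min(1.0, c / (norm + 1e-12))

def noisy_clipped_gradient(w, X, y, clip_norm, sigma, rng):
    """Mean of per-example clipped gradients with Gaussian noise added to the sum.

    Assumed loss per example: 0.5 * (x @ w - y)**2 (illustrative choice).
    """
    residuals = X @ w - y                     # shape (n,)
    per_example = residuals[:, None] * X      # shape (n, d), one gradient per row
    clipped = np.stack([clip(g, clip_norm) for g in per_example])
    # Noise std is proportional to the clipping bound (assumed calibration).
    noise = rng.normal(0.0, sigma * clip_norm, size=w.shape)
    return (clipped.sum(axis=0) + noise) / len(X)

def local_update(w_server, c_server, c_user, X, y,
                 steps, lr, clip_norm, sigma, rng):
    """One user's round: SCAFFOLD-style corrected steps on privatized gradients."""
    w = w_server.copy()
    for _ in range(steps):
        g = noisy_clipped_gradient(w, X, y, clip_norm, sigma, rng)
        # Control variates correct the drift caused by heterogeneous data.
        w -= lr * (g - c_user + c_server)
    # Control-variate refresh (SCAFFOLD "option II"); it only reuses the
    # already-noised iterates, so it is post-processing of private outputs.
    c_user_new = c_user - c_server + (w_server - w) / (steps * lr)
    # Only these two deltas would be sent to the server.
    return w - w_server, c_user_new - c_user
```

In this sketch the server would average the returned model deltas and control-variate deltas across participating users. Because the noise is injected before anything leaves the device, the same messages protect the user against both the server and a third party who sees only the final model, which is the threat model described in the abstract.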