Federated Learning Based on Dynamic Regularization
This paper addresses communication bottlenecks in federated learning for distributed neural network training, offering an incremental improvement over prior methods.
The paper tackles the inconsistency between local and global minima in federated learning by introducing a dynamic regularizer for each device, which aligns solutions and reduces communication costs. Empirical and analytical results show efficient training in convex and non-convex settings, with robustness to device heterogeneity, large device counts, partial participation, and unbalanced data.
We propose a novel federated learning method for distributed training of neural network models, where the server orchestrates cooperation between a subset of randomly chosen devices in each round. We view the federated learning problem primarily from a communication perspective and allow more device-level computation to save transmission costs. We point out a fundamental dilemma: the minima of the device-level empirical losses are inconsistent with those of the global empirical loss. Unlike recent prior works, which either attempt inexact minimization or use devices to parallelize gradient computation, we propose a dynamic regularizer for each device at each round, so that in the limit the global and device solutions are aligned. We demonstrate, through empirical results on real and synthetic data as well as analytical results, that our scheme leads to efficient training in both convex and non-convex settings, while being fully agnostic to device heterogeneity and robust to a large number of devices, partial participation, and unbalanced data.
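To make the idea of a per-device dynamic regularizer concrete, the following is a minimal sketch on a toy problem, not the reference implementation. It assumes scalar quadratic device losses L_k(theta) = 0.5 * (theta - c_k)^2, so each device minimum c_k differs from the global minimum (the mean of the c_k). The update rules (a linear term tracking the local gradient, a proximal term with weight alpha, and a server-side correction state h) are our reading of the method and are not spelled out in this abstract; the function name `feddyn_quadratic` and all parameter names are hypothetical.

```python
import random


def feddyn_quadratic(centers, alpha=1.0, rounds=60, participation=1.0, seed=0):
    """Toy dynamic-regularization loop on L_k(theta) = 0.5 * (theta - c_k)**2.

    Assumed update rules (illustrative, not taken from the abstract):
      device:  theta_k = argmin L_k(theta) - g_k * theta
                                + (alpha / 2) * (theta - server)**2
               g_k    <- g_k - alpha * (theta_k - server)
      server:  h      <- h - (alpha / m) * sum_active(theta_k - server)
               server <- mean_active(theta_k) - h / alpha
    """
    rng = random.Random(seed)
    m = len(centers)
    server = 0.0           # current server model
    h = 0.0                # server-side correction state
    grads = [0.0] * m      # per-device state g_k (tracks the local gradient)
    n_active = max(1, round(participation * m))

    for _ in range(rounds):
        # Sample the devices that participate in this round.
        active = rng.sample(range(m), n_active)
        local = {}
        for k in active:
            # Closed-form minimizer of the regularized quadratic objective.
            theta_k = (centers[k] + grads[k] + alpha * server) / (1.0 + alpha)
            # Dynamic regularizer update: g_k now equals L_k'(theta_k).
            grads[k] -= alpha * (theta_k - server)
            local[k] = theta_k
        # Server aggregates device drift and corrects the averaged model.
        drift = sum(local[k] - server for k in active)
        h -= alpha * drift / m
        server = sum(local.values()) / n_active - h / alpha
    return server


if __name__ == "__main__":
    centers = [-3.0, 0.0, 1.0, 10.0]   # heterogeneous device minima
    print(feddyn_quadratic(centers))    # should approach the mean, 2.0
```

In this toy setting with full participation, the server model approaches the global minimizer (the mean of the c_k) even though every device solves its own regularized problem exactly, which illustrates the claimed alignment of device and global solutions in the limit. Setting `participation` below 1.0 samples a random subset of devices each round.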