Federated Learning of a Mixture of Global and Local Models
This work addresses communication bottlenecks in federated learning, where models are trained across distributed devices holding private data, by proposing an optimization formulation that mixes global and local models.
The paper tackles communication efficiency in federated learning by proposing a formulation that balances a global model against local models. It shows that local steps can improve communication for problems with heterogeneous data and that personalization reduces communication complexity.
We propose a new optimization formulation for training federated learning models. The standard formulation has the form of an empirical risk minimization problem constructed to find a single global model trained from the private data stored across all participating devices. In contrast, our formulation seeks an explicit trade-off between this traditional global model and the local models, which can be learned by each device from its own private data without any communication. Further, we develop several efficient variants of SGD (with and without partial participation and with and without variance reduction) for solving the new formulation and prove communication complexity guarantees. Notably, our methods are similar but not identical to federated averaging / local SGD, thus shedding some light on the role of local steps in federated learning. In particular, we are the first to i) show that local steps can improve communication for problems with heterogeneous data, and ii) point out that personalization yields reduced communication complexity.
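To make the trade-off between global and local models concrete, the following is a minimal sketch, not the methods analyzed in the paper. It assumes the trade-off is expressed as a penalty: each device i keeps its own model x_i, and a weight `lam` penalizes the squared distance of each x_i from the average model. The least-squares local losses, the synthetic data, and the names `mixture_objective`, `mixture_gradient`, `fit`, and `spread` are all hypothetical choices for illustration, and plain full-gradient descent stands in for the SGD variants mentioned above.

```python
import numpy as np

def mixture_objective(X, A, b, lam):
    """Average local loss plus lam times the average squared distance to the mean model.

    X: (n, d) one model per device; A: (n, m, d) local features; b: (n, m) local targets.
    Local losses are least squares (an assumption made for this sketch).
    """
    m = A.shape[1]
    residuals = np.einsum("imd,id->im", A, X) - b
    local_loss = 0.5 * np.mean(np.sum(residuals**2, axis=1) / m)
    penalty = 0.5 * np.mean(np.sum((X - X.mean(axis=0))**2, axis=1))
    return local_loss + lam * penalty

def mixture_gradient(X, A, b, lam):
    """Gradient of mixture_objective with respect to every local model, shape (n, d)."""
    n, m, _ = A.shape
    residuals = np.einsum("imd,id->im", A, X) - b
    grad_local = np.einsum("imd,im->id", A, residuals) / (m * n)
    grad_penalty = lam * (X - X.mean(axis=0)) / n
    return grad_local + grad_penalty

def fit(A, b, lam, steps=2000):
    """Plain gradient descent on the penalized objective (illustrative only)."""
    n, m, d = A.shape
    # Conservative step size: 1 / (largest local curvature + lam).
    L = max(np.linalg.eigvalsh(A[i].T @ A[i] / m)[-1] for i in range(n))
    lr = 1.0 / (L + lam)
    X = np.zeros((n, d))
    for _ in range(steps):
        # The gradient carries a 1/n factor, so rescale by n before stepping.
        X -= lr * n * mixture_gradient(X, A, b, lam)
    return X

def spread(X):
    """Distance of the local models from their average (0 means one shared model)."""
    return float(np.linalg.norm(X - X.mean(axis=0)))

# Synthetic heterogeneous data: every device has its own ground-truth model.
rng = np.random.default_rng(0)
n, m, d = 4, 50, 3
A = rng.normal(size=(n, m, d))
true_models = rng.normal(size=(n, d))
b = np.einsum("imd,id->im", A, true_models)

X_local = fit(A, b, lam=0.0)     # no coupling: purely local models
X_mixed = fit(A, b, lam=1.0)     # intermediate trade-off
X_global = fit(A, b, lam=100.0)  # strong coupling: models nearly coincide
print(spread(X_local), spread(X_mixed), spread(X_global))
```

Under these assumptions, `lam = 0` decouples the devices so each one fits only its own data without any communication, while a large `lam` pulls all local models toward their average and approaches the single global model of the standard formulation. Intermediate values give personalized models that still borrow from the other devices.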