Coordinating Momenta for Cross-silo Federated Learning
This work targets practitioners of cross-silo federated learning who face non-i.i.d. data distributions and communication constraints, and offers an incremental improvement in training performance over existing methods.
This paper addresses the client drift problem in federated learning caused by non-i.i.d. data, which arises when clients perform multiple local training steps to reduce communication. The authors propose a method using double momentum buffers and a novel momentum fusion technique to coordinate server and local model updates, demonstrating improved training performance over FedAvg and existing momentum SGD variants.
Communication efficiency is crucial for federated learning (FL). A common way to address it is to run multiple local training steps on each client, which reduces the communication frequency between clients and the server. However, this strategy leads to the client drift problem due to \textit{non-i.i.d.} data distributions across clients, which severely deteriorates performance. In this work, we propose a new method to improve training performance in cross-silo FL by maintaining double momentum buffers. In our algorithm, one momentum buffer tracks the server model's update direction, and the other tracks the local model's update direction. More importantly, we introduce a novel momentum fusion technique to coordinate the server and local momentum buffers. We also derive the first theoretical convergence analysis involving both server and local standard momentum SGD. Extensive deep FL experiments verify that our approach achieves better training performance than FedAvg and existing standard momentum SGD variants.
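To make the double-buffer idea concrete, the following is a minimal NumPy sketch on a toy problem. It is an illustration under stated assumptions and not the algorithm as specified in the paper. The abstract does not give the fusion rule, so the sketch assumes a simple one: each local step adds a scaled copy of the server momentum to the local momentum. The names `local_training`, `run_rounds`, and `fuse`, the quadratic client objectives, and all hyperparameter values are hypothetical.

```python
import numpy as np

def local_training(w_server, server_m, target, steps, lr, beta, fuse):
    """Run local momentum-SGD steps on one client, starting from the server model.

    The client objective is the toy quadratic 0.5 * ||w - target||^2 (an assumption),
    so different targets stand in for non-i.i.d. client data.
    """
    w = w_server.copy()
    local_m = np.zeros_like(w)            # local momentum buffer, reset each round
    for _ in range(steps):
        grad = w - target                 # gradient of the toy quadratic
        local_m = beta * local_m + grad   # standard momentum accumulation
        # ASSUMED fusion rule: mix the server momentum into the local step.
        w -= lr * (local_m + fuse * server_m)
    return w

def run_rounds(client_targets, rounds=100, steps=2, lr=0.1,
               beta=0.5, server_beta=0.5, fuse=0.1):
    """Alternate local training and server aggregation with a server momentum buffer."""
    w = np.zeros(client_targets.shape[1])
    server_m = np.zeros_like(w)           # server momentum buffer
    for _ in range(rounds):
        local_models = [
            local_training(w, server_m, t, steps, lr, beta, fuse)
            for t in client_targets
        ]
        # The averaged model change acts as a pseudo-gradient for the server.
        avg_delta = w - np.mean(local_models, axis=0)
        server_m = server_beta * server_m + avg_delta
        w = w - server_m                  # server step with learning rate 1
    return w

# Three clients with different optima, mimicking heterogeneous data.
targets = np.array([[1.0, 0.0], [3.0, 4.0], [-1.0, 2.0]])
print(run_rounds(targets))
```

On this toy problem the minimizer of the averaged objective is the mean of the client targets, so the returned model can be compared against `targets.mean(axis=0)`. Setting `fuse=0.0` gives local momentum SGD with a server momentum buffer and no coordination between the two, which is a convenient baseline for experimenting with the assumed fusion term.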