A Computation and Communication Efficient Method for Distributed Nonconvex Problems in the Partial Participation Setting
This work addresses efficiency and scalability issues in federated learning for applications where only a subset of nodes participates in each round, though the contribution appears incremental because it builds on existing components.
The authors tackle distributed optimization in federated learning under partial participation. Their method combines variance reduction, partial participation, and compressed communication, and it achieves optimal oracle complexity and state-of-the-art communication complexity without requiring a bounded gradient assumption.
We present a new method that includes three key components of distributed optimization and federated learning: variance reduction of stochastic gradients, partial participation, and compressed communication. We prove that the new method has optimal oracle complexity and state-of-the-art communication complexity in the partial participation setting. Regardless of the communication compression feature, our method successfully combines variance reduction and partial participation: we get the optimal oracle complexity, never need the participation of all nodes, and do not require the bounded gradients (dissimilarity) assumption.
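To make two of the three components concrete, the following is a minimal sketch of compressed communication and partial participation in a single aggregation round. It is not the method proposed here: it assumes a Rand-K sparsifier as the compressor and uniform sampling without replacement as the participation scheme, and it leaves variance reduction out entirely. All names (`rand_k`, `sample_participants`, `aggregate_round`) are hypothetical and chosen only for illustration.

```python
import random


def rand_k(vec, k, rng):
    """Rand-K sparsifier (assumed compressor): keep k random coordinates
    and rescale them by d/k so the output is unbiased in expectation."""
    d = len(vec)
    kept = set(rng.sample(range(d), k))
    scale = d / k
    return [scale * v if i in kept else 0.0 for i, v in enumerate(vec)]


def sample_participants(n_nodes, s, rng):
    """Partial participation (assumed scheme): choose s of n_nodes
    uniformly at random without replacement."""
    return sorted(rng.sample(range(n_nodes), s))


def aggregate_round(local_updates, s, k, rng):
    """One illustrative round: only the s sampled nodes send a message,
    each message is compressed, and the server averages what it receives."""
    n_nodes = len(local_updates)
    d = len(local_updates[0])
    chosen = sample_participants(n_nodes, s, rng)
    total = [0.0] * d
    for i in chosen:
        msg = rand_k(local_updates[i], k, rng)  # compressed message from node i
        for j in range(d):
            total[j] += msg[j]
    return chosen, [t / s for t in total]


if __name__ == "__main__":
    rng = random.Random(0)
    # Hypothetical local update vectors for four nodes.
    updates = [[1.0, 2.0, 3.0], [0.0, 1.0, 0.0], [2.0, 2.0, 2.0], [4.0, 0.0, 1.0]]
    chosen, avg = aggregate_round(updates, s=2, k=1, rng=rng)
    print("participating nodes:", chosen)
    print("averaged compressed update:", avg)
```

With `s` equal to the number of nodes and `k` equal to the dimension, the round reduces to an exact average of all updates, which is a convenient sanity check. In the setting described above, each node would send a variance-reduced gradient estimate rather than a raw update.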