Global Convergence of Federated Learning for Mixed Regression
This paper addresses federated learning over clustered data distributions, an incremental but practically important extension of standard federated learning.
The paper tackles federated learning with clustered clients in mixed regression, achieving global convergence from any initialization even with highly unbalanced local data volumes (some clients have only O(1) data points).
This paper studies model training under Federated Learning when clients exhibit cluster structure. We contextualize this problem in mixed regression, where each client has limited local data generated from one of $k$ unknown regression models. We design an algorithm that achieves global convergence from any initialization and works even when local data volume is highly unbalanced: some clients may contain only $O(1)$ data points. Our algorithm first runs moment descent on a few anchor clients (each with $\tilde{\Omega}(k)$ data points) to obtain coarse model estimates. Then each client alternately estimates its cluster labels and refines the model estimates based on FedAvg or FedProx. A key innovation in our analysis is a uniform estimate on the clustering errors, which we prove by bounding the VC dimension of general polynomial concept classes based on the theory of algebraic geometry.
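The alternating phase described above (estimate each client's cluster label, then refine the model estimates) can be illustrated with a minimal sketch. It is not the paper's algorithm: it assumes a linear regression model with Gaussian covariates, replaces the moment-descent initialization with a small perturbation of the true parameters, and replaces the FedAvg/FedProx refinement with a pooled least-squares fit per cluster. All function names and parameter values are hypothetical.

```python
import numpy as np


def simulate_clients(rng, theta_true, num_clients=200, max_points=5, noise=0.1):
    """Generate clients whose data come from one of k linear models (assumed setup)."""
    k, d = theta_true.shape
    clients = []
    for _ in range(num_clients):
        label = int(rng.integers(k))                # hidden cluster of this client
        n_i = int(rng.integers(1, max_points + 1))  # unbalanced: some clients hold 1 point
        X = rng.standard_normal((n_i, d))
        y = X @ theta_true[label] + noise * rng.standard_normal(n_i)
        clients.append((X, y, label))
    return clients


def estimate_labels(clients, theta):
    """Each client picks the model with the smallest local squared residual."""
    labels = []
    for X, y, _ in clients:
        residuals = ((y[:, None] - X @ theta.T) ** 2).sum(axis=0)  # shape (k,)
        labels.append(int(np.argmin(residuals)))
    return np.array(labels)


def refine_models(clients, labels, theta):
    """Refit each model on the data of clients currently assigned to it.

    Pooled least squares is a stand-in here for the FedAvg/FedProx refinement.
    """
    k, d = theta.shape
    theta_new = theta.copy()
    for j in range(k):
        members = [c for c, lab in zip(clients, labels) if lab == j]
        if not members:
            continue  # keep the previous estimate for an empty cluster
        X = np.vstack([c[0] for c in members])
        y = np.concatenate([c[1] for c in members])
        if X.shape[0] < d:
            continue  # too few points for a stable refit
        theta_new[j] = np.linalg.lstsq(X, y, rcond=None)[0]
    return theta_new


def alternate_cluster_and_refine(clients, theta_init, rounds=5):
    """Alternate label estimation and model refinement for a fixed number of rounds."""
    theta = theta_init.copy()
    for _ in range(rounds):
        labels = estimate_labels(clients, theta)
        theta = refine_models(clients, labels, theta)
    return theta, estimate_labels(clients, theta)


rng = np.random.default_rng(0)
theta_true = np.array([[3.0, 0.0, 0.0, 0.0, 0.0],
                       [-3.0, 0.0, 0.0, 0.0, 0.0]])
clients = simulate_clients(rng, theta_true)
# Coarse initial estimates: an assumed stand-in for the anchor-client phase.
theta_init = theta_true + 0.3 * rng.standard_normal(theta_true.shape)
theta_hat, labels = alternate_cluster_and_refine(clients, theta_init)
print("max parameter error:", np.abs(theta_hat - theta_true).max())
```

In this toy setup the two models are well separated, so even clients with a single data point are usually labeled correctly; the points that do get mislabeled are those where both models predict nearly the same response, and they carry little weight in the refit.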