On the Generalization of Wasserstein Robust Federated Learning
This paper addresses generalization in federated learning when clients hold non-i.i.d. data. It offers a robust method with potential applications in domain adaptation, though it appears incremental because it builds on existing distributionally robust optimization approaches.
The paper tackles generalization in federated learning with non-i.i.d. data by proposing WAFL, a Wasserstein distributionally robust optimization scheme. Empirical results on benchmark datasets show that WAFL generalizes better than FedAvg and is more robust in distribution shift settings.
In federated learning, participating clients typically possess non-i.i.d. data, posing a significant challenge to generalization to unseen distributions. To address this, we propose a Wasserstein distributionally robust optimization scheme called WAFL. Leveraging its duality, we frame WAFL as an empirical surrogate risk minimization problem, and solve it using a local SGD-based algorithm with convergence guarantees. We show that the robustness of WAFL is more general than related approaches, and the generalization bound is robust to all adversarial distributions inside the Wasserstein ball (ambiguity set). Since the center location and radius of the Wasserstein ball can be suitably modified, WAFL shows its applicability not only in robustness but also in domain adaptation. Through empirical evaluation, we demonstrate that WAFL generalizes better than the vanilla FedAvg in non-i.i.d. settings, and is more robust than other related methods in distribution shift settings. Further, using benchmark datasets we show that WAFL is capable of generalizing to unseen target domains.
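As a rough illustration of the pipeline the abstract describes (dual reformulation into a surrogate loss, then local gradient steps with server averaging), the following Python sketch uses a linear model with squared loss. It is not the authors' implementation. The function names `surrogate_grad` and `federated_round`, the fixed penalty `gamma` (a Lagrangian relaxation standing in for an explicit ball radius), the squared Euclidean transport cost on features only, the inner gradient-ascent solver, and the full-batch rather than stochastic local steps are all assumptions made for the example.

```python
import numpy as np


def surrogate_grad(w, X, y, gamma, ascent_steps=10):
    """Gradient w.r.t. w of the assumed surrogate loss
        mean_i  max_z { (z @ w - y_i)^2 - gamma * ||z - x_i||^2 }.
    The inner maximization is solved approximately by gradient ascent."""
    Z = np.array(X, dtype=float)           # perturbed features, start at the data
    ascent_lr = 1.0 / (2.0 * gamma)        # step size matched to the penalty curvature
    for _ in range(ascent_steps):
        resid = Z @ w - y                  # shape (n,)
        # d/dZ of (Z w - y)^2 - gamma * ||Z - X||^2, row by row
        grad_Z = 2.0 * resid[:, None] * w[None, :] - 2.0 * gamma * (Z - X)
        Z += ascent_lr * grad_Z
    resid = Z @ w - y
    # Danskin-style gradient: differentiate the loss at the (approximate) maximizer.
    return 2.0 * Z.T @ resid / len(y)


def federated_round(w_global, clients, gamma, local_steps=5, lr=0.02):
    """One communication round: every client runs a few local gradient steps
    on its surrogate loss, then the server averages the resulting models."""
    local_models = []
    for X, y in clients:
        w = w_global.copy()
        for _ in range(local_steps):
            w -= lr * surrogate_grad(w, X, y, gamma)
        local_models.append(w)
    return np.mean(local_models, axis=0)
```

For this linear model the inner problem is concave only when `gamma` exceeds `||w||^2`, so `gamma` has to be chosen with that in mind. A smaller `gamma` penalizes perturbations less and therefore corresponds to a larger ambiguity set.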