FedHB: Hierarchical Bayesian Federated Learning
This work addresses privacy and efficiency challenges in federated learning for distributed systems, offering a novel theoretical framework with practical implications.
The paper approaches federated learning through a hierarchical Bayesian model of how client data is generated. The resulting distributed algorithm preserves privacy and subsumes existing methods such as Fed-Avg and Fed-Prox, with a convergence rate matching centralized SGD and a generalization error that is asymptotically optimal.
We propose a novel hierarchical Bayesian approach to Federated Learning (FL). Our model describes the generative process of clients' local data via hierarchical Bayesian modeling: each client's local model is a random variable governed by a higher-level global variate. Interestingly, variational inference in our Bayesian model leads to an optimisation problem whose block-coordinate descent solution is a distributed algorithm that is separable over clients and never requires them to reveal their own private data, and is thus fully compatible with FL. We also highlight that our block-coordinate algorithm takes particular forms that subsume well-known FL algorithms, including Fed-Avg and Fed-Prox, as special cases. Beyond introducing novel modeling and derivations, we offer a convergence analysis showing that our block-coordinate FL algorithm converges to a (local) optimum of the objective at the rate of $O(1/\sqrt{t})$, the same rate as regular (centralised) SGD. We also provide a generalisation error analysis in which we prove that the test error of our model on unseen data is guaranteed to vanish as the training data size increases, making it asymptotically optimal.
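To make the alternating client/server structure concrete, the following is a minimal sketch of the two special cases named above, not the variational algorithm of the paper itself. It assumes the standard textbook forms: each client minimises its local loss plus a proximal penalty (mu/2)·||theta − theta_global||², which is the Fed-Prox local objective and reduces to Fed-Avg at mu = 0, and the server takes a data-size-weighted average. The least-squares loss, the toy data generator, and all function names (`client_update`, `server_update`, `run_rounds`, `make_clients`) are hypothetical choices made for illustration only.

```python
import numpy as np


def client_update(theta_global, X, y, mu, lr=0.1, steps=50):
    """Local block: gradient descent on a least-squares loss plus the
    proximal term (mu/2)*||theta - theta_global||^2 (assumed Fed-Prox form).
    Only the updated parameters leave the client, never X or y."""
    theta = theta_global.copy()
    n = len(y)
    for _ in range(steps):
        grad = X.T @ (X @ theta - y) / n + mu * (theta - theta_global)
        theta -= lr * grad
    return theta


def server_update(client_thetas, weights):
    """Global block: weighted average of the client models."""
    return np.average(np.stack(client_thetas), axis=0, weights=weights)


def run_rounds(clients, dim, mu, rounds=100):
    """Alternate the two blocks; mu=0 gives Fed-Avg, mu>0 gives Fed-Prox."""
    theta_global = np.zeros(dim)
    sizes = [len(y) for _, y in clients]
    for _ in range(rounds):
        local = [client_update(theta_global, X, y, mu) for X, y in clients]
        theta_global = server_update(local, sizes)
    return theta_global


def make_clients(num_clients=5, n=40, dim=3, seed=0):
    """Toy hierarchical data (an assumption for this demo): each client's
    true parameter is a small perturbation of a shared global parameter."""
    rng = np.random.default_rng(seed)
    theta_true = rng.normal(size=dim)
    clients = []
    for _ in range(num_clients):
        theta_i = theta_true + 0.1 * rng.normal(size=dim)
        X = rng.normal(size=(n, dim))
        y = X @ theta_i + 0.01 * rng.normal(size=n)
        clients.append((X, y))
    return theta_true, clients


if __name__ == "__main__":
    theta_true, clients = make_clients()
    for mu in (0.0, 1.0):
        est = run_rounds(clients, dim=3, mu=mu)
        print(f"mu={mu}: distance to shared parameter = "
              f"{np.linalg.norm(est - theta_true):.3f}")
```

In this sketch the proximal coefficient `mu` is the only switch between the two baselines, which mirrors the claim that both arise as particular forms of one block-coordinate scheme; how the paper's variational updates specialise to them is not reproduced here.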