Achieving Linear Speedup in Non-IID Federated Bilevel Learning
This work closes a theoretical gap in federated bilevel learning: whether linear speedup is attainable under partial client participation with non-i.i.d. data. The result supports more efficient distributed optimization in machine learning and communication applications.
The paper tackles the challenge of achieving linear speedup in federated bilevel optimization with non-i.i.d. data. It proposes FedMBO, which uses a novel client sampling scheme in the federated hypergradient estimation, and proves a convergence rate of O(1/√(nK) + 1/K + √n/K^(3/2)), where n is the number of participating clients per round and K is the total number of iterations.
Federated bilevel optimization has received increasing attention in various emerging machine learning and communication applications. Recently, several Hessian-vector-based algorithms have been proposed to solve the federated bilevel optimization problem. However, several important properties of federated learning, such as partial client participation and linear speedup for convergence (i.e., the convergence rate and complexity improve linearly with the number of sampled clients) in the presence of non-i.i.d.~datasets, remain open. In this paper, we fill these gaps by proposing a new federated bilevel algorithm named FedMBO with a novel client sampling scheme in the federated hypergradient estimation. We show that FedMBO achieves a convergence rate of $\mathcal{O}\big(\frac{1}{\sqrt{nK}}+\frac{1}{K}+\frac{\sqrt{n}}{K^{3/2}}\big)$ on non-i.i.d.~datasets, where $n$ is the number of participating clients in each round and $K$ is the total number of iterations. This is the first theoretical linear speedup result for non-i.i.d.~federated bilevel optimization. Extensive experiments validate our theoretical results and demonstrate the effectiveness of our proposed method.
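To see why this rate corresponds to linear speedup, the following is a brief worked sketch. It suppresses the constants hidden in the $\mathcal{O}(\cdot)$ notation, and the target accuracy $\epsilon$ is a symbol introduced here for illustration. Comparing the three terms pairwise shows that the first one dominates once $K \ge n$:
\[
\frac{1}{K} \le \frac{1}{\sqrt{nK}} \iff n \le K,
\qquad
\frac{\sqrt{n}}{K^{3/2}} \le \frac{1}{\sqrt{nK}} \iff n \le K .
\]
In that regime the rate reduces to $\mathcal{O}\big(\frac{1}{\sqrt{nK}}\big)$, so reaching accuracy $\epsilon$ requires, up to constants,
\[
\frac{1}{\sqrt{nK}} \le \epsilon \iff K \ge \frac{1}{n\,\epsilon^{2}} .
\]
The required number of iterations therefore shrinks in proportion to $1/n$: under these simplifications, doubling the number of participating clients per round roughly halves the iterations needed, which is the sense of linear speedup described above.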