LG, Jun 30, 2022

On the Convergence of Distributed Stochastic Bilevel Optimization Algorithms over a Network

arXiv:2206.15025v2, 25 citations, h-index: 49
Originality: Incremental advance
AI Analysis

This work addresses distributed bilevel optimization for machine learning applications where data is decentralized across a network, representing an incremental advance over single-machine methods.

The authors tackled the problem of distributed bilevel optimization by developing two novel decentralized algorithms that use gradient tracking and different gradient estimators, achieving convergence rates for nonconvex-strongly-convex problems with experimental validation.

Bilevel optimization has been applied to a wide variety of machine learning models, and numerous stochastic bilevel optimization algorithms have been developed in recent years. However, most existing algorithms focus only on the single-machine setting and therefore cannot handle distributed data. To address this issue, we considered the setting where all participants form a network and communicate peer-to-peer, and we developed two novel decentralized stochastic bilevel optimization algorithms based on the gradient tracking communication mechanism and two different gradient estimators. Additionally, we established their convergence rates for nonconvex-strongly-convex problems with novel theoretical analysis strategies. To our knowledge, this is the first work achieving these theoretical results. Finally, we applied our algorithms to practical machine learning models, and the experimental results confirmed the efficacy of our algorithms.
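
To make the gradient tracking communication mechanism named in the abstract more concrete, here is a minimal sketch of the generic technique. It is not the paper's algorithm: it assumes a single-level toy problem in which node i holds f_i(x) = 0.5 * (x - a_i)^2, exact (not stochastic) gradients, and a 4-node ring network. The names `ring_mixing_matrix`, `gradient_tracking`, `targets`, `step`, and `iters` are hypothetical and chosen only for illustration.

```python
import numpy as np


def ring_mixing_matrix(n):
    """Symmetric, doubly stochastic mixing matrix for a ring of n >= 3 nodes."""
    W = np.zeros((n, n))
    for i in range(n):
        W[i, i] = 0.5              # weight a node keeps on its own value
        W[i, (i - 1) % n] = 0.25   # weight on the left neighbour
        W[i, (i + 1) % n] = 0.25   # weight on the right neighbour
    return W


def gradient_tracking(targets, W, step=0.1, iters=300):
    """Generic gradient tracking on a toy problem (an assumption, not the paper's method).

    Node i minimises 0.5 * (x - targets[i])**2 locally; the network-wide
    minimiser of the average objective is mean(targets).
    """
    targets = np.asarray(targets, dtype=float)

    def grad(x):
        # Entry i is the gradient of node i's local objective at node i's iterate.
        return x - targets

    x = np.zeros_like(targets)  # one scalar iterate per node
    y = grad(x)                 # tracker starts at the local gradients
    for _ in range(iters):
        # Mix iterates with neighbours, then step along the tracked gradient.
        x_new = W @ x - step * y
        # Mix trackers and add the change in local gradients, so that the
        # average of y always equals the average of the local gradients.
        y = W @ y + grad(x_new) - grad(x)
        x = x_new
    return x, y


W = ring_mixing_matrix(4)
x_final, y_final = gradient_tracking([1.0, 2.0, 3.0, 6.0], W)
```

In this sketch each node only combines values from its two ring neighbours, yet every entry of `x_final` should approach the global minimiser, the mean of `targets`, while the tracker `y_final` should approach zero. The decentralized bilevel setting described in the abstract additionally requires stochastic estimates of the hypergradient, which this sketch does not attempt.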

