OC LGMar 8, 2023

Byzantine-Robust Loopless Stochastic Variance-Reduced Gradient

arXiv:2303.04560v17.66 citationsh-index: 28Has Code

Originality Incremental advance

AI Analysis

This work addresses the vulnerability of distributed optimization to malicious or faulty participants, which is crucial for collaborative problem-solving among small groups, companies, and individuals, though it is incremental as it extends existing variance reduction techniques to a new estimator.

The authors tackled the problem of distributed optimization with Byzantine workers by proposing a new method, BR-LSVRG, which closes a gap in the literature by extending SVRG-type variance reduction to Byzantine-robust settings, achieving non-asymptotic convergence guarantees in strongly convex cases and showing competitive performance in numerical experiments.

Distributed optimization with open collaboration is a popular field since it provides an opportunity for small groups/companies/universities, and individuals to jointly solve huge-scale problems. However, standard optimization algorithms are fragile in such settings due to the possible presence of so-called Byzantine workers -- participants that can send (intentionally or not) incorrect information instead of the one prescribed by the protocol (e.g., send anti-gradient instead of stochastic gradients). Thus, the problem of designing distributed methods with provable robustness to Byzantine workers has been receiving a lot of attention recently. In particular, several works consider a very promising way to achieve Byzantine tolerance via exploiting variance reduction and robust aggregation. The existing approaches use SAGA- and SARAH-type variance-reduced estimators, while another popular estimator -- SVRG -- is not studied in the context of Byzantine-robustness. In this work, we close this gap in the literature and propose a new method -- Byzantine-Robust Loopless Stochastic Variance Reduced Gradient (BR-LSVRG). We derive non-asymptotic convergence guarantees for the new method in the strongly convex case and compare its performance with existing approaches in numerical experiments.

View on arXiv PDF Code

Similar