LG OC MLNov 29, 2023

Federated Online and Bandit Convex Optimization

Kumar Kshitij Patel, Lingxiao Wang, Aadirupa Saha, Nati Sebro

arXiv:2311.17586v113.014 citationsh-index: 73

Originality Highly original

AI Analysis

This work addresses the problem of distributed optimization under limited feedback for researchers in machine learning, providing foundational insights into when collaboration is effective in adversarial versus stochastic settings.

The paper tackles federated online and bandit convex optimization with adaptive adversaries, showing that collaboration is not beneficial with first-order gradient feedback but can lead to linear speedup in high-dimensional regimes with zeroth-order feedback, achieving tight regret bounds in intermittent communication settings.

We study the problems of distributed online and bandit convex optimization against an adaptive adversary. We aim to minimize the average regret on $M$ machines working in parallel over $T$ rounds with $R$ intermittent communications. Assuming the underlying cost functions are convex and can be generated adaptively, our results show that collaboration is not beneficial when the machines have access to the first-order gradient information at the queried points. This is in contrast to the case for stochastic functions, where each machine samples the cost functions from a fixed distribution. Furthermore, we delve into the more challenging setting of federated online optimization with bandit (zeroth-order) feedback, where the machines can only access values of the cost functions at the queried points. The key finding here is identifying the high-dimensional regime where collaboration is beneficial and may even lead to a linear speedup in the number of machines. We further illustrate our findings through federated adversarial linear bandits by developing novel distributed single and two-point feedback algorithms. Our work is the first attempt towards a systematic understanding of federated online optimization with limited feedback, and it attains tight regret bounds in the intermittent communication setting for both first and zeroth-order feedback. Our results thus bridge the gap between stochastic and adaptive settings in federated online optimization.

View on arXiv PDF

Similar