LGITMLFeb 25, 2021

Federated Multi-armed Bandits with Personalization

arXiv:2102.13101v1101 citations
Originality Incremental advance
AI Analysis

This work addresses the challenge of personalized decision-making in distributed systems, representing an incremental advancement in federated bandit algorithms.

The paper tackles the problem of balancing generalization and personalization in federated multi-armed bandits by proposing a new framework and algorithm, achieving an O(log(T)) regret with theoretical and experimental validation.

A general framework of personalized federated multi-armed bandits (PF-MAB) is proposed, which is a new bandit paradigm analogous to the federated learning (FL) framework in supervised learning and enjoys the features of FL with personalization. Under the PF-MAB framework, a mixed bandit learning problem that flexibly balances generalization and personalization is studied. A lower bound analysis for the mixed model is presented. We then propose the Personalized Federated Upper Confidence Bound (PF-UCB) algorithm, where the exploration length is chosen carefully to achieve the desired balance of learning the local model and supplying global information for the mixed learning objective. Theoretical analysis proves that PF-UCB achieves an $O(\log(T))$ regret regardless of the degree of personalization, and has a similar instance dependency as the lower bound. Experiments using both synthetic and real-world datasets corroborate the theoretical analysis and demonstrate the effectiveness of the proposed algorithm.

Code Implementations1 repo
Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.

Your Notes