Bo Li

9.8 | LG | May 30, 2023 | Code

Competing for Shareable Arms in Multi-Player Multi-Armed Bandits

Renzhe Xu, Haotian Wang, Xingxuan Zhang et al.

Competition for shareable and limited resources has long been studied with strategic agents. In reality, agents often have to learn the rewards of the resources and maximize them at the same time. To design an individualized competing policy, we model the competition between agents in a novel multi-player multi-armed bandit (MPMAB) setting where players are selfish and aim to maximize their own rewards. In addition, when several players pull the same arm, we assume that these players share the arm's reward equally in expectation. Under this setting, we first analyze the Nash equilibrium when arms' rewards are known. Subsequently, we propose a novel Selfish MPMAB with Averaging Allocation (SMAA) approach based on the equilibrium. We theoretically demonstrate that SMAA achieves a good regret guarantee for each player when all players follow the algorithm. Additionally, we establish that no single selfish player can significantly increase their rewards through deviation, nor can they detrimentally affect other players' rewards without incurring substantial losses for themselves. We finally validate the effectiveness of the method in extensive synthetic experiments.
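The equal-sharing assumption and the known-reward equilibrium can be illustrated with a small sketch. It is not the SMAA algorithm from the paper: the function names, the greedy construction, and the example reward values are all assumptions made for illustration. A player on an arm with expected reward `mu[k]` shared by `counts[k]` players is taken to receive `mu[k] / counts[k]` in expectation, and an allocation is a pure Nash equilibrium if no player gains by moving alone to another arm.

```python
def greedy_allocation(mu, n_players):
    """Hypothetical helper: add players one at a time to the arm that
    offers the newcomer the largest expected share mu[k] / (counts[k] + 1)."""
    counts = [0] * len(mu)
    for _ in range(n_players):
        best = max(range(len(mu)), key=lambda k: mu[k] / (counts[k] + 1))
        counts[best] += 1
    return counts


def is_nash(mu, counts):
    """Check that no single player can raise their expected share by
    switching arms, assuming equal sharing among players on an arm."""
    for k, m_k in enumerate(counts):
        if m_k == 0:
            continue  # nobody here to deviate
        current_share = mu[k] / m_k
        for j, m_j in enumerate(counts):
            if j != k and mu[j] / (m_j + 1) > current_share:
                return False  # a player on arm k would prefer arm j
    return True


# Illustrative (assumed) expected rewards for three arms and three players.
mu = [0.9, 0.5, 0.2]
allocation = greedy_allocation(mu, 3)
print(allocation, is_nash(mu, allocation))
```

With these assumed values, two players share the best arm and one takes the second arm, while putting all three players on the best arm fails the check because a player could move to the second arm and earn more.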

1.2 | DC | Jun 8, 2017

Clique Gossiping

Yang Liu, Bo Li, Brian Anderson et al.

This paper proposes and investigates a framework for clique gossip protocols. As complete subnetworks, cliques are ubiquitous in social, computer, and engineering networks. By clique gossiping, nodes interact with each other along a sequence of cliques. Clique-gossip protocols are defined as arbitrary linear node interactions where node states are vectors evolving as linear dynamical systems. Such protocols become clique-gossip averaging algorithms when node states are scalars under averaging rules. We generalize the classical notion of line graph to capture the essential node interaction structure induced by both the underlying network and the specific clique sequence. We prove a fundamental eigenvalue invariance principle for periodic clique-gossip protocols, which implies that any permutation of the clique sequence leads to the same spectrum for the overall state transition when the generalized line graph contains no cycle. We also prove that for a network with $n$ nodes, cliques of smaller sizes, determined by the factors of $n$, can always be constructed that lead to finite-time convergent clique-gossip averaging algorithms, provided $n$ is not a prime number. In particular, such finite-time convergence can be achieved with cliques of equal size $m$ if and only if $n$ is divisible by $m$ and $n$ and $m$ have exactly the same prime factors. We construct a provably fastest finite-time convergent clique-gossip algorithm using size-$m$ cliques. Additionally, the acceleration effects of clique-gossiping are illustrated via numerical examples.
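A minimal sketch of clique-gossip averaging with scalar states may make the finite-time claim concrete. The case shown is $n = 4$ nodes with cliques of size $m = 2$, where $n$ is divisible by $m$ and both have the single prime factor 2. The initial values and the particular clique sequence are assumptions chosen for illustration, not the construction from the paper. At each step, every node in the active clique replaces its state with the clique's average.

```python
from fractions import Fraction


def clique_gossip_step(state, clique):
    """Replace the state of every node in `clique` by the clique average."""
    avg = sum(state[i] for i in clique) / len(clique)
    new_state = list(state)
    for i in clique:
        new_state[i] = avg
    return new_state


def run_cliques(state, cliques):
    """Apply the averaging step along a sequence of cliques."""
    for clique in cliques:
        state = clique_gossip_step(state, clique)
    return state


# Assumed initial node states; exact rationals avoid rounding noise.
x0 = [Fraction(v) for v in (1, 2, 3, 6)]
# Assumed clique sequence: two disjoint pairs, then the two "cross" pairs.
cliques = [(0, 1), (2, 3), (0, 2), (1, 3)]

final = run_cliques(x0, cliques)
target = sum(x0) / len(x0)
print(final, target)
```

After these four steps every node holds the global average of the initial values, so one pass over the sequence is enough in this small example.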

2 Papers