LGMar 20, 2015

Networked Stochastic Multi-Armed Bandits with Combinatorial Strategies

arXiv:1503.06169v111 citations
Originality Incremental advance
AI Analysis

This work addresses decision-making in networked environments like online social networks, but it is incremental as it builds on existing bandit frameworks with specific extensions.

The paper tackles networked combinatorial bandit problems by extending classical multi-armed bandits to include side bonuses from neighbors, such as in social networks, and presents zero-regret policies for four scenarios involving single-play and combinatorial-play with side observation or reward, validated through simulations.

In this paper, we investigate a largely extended version of classical MAB problem, called networked combinatorial bandit problems. In particular, we consider the setting of a decision maker over a networked bandits as follows: each time a combinatorial strategy, e.g., a group of arms, is chosen, and the decision maker receives a reward resulting from her strategy and also receives a side bonus resulting from that strategy for each arm's neighbor. This is motivated by many real applications such as on-line social networks where friends can provide their feedback on shared content, therefore if we promote a product to a user, we can also collect feedback from her friends on that product. To this end, we consider two types of side bonus in this study: side observation and side reward. Upon the number of arms pulled at each time slot, we study two cases: single-play and combinatorial-play. Consequently, this leaves us four scenarios to investigate in the presence of side bonus: Single-play with Side Observation, Combinatorial-play with Side Observation, Single-play with Side Reward, and Combinatorial-play with Side Reward. For each case, we present and analyze a series of \emph{zero regret} polices where the expect of regret over time approaches zero as time goes to infinity. Extensive simulations validate the effectiveness of our results.

Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.

Your Notes