SY SY OC
Jan 19, 2018

Restless Bandits with Constrained Arms: Applications in Social and Information Networks

Varun Mehta, Rahul Meshram, Kesav Kaza, S. N. Merchant

arXiv:1801.03634
h-index: 24

AI Analysis

This work addresses the problem of efficient information gathering in dynamic social networks for decision-making agents, but the contribution is incremental as it applies existing RMAB techniques to a specific domain.

The paper formulates information gathering in social networks as a restless multi-armed bandit problem with partially observable states and proposes a Whittle index policy to maximize long-term cumulative reward. Numerical simulations show the index policy outperforms myopic and uniform random policies.

We study the problem of information gathering in a social network with dynamically available sources and time-varying quality of information. We formulate this problem as a restless multi-armed bandit (RMAB), in which the information quality of a source corresponds to the state of an arm. The decision-making agent does not know the quality of information from the sources a priori; instead, it maintains a belief about the quality of information from each source. This makes the problem an RMAB with partially observable states. The objective of the agent is to gather relevant information efficiently by contacting sources. We formulate this as an infinite-horizon discounted-reward problem, where the reward depends on the quality of information. We study Whittle's index policy, which determines the sequence in which arms are played with the aim of maximizing the long-term cumulative reward. Through numerical simulation, we illustrate the performance of the index policy and the myopic policy and compare both with a uniform random policy.
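The belief-based formulation above can be made concrete with a small sketch. Everything in it is an assumption for illustration and is not taken from the paper: each source is modeled as a hypothetical two-state arm (0 = poor, 1 = good) with transition probabilities `p01` and `p11`, a contacted source returns a binary "useful" signal with probability `rho0` or `rho1` depending on its hidden state, and the reward is 1 for a useful signal. The code shows only the belief update, a myopic policy, and a uniform random policy under a discounted reward; it does not compute the Whittle index, and the parameter values are made up.

```python
import random

def propagate(belief, p01, p11):
    """Predict P(good at t+1) from P(good at t) under the assumed Markov chain."""
    return belief * p11 + (1.0 - belief) * p01

def bayes_update(belief, success, rho0, rho1):
    """Posterior P(good) after a binary signal from a contacted source."""
    like_good = rho1 if success else 1.0 - rho1
    like_bad = rho0 if success else 1.0 - rho0
    num = belief * like_good
    return num / (num + (1.0 - belief) * like_bad)

def expected_reward(belief, arm):
    """Expected immediate reward of contacting a source with this belief."""
    return belief * arm["rho1"] + (1.0 - belief) * arm["rho0"]

def myopic_policy(beliefs, arms, rng):
    """Contact the source with the highest expected immediate reward."""
    return max(range(len(arms)), key=lambda i: expected_reward(beliefs[i], arms[i]))

def random_policy(beliefs, arms, rng):
    """Contact a source chosen uniformly at random."""
    return rng.randrange(len(arms))

def simulate(arms, policy, horizon=200, beta=0.95, seed=0):
    """Discounted reward over a finite horizon (truncation of the infinite sum)."""
    rng = random.Random(seed)
    states = [1 if rng.random() < 0.5 else 0 for _ in arms]
    beliefs = [0.5] * len(arms)
    total, discount = 0.0, 1.0
    for _ in range(horizon):
        i = policy(beliefs, arms, rng)
        rho = arms[i]["rho1"] if states[i] == 1 else arms[i]["rho0"]
        success = rng.random() < rho
        total += discount * (1.0 if success else 0.0)
        discount *= beta
        for j, arm in enumerate(arms):
            b = beliefs[j]
            if j == i:
                # Only the contacted source yields an observation.
                b = bayes_update(b, success, arm["rho0"], arm["rho1"])
            # Every arm evolves, played or not: this is what makes it restless.
            beliefs[j] = propagate(b, arm["p01"], arm["p11"])
            p_good = arm["p11"] if states[j] == 1 else arm["p01"]
            states[j] = 1 if rng.random() < p_good else 0
    return total

# Hypothetical parameters for three sources.
ARMS = [
    {"p01": 0.2, "p11": 0.8, "rho0": 0.2, "rho1": 0.8},
    {"p01": 0.1, "p11": 0.9, "rho0": 0.3, "rho1": 0.7},
    {"p01": 0.4, "p11": 0.6, "rho0": 0.1, "rho1": 0.9},
]

if __name__ == "__main__":
    print("myopic:", simulate(ARMS, myopic_policy))
    print("random:", simulate(ARMS, random_policy))
```

A single run with one seed says little about which policy is better; a fair comparison would average the discounted reward over many seeds and parameter settings.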

