Bandit Algorithm Driven by a Classical Random Walk and a Quantum Walk
This work addresses multi-armed bandit problems for optimization and decision-making applications, presenting an incremental improvement by applying quantum walks to a known framework.
The paper tackled the multi-armed bandit problem by proposing algorithms based on classical random walks and quantum walks, showing that the quantum walk-based model achieves higher performance under certain settings by linking exploration and exploitation to quantum walk behaviors.
Quantum walks (QWs) have a property that classical random walks (RWs) do not possess -- the coexistence of linear spreading and localization -- and this property is utilized to implement various kinds of applications. This paper proposes RW- and QW-based algorithms for multi-armed-bandit (MAB) problems. We show that, under some settings, the QW-based model realizes higher performance than the corresponding RW-based one by associating the two operations that make MAB problems difficult -- exploration and exploitation -- with these two behaviors of QWs.