Exponentiated Gradient LINUCB for Contextual Multi-Armed Bandits
This work addresses the challenge of efficient exploration in contextual bandits for applications like online recommendation systems, but appears incremental as it builds on existing LINUCB methods.
The paper tackled the problem of improving exploration in contextual multi-armed bandits by proposing Exponentiated Gradient LINUCB, which outperformed surveyed algorithms in evaluations using real online event log data.
We present Exponentiated Gradient LINUCB, an algorithm for contextual multi-armed bandits. The algorithm uses Exponentiated Gradient to find the optimal degree of exploration for LINUCB. We evaluate it on real online event log data within a deliberately designed offline simulation framework. The experimental results demonstrate that our algorithm outperforms the surveyed algorithms.
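The abstract does not spell out how Exponentiated Gradient is coupled to LINUCB, so the following is only a minimal sketch under stated assumptions, not the authors' implementation. It assumes a finite set of candidate exploration coefficients, an EXP3-style exponentiated-gradient update over those candidates, rewards in [0, 1], and a disjoint linear model per arm. The class names `ExpGradAlphaSelector` and `LinUCBArm`, the helper `step`, and the parameters `eta` and `gamma` are all hypothetical names introduced for illustration.

```python
import math
import random


class ExpGradAlphaSelector:
    """EXP3-style exponentiated-gradient chooser over candidate alphas (assumed design)."""

    def __init__(self, candidates, eta=0.1, gamma=0.1, seed=0):
        self.candidates = list(candidates)
        self.eta = eta      # learning rate (hypothetical default)
        self.gamma = gamma  # uniform mixing rate (hypothetical default)
        self.weights = [1.0] * len(self.candidates)
        self.rng = random.Random(seed)

    def probabilities(self):
        total = sum(self.weights)
        k = len(self.weights)
        return [(1 - self.gamma) * w / total + self.gamma / k for w in self.weights]

    def sample(self):
        probs = self.probabilities()
        idx = self.rng.choices(range(len(probs)), weights=probs)[0]
        return idx, self.candidates[idx], probs[idx]

    def update(self, idx, prob, reward):
        # Multiplicative update with an importance-weighted reward estimate;
        # only the candidate that was actually played is updated.
        self.weights[idx] *= math.exp(self.eta * reward / prob)
        total = sum(self.weights)  # renormalize to keep the weights bounded
        self.weights = [w / total for w in self.weights]


class LinUCBArm:
    """One arm of disjoint LinUCB, storing A^{-1} and b in plain Python lists."""

    def __init__(self, dim):
        self.A_inv = [[1.0 if i == j else 0.0 for j in range(dim)] for i in range(dim)]
        self.b = [0.0] * dim

    def _matvec(self, vec):
        return [sum(row[j] * vec[j] for j in range(len(vec))) for row in self.A_inv]

    def ucb(self, x, alpha):
        theta = self._matvec(self.b)             # ridge estimate A^{-1} b
        mean = sum(t * xi for t, xi in zip(theta, x))
        quad = sum(xi * vi for xi, vi in zip(x, self._matvec(x)))
        return mean + alpha * math.sqrt(max(quad, 0.0))

    def update(self, x, reward):
        # Sherman-Morrison rank-one update of A^{-1} for A <- A + x x^T.
        v = self._matvec(x)
        denom = 1.0 + sum(xi * vi for xi, vi in zip(x, v))
        for i in range(len(x)):
            for j in range(len(x)):
                self.A_inv[i][j] -= v[i] * v[j] / denom
        self.b = [bi + reward * xi for bi, xi in zip(self.b, x)]


def step(arms, selector, x, reward_fn):
    """One round: sample alpha, play the arm with the highest UCB, update both learners."""
    idx, alpha, prob = selector.sample()
    chosen = max(range(len(arms)), key=lambda a: arms[a].ucb(x, alpha))
    reward = reward_fn(chosen)  # assumed to lie in [0, 1]
    arms[chosen].update(x, reward)
    selector.update(idx, prob, reward)
    return chosen, reward
```

In an offline simulation over logged events, `reward_fn` would be backed by the log rather than a live system; how logged events are matched to the chosen arm is part of the evaluation framework and is not shown here.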