LGMLMar 20, 2019

Contextual Bandits with Random Projection

arXiv:1903.08600v11 citations
Originality Incremental advance
AI Analysis

This work addresses efficiency and performance issues in sequential decision-making problems like online advertisements for high-dimensional data, representing an incremental improvement over prior methods.

The paper tackles the challenges of high time-complexity and large regret bounds in high-dimensional contextual bandits by developing the CBRAP algorithm, which uses random projection to reduce dimensionality and achieves a linear upper regret bound associated with reduced dimensions.

Contextual bandits with linear payoffs, which are also known as linear bandits, provide a powerful alternative for solving practical problems of sequential decisions, e.g., online advertisements. In the era of big data, contextual data usually tend to be high-dimensional, which leads to new challenges for traditional linear bandits mostly designed for the setting of low-dimensional contextual data. Due to the curse of dimensionality, there are two challenges in most of the current bandit algorithms: the first is high time-complexity; and the second is extreme large upper regret bounds with high-dimensional data. In this paper, in order to attack the above two challenges effectively, we develop an algorithm of Contextual Bandits via RAndom Projection (\texttt{CBRAP}) in the setting of linear payoffs, which works especially for high-dimensional contextual data. The proposed \texttt{CBRAP} algorithm is time-efficient and flexible, because it enables players to choose an arm in a low-dimensional space, and relaxes the sparsity assumption of constant number of non-zero components in previous work. Besides, we provide a linear upper regret bound for the proposed algorithm, which is associated with reduced dimensions.

Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.

Your Notes