LG CV · Jul 2, 2021

Convolutional Neural Bandit for Visual-aware Recommendation

arXiv:2107.07438v28.46 citations

Originality: Incremental advance

AI Analysis

This paper addresses the exploration-exploitation dilemma in online recommendation and advertising for businesses that display images. It is an incremental improvement that integrates CNNs into bandit algorithms.

The paper tackles the problem of visual-aware recommendation by proposing a contextual bandit algorithm that uses a convolutional neural network to learn reward functions with an upper confidence bound for exploration, achieving a near-optimal regret bound of $\tilde{\mathcal{O}}(\sqrt{T})$ and outperforming state-of-the-art UCB-based algorithms on real-world image datasets.
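
For readers unfamiliar with the notation, the regret bound can be read against the standard definition of cumulative regret in a contextual bandit. The symbols below ($r_{t,a}$ for the reward of arm $a$ at round $t$, $a_t$ for the arm chosen, $a_t^*$ for the best arm) are assumed here for illustration and may differ from the paper's notation.

$$
R_T = \sum_{t=1}^{T} \left( \mathbb{E}[r_{t,a_t^*}] - \mathbb{E}[r_{t,a_t}] \right), \qquad a_t^* = \arg\max_{a} \mathbb{E}[r_{t,a}]
$$

The $\tilde{\mathcal{O}}$ notation hides polylogarithmic factors in $T$. A bound of $\tilde{\mathcal{O}}(\sqrt{T})$ grows sublinearly, so the average per-round regret $R_T / T$ tends to zero as $T$ grows.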

Online recommendation and advertising are ubiquitous in web business, and image display is one of the most commonly used formats for interacting with customers. Contextual multi-armed bandits have been applied successfully to advertising, where they address the exploration-exploitation dilemma in the recommendation procedure. Inspired by visual-aware recommendation, in this paper we propose a contextual bandit algorithm in which a convolutional neural network (CNN) learns the reward function, along with an upper confidence bound (UCB) for exploration. We also prove a near-optimal regret bound $\tilde{\mathcal{O}}(\sqrt{T})$ when the network is over-parameterized, and establish strong connections with the convolutional neural tangent kernel (CNTK). Finally, we evaluate the empirical performance of the proposed algorithm and show that it outperforms other state-of-the-art UCB-based bandit algorithms on real-world image data sets.
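
The idea in the abstract, a CNN reward estimate plus a UCB exploration bonus, can be illustrated with a minimal sketch. This is not the authors' implementation. The tiny network (one convolutional layer, ReLU, global average pooling), the simulated reward `true_reward`, and the hyperparameters `gamma`, `lam`, and `lr` are all assumptions made for illustration. The bonus uses the network gradient as a feature vector, in the style of neural UCB methods, and the network is trained with one SGD step per round for brevity, which may differ from the paper's training procedure.

```python
import numpy as np

K, C = 3, 4  # filter size and number of filters (hypothetical values)

def patches(img):
    """All K x K patches of a 2-D image, one flattened patch per row."""
    win = np.lib.stride_tricks.sliding_window_view(img, (K, K))
    return win.reshape(-1, K * K)

def forward_and_grad(img, W, a):
    """Tiny CNN f(x) = sum_c a[c] * mean_p relu(patch_p . W[c]).

    Returns the scalar output and its gradient w.r.t. all parameters (W, then a).
    """
    X = patches(img)                    # (n_patches, K*K)
    pre = X @ W.T                       # (n_patches, C) pre-activations
    pooled = np.maximum(pre, 0.0).mean(axis=0)  # global average pooling, (C,)
    f = float(a @ pooled)
    # df/dW[c] = a[c] * mean over patches of 1[pre > 0] * patch
    gW = a[:, None] * ((pre > 0).T.astype(float) @ X) / X.shape[0]
    return f, np.concatenate([gW.ravel(), pooled])  # df/da = pooled

def run(T=200, n_arms=4, size=8, gamma=0.5, lam=1.0, lr=0.05, seed=0):
    """Simulate T rounds and return cumulative (expected) regret."""
    rng = np.random.default_rng(seed)
    W = rng.normal(0, 1 / K, (C, K * K))
    a = rng.normal(0, 1 / np.sqrt(C), C)
    Z_inv = np.eye(W.size + a.size) / lam   # inverse of Z = lam * I
    w_star = rng.normal(0, 1, K * K)        # hidden filter of the simulated environment

    def true_reward(img):
        # Assumed ground-truth mean reward; unknown to the learner.
        return float(np.tanh(patches(img) @ w_star).mean())

    regret = 0.0
    for _ in range(T):
        arms = rng.normal(0, 1, (n_arms, size, size))  # one image per arm
        outs = [forward_and_grad(x, W, a) for x in arms]
        # UCB score: predicted reward + gamma * sqrt(g^T Z^{-1} g)
        ucb = [f + gamma * np.sqrt(max(g @ Z_inv @ g, 0.0)) for f, g in outs]
        i = int(np.argmax(ucb))
        f, g = outs[i]
        means = [true_reward(x) for x in arms]
        r = means[i] + 0.1 * rng.normal()   # noisy observed reward
        regret += max(means) - means[i]
        # Sherman-Morrison update of Z^{-1} after Z += g g^T
        Zg = Z_inv @ g
        Z_inv -= np.outer(Zg, Zg) / (1.0 + g @ Zg)
        # One SGD step on the squared error of the chosen arm
        err = f - r
        W -= lr * err * g[:W.size].reshape(W.shape)
        a -= lr * err * g[W.size:]
    return regret

if __name__ == "__main__":
    print("cumulative regret:", run())
```

Calling `run()` with a larger `T` and comparing against `gamma=0.0` (pure exploitation) gives a rough feel for what the exploration bonus contributes in this toy setting. It says nothing about the paper's reported results.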
