LG · Sep 8, 2017

Multi-level Feedback Web Links Selection Problem: Learning and Optimization

Kechao Cai, Kun Chen, Longbo Huang, John C. S. Lui

arXiv:1709.02664v1 · 1.43 citations

Originality: Incremental advance

AI Analysis

This work addresses a domain-specific problem for website optimization, offering a novel approach to link selection with provable guarantees, though it is incremental in applying bandit methods to a new structured setting.

The paper tackles the problem of selecting web links to maximize compound revenue while maintaining a click-through rate threshold, by modeling it as a constrained multi-armed bandit problem with multi-level feedback structures. The proposed LExp algorithm achieves sub-linear regret and violation bounds, outperforming state-of-the-art methods in experiments on real-world datasets.

Selecting the right web links for a website is important because appropriate links can not only make the site more attractive but also increase its revenue. In this work, we first show that web links have an intrinsic \emph{multi-level feedback structure}. For example, consider a $2$-level feedback web link: the $1$st level feedback provides the Click-Through Rate (CTR) and the $2$nd level feedback provides the potential revenue, which collectively produce the compound $2$-level revenue. We consider the context-free links selection problem of selecting links for a homepage so as to maximize the total compound $2$-level revenue while keeping the total $1$st level feedback above a preset threshold. We further generalize the problem to links with an $n$-level ($n \ge 2$) feedback structure. The key challenge is that the links' multi-level feedback structures are unobservable unless the links are selected on the homepage. To the best of our knowledge, we are the first to model the links selection problem as a constrained multi-armed bandit problem and to design an effective links selection algorithm that learns the links' multi-level structure with provable \emph{sub-linear} regret and violation bounds. We uncover the multi-level feedback structures of web links in two real-world datasets. We also conduct extensive experiments on these datasets to compare our proposed \textbf{LExp} algorithm with two state-of-the-art context-free bandit algorithms, and show that the \textbf{LExp} algorithm is the most effective in links selection while satisfying the constraint.
