LG · CR · IR · May 26, 2023

Adversarial Attacks on Online Learning to Rank with Click Feedback

arXiv:2305.17071v1 · 7 citations
Originality: Incremental advance
AI Analysis

This work addresses security vulnerabilities in OLTR systems. It is an incremental advance that extends prior attack results to handle bounded and discrete feedback.

The paper tackles adversarial attacks on online learning to rank (OLTR) algorithms, proposing attack strategies that manipulate learning agents into selecting a target item nearly all the time with minimal cumulative cost, as validated by experiments on synthetic and real data.

Online learning to rank (OLTR) is a sequential decision-making problem where a learning agent selects an ordered list of items and receives feedback through user clicks. Although potential attacks against OLTR algorithms may cause serious losses in real-world applications, little is known about adversarial attacks on OLTR. This paper studies attack strategies against multiple variants of OLTR. Our first result provides an attack strategy against the UCB algorithm on classical stochastic bandits with binary feedback, which solves the key issues caused by bounded and discrete feedback that previous works cannot handle. Building on this result, we design attack algorithms against UCB-based OLTR algorithms in position-based and cascade models. Finally, we propose a general attack strategy against any algorithm under the general click model. Each attack algorithm manipulates the learning agent into choosing the target attack item $T-o(T)$ times, incurring a cumulative cost of $o(T)$. Experiments on synthetic and real data further validate the effectiveness of our proposed attack algorithms.
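To make the binary-feedback setting concrete, the following is a minimal sketch of a click-poisoning attack on a standard UCB1 learner over Bernoulli arms. It is not the paper's attack algorithm: it uses a naive rule, assumed here purely for illustration, that flips every click on a non-target arm from 1 to 0 and counts each flip as one unit of cost. The function name `simulate_naive_attack`, the arm means, and the cost definition are all hypothetical choices.

```python
import math
import random


def simulate_naive_attack(means, target, horizon, seed=0):
    """Run UCB1 on Bernoulli arms while an attacker zeroes non-target clicks.

    Illustrative assumption: the attacker may only change a click from 1 to 0,
    so the feedback stays binary, and each flip costs one unit.
    Returns (pull counts per arm, total attack cost).
    """
    rng = random.Random(seed)
    k = len(means)
    pulls = [0] * k
    click_sums = [0.0] * k
    cost = 0

    for t in range(1, horizon + 1):
        if t <= k:
            # Pull every arm once so that all UCB indices are defined.
            arm = t - 1
        else:
            # Standard UCB1 index: empirical mean plus exploration bonus.
            arm = max(
                range(k),
                key=lambda i: click_sums[i] / pulls[i]
                + math.sqrt(2.0 * math.log(t) / pulls[i]),
            )

        click = 1 if rng.random() < means[arm] else 0

        # Naive attack: suppress every click observed on a non-target arm.
        if arm != target and click == 1:
            click = 0
            cost += 1

        pulls[arm] += 1
        click_sums[arm] += click

    return pulls, cost


if __name__ == "__main__":
    pulls, cost = simulate_naive_attack([0.9, 0.8, 0.3], target=2, horizon=10000)
    print("pulls per arm:", pulls)
    print("attack cost:", cost)
```

Under this rule the non-target arms always look like they have zero click rate, so UCB1 revisits them only through its shrinking exploration bonus, and the cost can never exceed the number of non-target pulls. The sketch only illustrates the objective stated in the abstract (many target pulls, small cumulative cost) and says nothing about how the paper's own strategies choose when to perturb feedback.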

