Observation-Free Attacks on Online Learning to Rank
This addresses a critical security problem for search engines and content recommenders using OLTR, representing a novel attack paradigm rather than an incremental improvement.
The paper tackles the vulnerability of online learning to rank (OLTR) algorithms to adversarial attacks by introducing a framework that promotes target items to top recommendations for most rounds while causing linear regret, with theoretical guarantees requiring only O(log T) manipulations and empirical validation on real-world data.
Online learning to rank (OLTR) plays a critical role in information retrieval and machine learning systems, with a wide range of applications in search engines and content recommenders. However, despite their extensive adoption, the susceptibility of OLTR algorithms to coordinated adversarial attacks remains poorly understood. In this work, we present a novel framework for attacking some of the widely used OLTR algorithms. Our framework is designed to promote a set of target items so that they appear in the list of top-K recommendations for T - o(T) rounds, while simultaneously inducing linear regret in the learning algorithm. We propose two novel attack strategies: CascadeOFA for CascadeUCB1 and PBMOFA for PBM-UCB . We provide theoretical guarantees showing that both strategies require only O(log T) manipulations to succeed. Additionally, we supplement our theoretical analysis with empirical results on real-world data.