IR · Mar 26

Unbiased Multimodal Reranking for Long-Tail Short-Video Search

arXiv:2603.24975 · 79.0 · h-index: 6
AI Analysis

The work addresses the Matthew effect on long-tail queries in short-video search platforms such as Kuaishou, improving content quality and user experience. Its contribution is incremental, since it builds on existing LLM advances.

The paper tackles low-quality content in long-tail short-video search, a problem caused by sparse user behavior data. It proposes an LLM-driven multimodal reranking framework that estimates user experience without real interactions, and it reports improvements in offline metrics such as AUC and NDCG@K as well as online gains in user experience and consumption metrics in A/B tests.

With Kuaishou serving hundreds of millions of searches daily, the quality of short-video search is paramount. However, it suffers from a severe Matthew effect on long-tail queries: sparse user behavior data causes models to amplify low-quality content such as clickbait and shallow videos. Recent advances in Large Language Models (LLMs) offer a new paradigm, as their inherent world knowledge provides a powerful mechanism for assessing content quality that does not depend on sparse user interactions. To this end, we propose an LLM-driven multimodal reranking framework that estimates user experience without real user behavior. The approach involves a two-stage training process: the first stage uses multimodal evidence to construct high-quality annotations for supervised fine-tuning, while the second stage incorporates pairwise preference optimization to help the model learn partial orderings among candidates. At inference time, the resulting experience scores are used to promote high-quality but underexposed videos in reranking, and they further guide page-level optimization through reinforcement learning. Experiments show that the proposed method achieves consistent improvements over strong baselines in offline metrics including AUC, NDCG@K, and human preference judgement. An online A/B test covering 15% of traffic further demonstrates gains in both user experience and consumption metrics, confirming the practical value of the approach in long-tail video search scenarios.
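
The abstract does not give formulas, so the following is only an illustrative sketch of three ideas it names: a pairwise preference objective, reranking with an experience score, and NDCG@K. Everything specific here is an assumption rather than the authors' method: the Bradley-Terry style logistic loss, the linear blend with weight `alpha`, the linear-gain form of DCG, and all function names and toy values are hypothetical.

```python
import math
from typing import List, Sequence, Tuple

# (video_id, base_ranker_score, llm_experience_score); field layout is hypothetical.
Candidate = Tuple[str, float, float]


def pairwise_preference_loss(score_preferred: float, score_rejected: float) -> float:
    """Logistic (Bradley-Terry style) loss on one preference pair.

    Assumed stand-in for the paper's pairwise objective: the loss is small
    when the preferred candidate is scored above the rejected one.
    """
    margin = score_preferred - score_rejected
    # log(1 + exp(-margin)), written in a numerically stable form.
    return max(-margin, 0.0) + math.log1p(math.exp(-abs(margin)))


def rerank(candidates: Sequence[Candidate], alpha: float = 0.5) -> List[str]:
    """Order candidates by a linear blend of base score and experience score.

    alpha = 0 keeps the base ranking; larger alpha gives more weight to the
    experience score, which can lift high-quality but underexposed videos.
    """
    blended = [
        (video_id, (1.0 - alpha) * base + alpha * experience)
        for video_id, base, experience in candidates
    ]
    blended.sort(key=lambda item: item[1], reverse=True)
    return [video_id for video_id, _ in blended]


def dcg_at_k(relevances: Sequence[float], k: int) -> float:
    """Discounted cumulative gain over the top-k positions (linear gain)."""
    return sum(rel / math.log2(rank + 2) for rank, rel in enumerate(relevances[:k]))


def ndcg_at_k(relevances: Sequence[float], k: int) -> float:
    """DCG@k normalised by the DCG@k of the ideal (descending) ordering."""
    ideal = dcg_at_k(sorted(relevances, reverse=True), k)
    return dcg_at_k(relevances, k) / ideal if ideal > 0 else 0.0


if __name__ == "__main__":
    toy = [("a", 0.9, 0.2), ("b", 0.6, 0.9), ("c", 0.5, 0.4)]
    print(rerank(toy, alpha=0.0))  # base ranking only
    print(rerank(toy, alpha=0.5))  # experience score promotes "b"
    print(ndcg_at_k([1, 3], k=2))  # less than 1 because the best item is second
```

In the toy data, video "b" has a modest base score but a high experience score, so it moves to the top once `alpha` is raised from 0 to 0.5. How the real system combines scores, and how the scores feed the page-level reinforcement learning step, is not stated in the abstract.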

Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.
