Weak Supervision for Improved Precision in Search Systems
This paper addresses the challenge of reducing labeling costs for search engine developers, but the contribution appears incremental because it builds on existing weak supervision and Learning to Rank methods.
The paper tackles the problem of costly labeled datasets for search engines by proposing a weak supervision approach to infer query-document pair quality, applying it in a Learning to Rank framework to enhance precision in a large-scale search system.
Labeled datasets are essential for modern search engines, which increasingly rely on supervised learning methods like Learning to Rank and massive amounts of data to power deep learning models. However, creating these datasets is both time-consuming and costly, leading to the common use of user click and activity logs as proxies for relevance. In this paper, we present a weak supervision approach to infer the quality of query-document pairs and apply it within a Learning to Rank framework to enhance the precision of a large-scale search system.
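The abstract does not spell out how the weak labels are produced, so the following is only a minimal sketch of the general weak supervision idea, not the authors' method. It assumes a handful of noisy heuristics ("labeling functions") that each vote on a query-document pair or abstain, combined here by a simple majority vote. The `Pair` fields, the three heuristics, and every threshold are hypothetical and chosen purely for illustration.

```python
from collections import Counter
from dataclasses import dataclass

# Label values used by every labeling function (illustrative convention).
ABSTAIN, IRRELEVANT, RELEVANT = -1, 0, 1


@dataclass
class Pair:
    """A query-document pair with hypothetical log-derived signals."""
    query: str
    title: str
    impressions: int
    clicks: int
    mean_dwell_seconds: float


def lf_click_rate(p: Pair) -> int:
    # Assumed heuristic: trust click-through rate only with enough impressions.
    if p.impressions < 20:
        return ABSTAIN
    ctr = p.clicks / p.impressions
    if ctr >= 0.30:
        return RELEVANT
    if ctr <= 0.02:
        return IRRELEVANT
    return ABSTAIN


def lf_dwell_time(p: Pair) -> int:
    # Assumed heuristic: long dwell after a click suggests a useful result.
    if p.clicks == 0:
        return ABSTAIN
    return RELEVANT if p.mean_dwell_seconds >= 30 else IRRELEVANT


def lf_title_overlap(p: Pair) -> int:
    # Assumed heuristic: vote only when the title matches all or none of the query terms.
    query_terms = set(p.query.lower().split())
    title_terms = set(p.title.lower().split())
    if not query_terms:
        return ABSTAIN
    overlap = len(query_terms & title_terms) / len(query_terms)
    if overlap == 1.0:
        return RELEVANT
    if overlap == 0.0:
        return IRRELEVANT
    return ABSTAIN


LABELING_FUNCTIONS = [lf_click_rate, lf_dwell_time, lf_title_overlap]


def weak_label(p: Pair) -> int:
    """Combine the non-abstaining votes by majority; abstain on ties or no votes."""
    votes = [v for v in (lf(p) for lf in LABELING_FUNCTIONS) if v != ABSTAIN]
    counts = Counter(votes)
    if counts[RELEVANT] == counts[IRRELEVANT]:
        return ABSTAIN
    return RELEVANT if counts[RELEVANT] > counts[IRRELEVANT] else IRRELEVANT


if __name__ == "__main__":
    pairs = [
        Pair("red running shoes", "Red Running Shoes for Men", 100, 40, 55.0),
        Pair("red running shoes", "Garden hose 50 ft", 200, 2, 3.0),
        Pair("red running shoes", "Running socks", 5, 0, 0.0),
    ]
    for p in pairs:
        print(p.title, "->", weak_label(p))
```

Pairs that receive a non-abstain label could then serve as training targets for a Learning to Rank model, while abstained pairs are dropped. Majority voting is the simplest possible combiner; a production system would more plausibly weight each heuristic by an estimate of its accuracy.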