Online Distillation for Pseudo-Relevance Feedback
This work aims to improve both the efficiency and the effectiveness of retrieval in search systems, though the contribution appears incremental because it builds on existing distillation and feedback methods.
The paper tackles the problem of improving neural search models by introducing online distillation for pseudo-relevance feedback: a lexical model is distilled from the neural re-ranking results for a given query, replicates that re-ranking, and enables an efficient second retrieval stage. The approach shows favorable performance compared to established techniques.
Model distillation has emerged as a prominent technique to improve neural search models. To date, distillation has taken an offline approach, wherein a new neural model is trained to predict relevance scores between arbitrary queries and documents. In this paper, we explore a departure from this offline distillation strategy by investigating whether a model for a specific query can be effectively distilled from neural re-ranking results (i.e., distilling in an online setting). Indeed, we find that a lexical model distilled online can reasonably replicate the re-ranking of a neural model. More importantly, these models can be used as queries that execute efficiently on indexes. This second retrieval stage can enrich the pool of documents for re-ranking by identifying documents that were missed in the first retrieval stage. Empirically, we show that this approach performs favourably when compared with established pseudo-relevance feedback techniques, dense retrieval methods, and sparse-dense ensemble "hybrid" approaches.
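To make the idea of distilling a per-query lexical model concrete, the following is a minimal sketch and not necessarily the training procedure used in the paper. It assumes a bag-of-words linear student fitted by ridge-regularised gradient descent to the scores a neural re-ranker assigned to the first-stage results. The function names (`distill_lexical_model`, `top_terms`, `score`), the hyperparameters, and the toy documents and teacher scores are all hypothetical.

```python
from collections import Counter


def tokenize(text):
    # Simplistic whitespace tokeniser (assumption: a real system would
    # reuse the index's own analyser).
    return text.lower().split()


def distill_lexical_model(docs, teacher_scores, epochs=500, lr=0.05, l2=0.01):
    """Fit one weight per term so that the summed term weights of each
    document approximate the teacher (neural re-ranker) score."""
    bags = [Counter(tokenize(d)) for d in docs]
    vocab = sorted({t for bag in bags for t in bag})
    weights = {t: 0.0 for t in vocab}
    n = len(docs)
    for _ in range(epochs):
        # Gradient of mean squared error plus an L2 penalty.
        grads = {t: l2 * weights[t] for t in vocab}
        for bag, target in zip(bags, teacher_scores):
            pred = sum(weights[t] * c for t, c in bag.items())
            err = pred - target
            for t, c in bag.items():
                grads[t] += err * c / n
        for t in vocab:
            weights[t] -= lr * grads[t]
    return weights


def top_terms(weights, k):
    # Keep the k highest positively weighted terms; these could be issued
    # as a weighted query against a lexical index in a second stage.
    ranked = sorted(weights.items(), key=lambda kv: kv[1], reverse=True)
    return [(t, w) for t, w in ranked[:k] if w > 0]


def score(weights, doc):
    # Student score: weighted sum of term counts (unseen terms score 0).
    return sum(weights.get(t, 0.0) * c for t, c in Counter(tokenize(doc)).items())


# Toy first-stage results with hypothetical re-ranker scores.
docs = [
    "neural ranking with transformers",
    "transformers for neural retrieval",
    "cooking pasta with tomato sauce",
    "tomato garden tips",
]
teacher_scores = [0.9, 0.8, 0.1, 0.0]

weights = distill_lexical_model(docs, teacher_scores)
expanded_query = top_terms(weights, k=2)
print(expanded_query)
```

In this sketch the student is trained only on the documents retrieved for one query, which is what makes the distillation "online". The positively weighted terms then act as a weighted query that can surface documents the first stage missed.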