IR | Jun 25, 2021

A Modern Perspective on Query Likelihood with Deep Generative Retrieval Models

arXiv:2106.13618v1 | 21 citations
Originality: Incremental advance
AI Analysis

This work addresses the need for more interpretable, uncertainty-aware retrieval systems. The advance is incremental: it recasts classical generative retrieval models with modern neural architectures.

The paper tackles the problem of document retrieval by introducing deep generative retrieval models that estimate relevance through query term generation probabilities, offering a probabilistic alternative to existing matching-based neural rankers. The proposed T-PGN model significantly outperforms other generative models on MS MARCO and TREC Deep Learning 2019 passage re-ranking tasks, and leveraging its uncertainty information improves cut-off prediction.

Existing neural ranking models follow the text matching paradigm, where document-to-query relevance is estimated through predicting the matching score. Drawing from the rich literature of classical generative retrieval models, we introduce and formalize the paradigm of deep generative retrieval models defined via the cumulative probabilities of generating query terms. This paradigm offers a grounded probabilistic view on relevance estimation while still enabling the use of modern neural architectures. In contrast to the matching paradigm, the probabilistic nature of generative rankers readily offers a fine-grained measure of uncertainty. We adopt several current neural generative models in our framework and introduce a novel generative ranker (T-PGN), which combines the encoding capacity of Transformers with the Pointer Generator Network model. We conduct an extensive set of evaluation experiments on passage retrieval, leveraging the MS MARCO Passage Re-ranking and TREC Deep Learning 2019 Passage Re-ranking collections. Our results show the significantly higher performance of the T-PGN model when compared with other generative models. Lastly, we demonstrate that exploiting the uncertainty information of deep generative rankers opens new perspectives to query/collection understanding, and significantly improves the cut-off prediction task.
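The query likelihood idea in the abstract can be illustrated with a minimal sketch. It assumes the generation probability of a query factorizes over its terms, so a document's score is the sum of the log-probabilities of the query terms given that document. The function names and the per-term probabilities below are hypothetical toy values, not outputs of T-PGN or any model from the paper; in practice they would come from a neural generative model conditioned on the document.

```python
import math


def query_log_likelihood(term_probs):
    """Sum the log-probabilities of the query terms given one document.

    term_probs: per-term generation probabilities, each assumed to be > 0.
    (Hypothetical input; a real system would obtain these from a generative model.)
    """
    return sum(math.log(p) for p in term_probs)


def rank_by_query_likelihood(candidates):
    """Order document ids by descending query log-likelihood.

    candidates: dict mapping doc id -> list of per-term probabilities.
    """
    scores = {doc_id: query_log_likelihood(probs)
              for doc_id, probs in candidates.items()}
    return sorted(scores, key=scores.get, reverse=True)


# Toy values for a three-term query, chosen only for illustration.
candidates = {
    "d1": [0.5, 0.4, 0.5],
    "d2": [0.1, 0.2, 0.1],
    "d3": [0.5, 0.5, 0.5],
}
ranking = rank_by_query_likelihood(candidates)
print(ranking)
```

Working in log space avoids numerical underflow for long queries. Because each term has its own probability, the same per-term values could also feed the kind of uncertainty analysis the abstract mentions, although this sketch does not attempt it.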

Code Implementations: 1 repo
