IRDec 12, 2019

SetRank: Learning a Permutation-Invariant Ranking Model for Information Retrieval

arXiv:1912.05891v2148 citations
Originality Highly original
AI Analysis

This addresses the challenge of capturing cross-document interactions while maintaining permutation invariance in ranking models for information retrieval, representing a novel method rather than an incremental improvement.

The paper tackles the problem of learning-to-rank for information retrieval by proposing SetRank, a model that directly learns a permutation-invariant ranking model for document sets, which significantly outperformed traditional and state-of-the-art neural IR models on three large-scale benchmarks.

In learning-to-rank for information retrieval, a ranking model is automatically learned from the data and then utilized to rank the sets of retrieved documents. Therefore, an ideal ranking model would be a mapping from a document set to a permutation on the set, and should satisfy two critical requirements: (1)~it should have the ability to model cross-document interactions so as to capture local context information in a query; (2)~it should be permutation-invariant, which means that any permutation of the inputted documents would not change the output ranking. Previous studies on learning-to-rank either design uni-variate scoring functions that score each document separately, and thus failed to model the cross-document interactions; or construct multivariate scoring functions that score documents sequentially, which inevitably sacrifice the permutation invariance requirement. In this paper, we propose a neural learning-to-rank model called SetRank which directly learns a permutation-invariant ranking model defined on document sets of any size. SetRank employs a stack of (induced) multi-head self attention blocks as its key component for learning the embeddings for all of the retrieved documents jointly. The self-attention mechanism not only helps SetRank to capture the local context information from cross-document interactions, but also to learn permutation-equivariant representations for the inputted documents, which therefore achieving a permutation-invariant ranking model. Experimental results on three large scale benchmarks showed that the SetRank significantly outperformed the baselines include the traditional learning-to-rank models and state-of-the-art Neural IR models.

Code Implementations2 repos
Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.

Your Notes