IR · CL · LG · Apr 27, 2023

Multivariate Representation Learning for Information Retrieval

arXiv:2304.14522v1 · 8 citations · h-index: 43
Originality: Incremental advance
AI Analysis

This work targets a limitation of dense retrieval, the use of a single vector per query and document, by learning distributional representations instead. It builds incrementally on existing bi-encoder architectures.

The paper tackles the problem of dense retrieval by proposing a framework that learns multivariate normal distributions instead of vector representations for queries and documents, using negative multivariate KL divergence for similarity computation. Experimental results show significant improvements over competitive dense retrieval models across multiple datasets.
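For reference, the similarity described above has a closed form when the covariance matrices are diagonal. The expression below is the standard KL divergence between two such normals. Two things are assumptions made here for illustration and are not stated in this summary: the divergence is taken from the query distribution Q to the document distribution D, and the symbols are named as shown, with k the dimensionality.

```latex
\mathrm{sim}(Q, D) = -\,\mathrm{KL}(Q \,\|\, D)
= -\frac{1}{2} \sum_{i=1}^{k} \left[
    \log \frac{\sigma_{D,i}^{2}}{\sigma_{Q,i}^{2}}
    + \frac{\sigma_{Q,i}^{2} + (\mu_{Q,i} - \mu_{D,i})^{2}}{\sigma_{D,i}^{2}}
    - 1
\right]
```

The score is 0 when the two distributions coincide and negative otherwise, so a higher score means a closer match.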

Dense retrieval models use bi-encoder network architectures for learning query and document representations. These representations are often in the form of a vector representation and their similarities are often computed using the dot product function. In this paper, we propose a new representation learning framework for dense retrieval. Instead of learning a vector for each query and document, our framework learns a multivariate distribution and uses negative multivariate KL divergence to compute the similarity between distributions. For simplicity and efficiency reasons, we assume that the distributions are multivariate normals and then train large language models to produce mean and variance vectors for these distributions. We provide a theoretical foundation for the proposed framework and show that it can be seamlessly integrated into the existing approximate nearest neighbor algorithms to perform retrieval efficiently. We conduct an extensive suite of experiments on a wide range of datasets, and demonstrate significant improvements compared to competitive dense retrieval models.
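The claim that the framework plugs into approximate nearest neighbor search can be illustrated with a minimal sketch. Assuming diagonal covariances and a query-to-document KL direction (both assumptions of this sketch), the negative KL divergence can be rewritten as a dot product between a query-side vector and a document-side vector, plus a term that depends only on the query and so does not change the ranking. The function names below are hypothetical and are not taken from the paper's code.

```python
import numpy as np

def neg_kl_diag(mu_q, var_q, mu_d, var_d):
    """Negative KL(Q || D) for normals with diagonal covariance (assumed direction)."""
    kl = 0.5 * np.sum(
        np.log(var_d / var_q) + (var_q + (mu_q - mu_d) ** 2) / var_d - 1.0
    )
    return -kl

def query_vector(mu_q, var_q):
    """Hypothetical query-side vector of size 3k + 1."""
    return np.concatenate(([1.0], var_q, mu_q ** 2, mu_q))

def doc_vector(mu_d, var_d):
    """Hypothetical document-side vector of size 3k + 1, computable offline."""
    # Scalar that depends only on the document.
    prior = -0.5 * np.sum(np.log(var_d) + mu_d ** 2 / var_d)
    return np.concatenate(([prior], -0.5 / var_d, -0.5 / var_d, mu_d / var_d))

def query_constant(var_q):
    """Query-only term dropped from the dot product; it does not affect ranking."""
    return 0.5 * np.sum(np.log(var_q)) + 0.5 * var_q.size

# Toy check on random data: the dot product ranks documents like negative KL.
rng = np.random.default_rng(0)
k, n_docs = 8, 5
mu_q, var_q = rng.normal(size=k), rng.uniform(0.5, 2.0, size=k)
docs = [(rng.normal(size=k), rng.uniform(0.5, 2.0, size=k)) for _ in range(n_docs)]

exact = np.array([neg_kl_diag(mu_q, var_q, m, v) for m, v in docs])
dot = np.array([query_vector(mu_q, var_q) @ doc_vector(m, v) for m, v in docs])

assert np.allclose(exact, dot + query_constant(var_q))
assert np.array_equal(np.argsort(-exact), np.argsort(-dot))
```

Because the document-side vectors depend only on the document, they could be precomputed and stored in any maximum inner product index, with the query-side vector built at search time.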
