IR · CL · Oct 4, 2021

A Proposed Conceptual Framework for a Representational Approach to Information Retrieval

arXiv:2110.01529v2 · 65 citations
Originality: Synthesis-oriented
AI Analysis

The paper provides a unified theoretical foundation for researchers in information retrieval and NLP. Its contribution is incremental: it organizes existing ideas rather than introducing new techniques.

The paper proposes a conceptual framework that unifies dense and sparse retrieval methods in information retrieval and natural language processing by splitting the text retrieval problem into a logical scoring model and a physical retrieval model. It shows how existing methods fit within this framework and uses the unified view to suggest open research questions.

This paper outlines a conceptual framework for understanding recent developments in information retrieval and natural language processing that attempts to integrate dense and sparse retrieval methods. I propose a representational approach that breaks the core text retrieval problem into a logical scoring model and a physical retrieval model. The scoring model is defined in terms of encoders, which map queries and documents into a representational space, and a comparison function that computes query-document scores. The physical retrieval model defines how a system produces the top-$k$ scoring documents from an arbitrarily large corpus with respect to a query. The scoring model can be further analyzed along two dimensions: dense vs. sparse representations and supervised (learned) vs. unsupervised approaches. I show that many recently proposed retrieval methods, including multi-stage ranking designs, can be seen as different parameterizations in this framework, and that a unified view suggests a number of open research questions, providing a roadmap for future work. As a bonus, this conceptual framework establishes connections to sentence similarity tasks in natural language processing and information access "technologies" prior to the dawn of computing.
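
The split described in the abstract can be illustrated with a minimal sketch. Nothing below comes from the paper itself: the names `encode`, `dot`, and `top_k` are hypothetical, the encoder is a toy unsupervised sparse (term-frequency) encoder, the comparison function is assumed to be an inner product, and the physical retrieval model is assumed to be a brute-force scan rather than an inverted index or approximate nearest-neighbor search.

```python
import heapq
from collections import Counter
from typing import Callable, Dict, List, Tuple

# Sparse representation: term -> weight (an assumption for this sketch).
Vector = Dict[str, float]


def encode(text: str) -> Vector:
    """Toy unsupervised sparse encoder: lowercased term frequencies."""
    return {term: float(n) for term, n in Counter(text.lower().split()).items()}


def dot(q: Vector, d: Vector) -> float:
    """Comparison function: inner product of two sparse vectors."""
    return sum(weight * d.get(term, 0.0) for term, weight in q.items())


def top_k(
    query: str,
    corpus: List[str],
    k: int,
    encode_query: Callable[[str], Vector] = encode,
    encode_doc: Callable[[str], Vector] = encode,
    compare: Callable[[Vector, Vector], float] = dot,
) -> List[Tuple[float, int]]:
    """Brute-force physical retrieval: score every document, keep the best k.

    Returns (score, document index) pairs, highest score first.
    """
    q_vec = encode_query(query)
    scored = ((compare(q_vec, encode_doc(doc)), i) for i, doc in enumerate(corpus))
    return heapq.nlargest(k, scored)


corpus = [
    "dense retrieval with learned encoders",
    "sparse retrieval with inverted indexes",
    "cooking pasta at home",
]
print(top_k("sparse retrieval", corpus, k=2))
```

In this sketch the scoring model is the triple (`encode_query`, `encode_doc`, `compare`), and `top_k` is the physical retrieval model. Swapping the encoders for learned dense ones, or replacing the brute-force scan with an index, changes one part without touching the other, which is the kind of parameterization the abstract describes.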

