CLIR · May 23, 2024

AGRaME: Any-Granularity Ranking with Multi-Vector Embeddings

arXiv:2405.15028v123 citations · h-index: 10 · EMNLP
Originality: Incremental advance
AI Analysis

The paper addresses the need for flexible ranking granularity in search applications, offering a method that incrementally improves on existing multi-vector approaches.

The paper tackles inflexible ranking granularity by introducing any-granularity ranking with multi-vector embeddings. It reports improved performance in applications such as sentence-level ranking and, in retrieval-augmented generation, surpasses prompt-driven citation generation.

Ranking is a fundamental and popular problem in search. However, existing ranking algorithms usually restrict the granularity of ranking to full passages or require a specific dense index for each desired level of granularity. Such lack of flexibility in granularity negatively affects many applications that can benefit from more granular ranking, such as sentence-level ranking for open-domain question-answering, or proposition-level ranking for attribution. In this work, we introduce the idea of any-granularity ranking, which leverages multi-vector embeddings to rank at varying levels of granularity while maintaining encoding at a single (coarser) level of granularity. We propose a multi-granular contrastive loss for training multi-vector approaches, and validate its utility with both sentences and propositions as ranking units. Finally, we demonstrate the application of proposition-level ranking to post-hoc citation addition in retrieval-augmented generation, surpassing the performance of prompt-driven citation generation.
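
To make the core idea concrete, here is a minimal sketch of ranking at a finer granularity than the encoding. It assumes a ColBERT-style late-interaction score (sum over query tokens of the maximum dot product with document tokens), which the abstract does not spell out, and it uses hand-written toy vectors in place of a real encoder. The function names `maxsim_score` and `rank_units` and the token spans are hypothetical. The point illustrated is only that the passage is encoded once, and a sentence is scored by restricting the match to that sentence's token vectors.

```python
def dot(u, v):
    """Dot product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def maxsim_score(query_vecs, doc_vecs, span=None):
    """Late-interaction score (assumed scoring rule, not taken from the paper).

    For each query token vector, take the best dot product against the
    document token vectors inside `span` (a half-open (start, end) index
    range), then sum over query tokens. With span=None the whole passage
    is scored.
    """
    start, end = span if span is not None else (0, len(doc_vecs))
    candidates = doc_vecs[start:end]
    return sum(max(dot(q, d) for d in candidates) for q in query_vecs)


def rank_units(query_vecs, doc_vecs, unit_spans):
    """Return unit indices sorted from highest to lowest score.

    The passage is encoded once (doc_vecs); each unit, such as a sentence
    or a proposition, is just a span over those same token vectors.
    """
    scored = [(maxsim_score(query_vecs, doc_vecs, s), i)
              for i, s in enumerate(unit_spans)]
    return [i for _, i in sorted(scored, key=lambda t: -t[0])]


# Toy data: one passage of five token vectors, split into two sentences.
query_vecs = [[1.0, 0.0], [0.0, 1.0]]
passage_vecs = [[0.9, 0.1], [0.2, 0.1], [0.1, 0.3], [0.8, 0.2], [0.1, 0.9]]
sentence_spans = [(0, 3), (3, 5)]  # hypothetical token boundaries

ranking = rank_units(query_vecs, passage_vecs, sentence_spans)
print(ranking)  # [1, 0]: the second sentence matches both query tokens well
```

In this sketch no separate sentence-level index is built: changing the granularity only changes the spans passed to `rank_units`, while the passage-level token vectors stay the same.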

