CL · AI · IR · Oct 26, 2024

RARe: Retrieval Augmented Retrieval with In-Context Examples

Princeton
arXiv:2410.20088v1 · 2 citations · h-index: 4
Originality: Incremental advance
AI Analysis

The paper addresses a bottleneck in retrieval systems by adapting in-context learning to embedding models. It is an incremental improvement with potential for broader application.

The paper aims to improve embedding model performance in retrieval tasks by enabling these models to use in-context examples, as decoder-only language models do. It reports performance gains of up to +2.72% nDCG across various open-domain retrieval datasets.
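For readers unfamiliar with the metric behind the reported gain, the following is a minimal sketch of nDCG at a cutoff k. It assumes the linear-gain form of DCG and that the input list holds the graded relevance of every judged document in the order the retriever ranked them. The function names and the default cutoff are illustrative choices, not taken from the paper.

```python
import math


def dcg(relevances):
    """Discounted cumulative gain: each relevance is discounted by log2(rank + 1)."""
    # enumerate() is 0-based, so rank + 2 gives log2(2) for the top position.
    return sum(rel / math.log2(rank + 2) for rank, rel in enumerate(relevances))


def ndcg_at_k(ranked_relevances, k=10):
    """nDCG@k: DCG of the actual ranking divided by DCG of the ideal ranking.

    Assumes `ranked_relevances` covers all judged documents for the query
    (an assumption of this sketch).
    """
    ideal = sorted(ranked_relevances, reverse=True)
    ideal_dcg = dcg(ideal[:k])
    if ideal_dcg == 0:
        return 0.0
    return dcg(ranked_relevances[:k]) / ideal_dcg


# A ranking that places relevant documents first and third.
score = ndcg_at_k([1, 0, 1], k=3)
print(round(score, 4))
```

A perfect ranking scores 1.0, so a gain of a few nDCG points means relevant documents moved measurably closer to the top of the ranked list.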

We investigate whether in-context examples, widely used in decoder-only language models (LLMs), can improve embedding model performance in retrieval tasks. Unlike in LLMs, naively prepending in-context examples (query-document pairs) to the target query at inference time does not work out of the box. We introduce a simple approach to enable retrievers to use in-context examples. Our approach, RARe, finetunes a pre-trained model with in-context examples whose query is semantically similar to the target query. This can be applied to adapt various base architectures (i.e., decoder-only language models, retriever models) and consistently achieves performance gains of up to +2.72% nDCG across various open-domain retrieval datasets (BeIR, RAR-b). In particular, we find RARe exhibits stronger out-of-domain generalization compared to models using queries without in-context examples, similar to what is seen for in-context learning in LLMs. We further provide analysis on the design choices of in-context example augmentation and lay the foundation for future work in this space.
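The input construction described in the abstract can be sketched roughly as follows: for a target query, pick the query-document pairs whose query is most similar to it and prepend them. This is a hedged illustration only. The bag-of-words cosine similarity, the `Query:`/`Document:` template, and all function names are assumptions made for this sketch, not the authors' implementation. It also covers only the augmentation step; per the abstract, the model must additionally be finetuned on such inputs, since prepending examples at inference time alone does not work out of the box.

```python
import math
import re
from collections import Counter


def bow(text):
    """Toy bag-of-words vector; a stand-in for a real similarity model."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def nearest_examples(query, pool, k=2):
    """Return the k (query, document) pairs whose query is most similar to `query`."""
    q = bow(query)
    ranked = sorted(pool, key=lambda pair: cosine(q, bow(pair[0])), reverse=True)
    return ranked[:k]


def augment_query(query, pool, k=2):
    """Prepend the retrieved pairs to the target query (hypothetical template)."""
    parts = [f"Query: {q}\nDocument: {d}" for q, d in nearest_examples(query, pool, k)]
    parts.append(f"Query: {query}")
    return "\n\n".join(parts)


# Hypothetical pool of in-context examples.
pool = [
    ("what causes ocean tides", "Tides are caused by the gravitational pull of the moon and sun."),
    ("how do vaccines work", "Vaccines train the immune system to recognize a pathogen."),
    ("why is the sky blue", "Air molecules scatter short wavelengths of sunlight more strongly."),
]

augmented = augment_query("what causes high tides", pool, k=1)
print(augmented)
```

In this sketch the augmented string would be passed to the embedding model in place of the bare query, while documents are encoded unchanged.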

Code Implementations: 1 repo