IR | CL | Feb 19, 2024

Ask Optimal Questions: Aligning Large Language Models with Retriever's Preference in Conversation

arXiv:2402.11827v2 | 31 citations | h-index: 19 | NAACL
Originality: Incremental advance
AI Analysis

This work addresses the challenge of improving conversational search for users by enhancing query reformulation, representing an incremental advance in aligning language models with retrieval systems.

The paper tackles the problem of sub-optimal query rewrites in conversational search by introducing RetPO, a framework that optimizes language models to align with retriever preferences, resulting in a model that surpasses previous state-of-the-art performance on two benchmarks.

Conversational search, unlike single-turn retrieval tasks, requires understanding the current question within a dialogue context. The common approach of rewrite-then-retrieve aims to decontextualize questions to be self-sufficient for off-the-shelf retrievers, but most existing methods produce sub-optimal query rewrites due to the limited ability to incorporate signals from the retrieval results. To overcome this limitation, we present a novel framework RetPO (Retriever's Preference Optimization), which is designed to optimize a language model (LM) for reformulating search queries in line with the preferences of the target retrieval systems. The process begins by prompting a large LM to produce various potential rewrites and then collects retrieval performance for these rewrites as the retrievers' preferences. Through the process, we construct a large-scale dataset called RF collection, containing Retrievers' Feedback on over 410K query rewrites across 12K conversations. Furthermore, we fine-tune a smaller LM on this dataset to align it with the retrievers' feedback. Our resulting model demonstrates superiority on two benchmarks, surpassing the previous state-of-the-art performance of rewrite-then-retrieve approaches.
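The data-construction step described above (score each candidate rewrite by how well the retriever does with it, then treat the better-scoring rewrite as preferred) can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the toy token-overlap scorer stands in for a real retriever, the hard-coded `rewrites` stand in for outputs of a large LM, and the rank of the gold passage stands in for whatever retrieval metrics the authors actually collect. The function names `overlap_score`, `gold_rank`, and `build_preference_pairs` are hypothetical and do not come from the paper.

```python
from itertools import combinations


def overlap_score(query: str, passage: str) -> int:
    """Toy retriever score (assumption): count of distinct query tokens in the passage."""
    return len(set(query.lower().split()) & set(passage.lower().split()))


def gold_rank(query: str, passages: dict, gold_id: str) -> int:
    """1-based rank of the gold passage; ties are broken by passage id."""
    ranked = sorted(passages, key=lambda pid: (-overlap_score(query, passages[pid]), pid))
    return ranked.index(gold_id) + 1


def build_preference_pairs(rewrites: list, passages: dict, gold_id: str) -> list:
    """Turn retrieval feedback into (chosen, rejected) pairs: lower gold rank is preferred."""
    scored = [(rw, gold_rank(rw, passages, gold_id)) for rw in rewrites]
    pairs = []
    for (rw_a, rank_a), (rw_b, rank_b) in combinations(scored, 2):
        if rank_a == rank_b:
            continue  # no preference signal when both rewrites retrieve equally well
        chosen, rejected = (rw_a, rw_b) if rank_a < rank_b else (rw_b, rw_a)
        pairs.append({"chosen": chosen, "rejected": rejected})
    return pairs


# Hypothetical toy corpus; "p1" is the gold passage for the user's question.
passages = {
    "p1": "the eiffel tower was completed in 1889 in paris",
    "p2": "the statue of liberty was dedicated in 1886",
    "p3": "it was completed long ago",
}

# Hypothetical candidate rewrites, standing in for outputs sampled from a large LM.
rewrites = [
    "when was it completed",                # still context-dependent
    "when was the eiffel tower completed",  # decontextualized
]

pairs = build_preference_pairs(rewrites, passages, gold_id="p1")
print(pairs)
```

In this toy setup the decontextualized rewrite ranks the gold passage first while the context-dependent one ranks it second, so the former becomes the `chosen` side of the pair. Pairs in this shape are the kind of input a preference-optimization fine-tuning step could consume; the fine-tuning itself is not sketched here.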

Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.
