IR, CL | Feb 19, 2024

Ask Optimal Questions: Aligning Large Language Models with Retriever's Preference in Conversation

Chanwoong Yoon, Gangwoo Kim, Byeongguk Jeon, Sungdong Kim, Yohan Jo, Jaewoo Kang

arXiv:2402.11827v223.831 citations | h-index: 49 | NAACL

Originality: Incremental advance

AI Analysis

This work addresses the challenge of improving conversational search for users by enhancing query reformulation, representing an incremental advance in aligning language models with retrieval systems.

The paper tackles the problem of sub-optimal query rewrites in conversational search by introducing RetPO, a framework that optimizes language models to align with retriever preferences, resulting in a model that surpasses previous state-of-the-art performance on two benchmarks.

Conversational search, unlike single-turn retrieval tasks, requires understanding the current question within a dialogue context. The common approach of rewrite-then-retrieve aims to decontextualize questions to be self-sufficient for off-the-shelf retrievers, but most existing methods produce sub-optimal query rewrites due to the limited ability to incorporate signals from the retrieval results. To overcome this limitation, we present a novel framework RetPO (Retriever's Preference Optimization), which is designed to optimize a language model (LM) for reformulating search queries in line with the preferences of the target retrieval systems. The process begins by prompting a large LM to produce various potential rewrites and then collects retrieval performance for these rewrites as the retrievers' preferences. Through the process, we construct a large-scale dataset called RF collection, containing Retrievers' Feedback on over 410K query rewrites across 12K conversations. Furthermore, we fine-tune a smaller LM on this dataset to align it with the retrievers' feedback. Our resulting model demonstrates superiority on two benchmarks, surpassing the previous state-of-the-art performance of rewrite-then-retrieve approaches.
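The feedback-collection step described in the abstract (generate several candidate rewrites, score each by retrieval performance, and treat the scores as the retriever's preferences) can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the token-overlap scorer stands in for a real retriever, the rank of the gold passage stands in for the paper's retrieval metrics, and the function names `overlap_score`, `gold_rank`, and `preference_pairs` are hypothetical, not taken from the paper.

```python
from itertools import combinations


def overlap_score(query, passage):
    """Toy relevance score: count of distinct query tokens found in the passage."""
    return len(set(query.lower().split()) & set(passage.lower().split()))


def gold_rank(query, passages, gold_id):
    """1-based rank of the gold passage under the toy scorer (ties broken by id)."""
    ranked = sorted(passages, key=lambda pid: (-overlap_score(query, passages[pid]), pid))
    return ranked.index(gold_id) + 1


def preference_pairs(rewrites, passages, gold_id):
    """Score every rewrite, then pair rewrites whose gold ranks differ."""
    ranks = {r: gold_rank(r, passages, gold_id) for r in rewrites}
    pairs = []
    for a, b in combinations(rewrites, 2):
        if ranks[a] == ranks[b]:
            continue  # no preference signal when both rewrites do equally well
        chosen, rejected = (a, b) if ranks[a] < ranks[b] else (b, a)
        pairs.append({"chosen": chosen, "rejected": rejected})
    return ranks, pairs


# Hypothetical toy corpus; "p2" is the passage that answers the question.
passages = {
    "p1": "the statue of liberty was dedicated in 1886",
    "p2": "the eiffel tower was completed in 1889 in paris",
    "p3": "paris is the capital of france",
}
# Candidate rewrites of "when was it built", as a large LM might propose them.
rewrites = [
    "when was it built",
    "when was the eiffel tower built",
    "eiffel tower completed year",
]

ranks, pairs = preference_pairs(rewrites, passages, gold_id="p2")
for pair in pairs:
    print(pair)
```

In this toy setup the undecontextualized rewrite ranks the gold passage lower than the other two, so it ends up as the rejected side of both pairs. Pairs of this shape are the kind of signal a smaller LM could be fine-tuned on; the specific training objective is not stated in the abstract and is left out of the sketch.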
