CL · Apr 27, 2023

q2d: Turning Questions into Dialogs to Teach Models How to Search

arXiv:2304.14318v2 · 136 citations · h-index: 26
Originality: Incremental advance
AI Analysis

This work addresses the resource-intensive challenge of obtaining training data for search query generation in dialog systems, offering a scalable and controllable solution for improving grounding in information-seeking conversations.

The paper tackles the problem of generating training data for teaching language models to issue search queries in information-seeking dialogs. It proposes q2d, an automatic pipeline that uses a large language model to convert questions into dialogs. On the QReCC dataset, query generation models trained on this synthetic data achieve 90%–97% of the performance of models trained on human-generated data, and the pipeline also generates training data for new domains that have no existing dialog data.

One of the exciting capabilities of recent language models for dialog is their ability to independently search for relevant information to ground a given dialog response. However, obtaining training data to teach models how to issue search queries is time- and resource-consuming. In this work, we propose q2d: an automatic data generation pipeline that generates information-seeking dialogs from questions. We prompt a large language model (PaLM) to create conversational versions of question answering datasets, and use the resulting dialogs to improve query generation models that communicate with external search APIs to ground dialog responses. Unlike previous approaches, which relied on human-written dialogs with search queries, our method allows us to automatically generate query-based grounded dialogs with better control and scale. Our experiments demonstrate that: (1) for query generation on the QReCC dataset, models trained on our synthetically generated data achieve 90%–97% of the performance of models trained on the human-generated data; (2) we can successfully generate data for training dialog models in new domains without any existing dialog data, as demonstrated on the multi-hop MuSiQue and Bamboogle QA datasets; (3) we perform a thorough analysis of the generated dialogs, showing that humans find them of high quality and struggle to distinguish them from human-written dialogs.
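The core idea in the abstract, turning a question into a dialog whose target search query is that question, can be illustrated with a minimal sketch. Everything below is an assumption made for illustration: the prompt wording, the `QueryExample` type, the helper names, and the `fake_llm` stand-in are hypothetical and are not the paper's prompts, code, or the PaLM API.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class QueryExample:
    """One training pair for a query generation model (hypothetical type)."""
    dialog: List[str]  # dialog turns, used as the model input
    query: str         # search query the model should learn to emit


def build_prompt(question: str, answer: str) -> str:
    # Hypothetical instruction; the paper's actual prompts are not shown here.
    return (
        "Rewrite the question as a short information-seeking dialog.\n"
        "The final user turn should require the question as a search query.\n"
        f"Question: {question}\n"
        f"Answer: {answer}\n"
        "Dialog:"
    )


def question_to_example(
    question: str, answer: str, generate: Callable[[str], str]
) -> QueryExample:
    # `generate` is any text-in, text-out language model call (assumed interface).
    raw = generate(build_prompt(question, answer))
    turns = [line.strip() for line in raw.splitlines() if line.strip()]
    # The original question serves as the target query for the generated dialog.
    return QueryExample(dialog=turns, query=question)


def fake_llm(prompt: str) -> str:
    # Stand-in for a real model so the sketch runs offline.
    return (
        "User: I just watched a performance of Hamlet.\n"
        "Assistant: Nice, what did you think?\n"
        "User: I loved it. Who wrote it?"
    )


example = question_to_example("Who wrote Hamlet?", "William Shakespeare", fake_llm)
print(example.dialog)
print(example.query)
```

In this sketch the final user turn ("Who wrote it?") is underspecified on its own, so a model trained on such pairs would have to use the dialog history to produce the self-contained query.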
