CLAI Nov 9, 2023

Dialogizer: Context-aware Conversational-QA Dataset Generation from Textual Sources

arXiv:2311.07589v1136 citations, h-index: 13
Originality: Incremental advance
AI Analysis

This work addresses data scarcity for researchers and practitioners in Conversational QA, though it is an incremental advance that builds on existing dialog inpainting methods.

The authors tackle data scarcity in Conversational Question Answering (ConvQA) by proposing Dialogizer, a framework that automatically generates high-quality ConvQA datasets from textual sources. Automatic and human evaluations indicate that the resulting datasets have higher contextual relevance than those produced by baseline methods.

To address the data scarcity issue in Conversational question answering (ConvQA), a dialog inpainting method, which utilizes documents to generate ConvQA datasets, has been proposed. However, the original dialog inpainting model is trained solely on the dialog reconstruction task, resulting in the generation of questions with low contextual relevance due to insufficient learning of question-answer alignment. To overcome this limitation, we propose a novel framework called Dialogizer, which has the capability to automatically generate ConvQA datasets with high contextual relevance from textual sources. The framework incorporates two training tasks: question-answer matching (QAM) and topic-aware dialog generation (TDG). Moreover, re-ranking is conducted during the inference phase based on the contextual relevance of the generated questions. Using our framework, we produce four ConvQA datasets by utilizing documents from multiple domains as the primary source. Through automatic evaluation using diverse metrics, as well as human evaluation, we validate that our proposed framework exhibits the ability to generate datasets of higher quality compared to the baseline dialog inpainting model.
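
The inference-time re-ranking mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not specify how contextual relevance is scored, so the scorer below is a toy token-overlap stand-in and not the paper's model. The names `token_overlap_score` and `rerank_questions` are hypothetical and chosen only for this illustration.

```python
from typing import Callable, List, Tuple


def token_overlap_score(question: str, answer: str) -> float:
    """Toy relevance score (an assumption, not the paper's scorer):
    Jaccard overlap between lowercase whitespace tokens."""
    q_tokens = set(question.lower().split())
    a_tokens = set(answer.lower().split())
    if not q_tokens or not a_tokens:
        return 0.0
    return len(q_tokens & a_tokens) / len(q_tokens | a_tokens)


def rerank_questions(
    candidates: List[str],
    answer: str,
    scorer: Callable[[str, str], float] = token_overlap_score,
) -> List[Tuple[str, float]]:
    """Score each candidate question against the answer passage and
    return (question, score) pairs sorted from most to least relevant."""
    scored = [(question, scorer(question, answer)) for question in candidates]
    # sorted() is stable, so ties keep their original generation order.
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


if __name__ == "__main__":
    answer = "The Eiffel Tower is located in Paris."
    candidates = [
        "What is the weather today?",
        "Where is the Eiffel Tower located?",
    ]
    for question, score in rerank_questions(candidates, answer):
        print(f"{score:.3f}  {question}")
```

In a real pipeline the `scorer` argument would presumably be replaced by a learned model of question-answer relevance; passing it as a callable keeps the ranking logic independent of that choice.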
