Conversational QA Dataset Generation with Answer Revision
This work addresses the need for scalable, accurate dataset generation for conversational AI. Its contribution is incremental: it builds on existing methods and focuses on answer revision.
The paper tackles the generation of high-quality conversational question-answering datasets. Its framework extracts question-worthy phrases, generates questions, and then revises the answers so that they exactly match those questions; the authors report significant improvements in synthetic data quality and effective domain adaptation.
Conversational question-answer generation is the task of automatically generating a large-scale conversational question answering dataset from input passages. In this paper, we introduce a novel framework that extracts question-worthy phrases from a passage and then generates corresponding questions conditioned on the preceding conversation turns. In particular, our framework revises the extracted answers after generating questions so that the answers exactly match their paired questions. Experimental results show that our simple answer revision approach leads to significant improvement in the quality of synthetic data. Moreover, we show that our framework can be effectively utilized for domain adaptation of conversational question answering.
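The three stages described above (phrase extraction, history-aware question generation, answer revision) can be sketched as a simple loop. This is only a minimal illustration of the control flow, not the authors' implementation: the names `generate_conversation`, `toy_extract`, `toy_generate`, and `toy_revise` are hypothetical, and the toy components are regex stand-ins for what would presumably be trained models.

```python
import re
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Turn:
    question: str
    answer: str


def generate_conversation(
    passage: str,
    extract: Callable[[str], List[str]],
    generate: Callable[[str, str, List[Turn]], str],
    revise: Callable[[str, str, str], str],
) -> List[Turn]:
    """Run extract -> generate -> revise for each candidate phrase."""
    history: List[Turn] = []
    for phrase in extract(passage):
        # The question is generated with access to the earlier turns.
        question = generate(passage, phrase, history)
        # The answer is revised only after the question is known.
        answer = revise(passage, question, phrase)
        history.append(Turn(question, answer))
    return history


# Toy stand-ins (assumptions for illustration only).
def toy_extract(passage: str) -> List[str]:
    # Deliberately over-long spans such as "in 1969".
    return re.findall(r"in \d{4}", passage)


def toy_generate(passage: str, phrase: str, history: List[Turn]) -> str:
    # Uses the history only to number the turn.
    return f"Q{len(history) + 1}: Which year is mentioned?"


def toy_revise(passage: str, question: str, phrase: str) -> str:
    # Drop the leading preposition so the answer fits a "which year" question.
    revised = re.sub(r"^in\s+", "", phrase)
    return revised if revised in passage else phrase


PASSAGE = "Apollo 11 landed on the Moon in 1969. Apollo 17 flew in 1972."
conversation = generate_conversation(PASSAGE, toy_extract, toy_generate, toy_revise)
for turn in conversation:
    print(turn.question, "->", turn.answer)
```

Passing the three stages in as callables keeps the loop independent of any particular model, so a learned extractor, generator, or reviser could be swapped in without changing the control flow.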