Semi-Supervised QA with Generative Domain-Adaptive Nets
This paper addresses the challenge of limited labeled data for question answering models, though the contribution appears incremental, since it builds on existing semi-supervised and domain adaptation techniques.
The paper tackles semi-supervised question answering with a novel training framework: a generative model creates questions from unlabeled text, and these are combined with human-generated questions to train the question answering model. The authors report substantial performance improvement from the unlabeled text.
We study the problem of semi-supervised question answering: utilizing unlabeled text to boost the performance of question answering models. We propose a novel training framework, the Generative Domain-Adaptive Nets. In this framework, we train a generative model to generate questions based on the unlabeled text, and combine model-generated questions with human-generated questions for training question answering models. We develop novel domain adaptation algorithms, based on reinforcement learning, to alleviate the discrepancy between the model-generated data distribution and the human-generated data distribution. Experiments show that our proposed framework obtains substantial improvement from unlabeled text.
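The data-combination step described above can be sketched roughly as follows. This is only an illustration under stated assumptions, not the authors' implementation: the names `QAExample`, `build_training_set`, and the `domain` field are hypothetical, the two toy functions stand in for a trained answer selector and question generator, and the reinforcement-learning-based domain adaptation is not shown.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple


@dataclass(frozen=True)
class QAExample:
    paragraph: str
    question: str
    answer: str
    domain: str  # hypothetical tag: "human" or "generated"


def build_training_set(
    labeled: List[Tuple[str, str, str]],
    unlabeled_paragraphs: List[str],
    pick_answer: Callable[[str], str],
    generate_question: Callable[[str, str], str],
) -> List[QAExample]:
    """Combine human-written and model-generated questions into one set.

    Each example keeps a tag recording where its question came from, so a
    downstream QA model could treat the two distributions differently.
    """
    human = [QAExample(p, q, a, "human") for p, q, a in labeled]
    generated = []
    for paragraph in unlabeled_paragraphs:
        answer = pick_answer(paragraph)
        question = generate_question(paragraph, answer)
        generated.append(QAExample(paragraph, question, answer, "generated"))
    return human + generated


# Toy stand-ins (assumptions for illustration only): a real system would use
# a learned generative model here.
def toy_pick_answer(paragraph: str) -> str:
    # Take the last word of the paragraph as the candidate answer.
    return paragraph.split()[-1].strip(".")


def toy_generate_question(paragraph: str, answer: str) -> str:
    # Replace the answer with "what" to form a cloze-style question.
    return paragraph.replace(answer, "what", 1).rstrip(".") + "?"
```

Calling `build_training_set` with one labeled triple and one unlabeled paragraph yields two examples, the first tagged `"human"` and the second tagged `"generated"`. Keeping the origin tag on each example is one simple way to let later training steps account for the gap between the two data distributions.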