Learning-by-Narrating: Narrative Pre-Training for Zero-Shot Dialogue Comprehension
This work addresses the challenge of building models that can comprehend dialogues without task-specific training. The contribution is incremental, as it adapts existing pre-training methods to a new data source.
The paper tackles the problem of zero-shot dialogue comprehension by introducing a narrative-guided pre-training strategy that learns from automatically aligned movie subtitles and synopses, achieving superior zero-shot performance and stronger fine-grained capabilities on four dialogue-based tasks.
Comprehending a dialogue requires a model to capture diverse kinds of key information in the utterances, which may be scattered across different turns of the conversation or only implied. Dialogue comprehension therefore requires diverse capabilities such as paraphrasing, summarizing, and commonsense reasoning. To pre-train a zero-shot dialogue comprehension model, we develop a novel narrative-guided pre-training strategy that learns by narrating the key information from a dialogue input. However, no dialogue-narrative parallel corpus for such a pre-training strategy is currently available. We therefore first construct a dialogue-narrative parallel corpus by automatically aligning movie subtitles with their synopses. We then pre-train a BART model on the data and evaluate its performance on four dialogue-based tasks that require comprehension. Experimental results show that our model not only achieves superior zero-shot performance but also exhibits stronger fine-grained dialogue comprehension capabilities. The data and code are available at https://github.com/zhaochaocs/Diana
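The abstract does not specify how subtitles and synopses are aligned, so the following is only a minimal sketch of one possible approach, not the authors' method. It assumes that each synopsis sentence can be paired with the subtitle segment sharing the most vocabulary with it, scored by bag-of-words cosine similarity. The function names (`tokens`, `cosine`, `align`) and the `threshold` value are hypothetical choices made for illustration.

```python
import re
from collections import Counter
from math import sqrt


def tokens(text):
    """Lowercase word tokens; a deliberately simple stand-in for real preprocessing."""
    return re.findall(r"[a-z']+", text.lower())


def cosine(a, b):
    """Cosine similarity between two bag-of-words Counters."""
    dot = sum(a[w] * b[w] for w in a.keys() & b.keys())
    norm = sqrt(sum(v * v for v in a.values())) * sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def align(subtitle_segments, synopsis_sentences, threshold=0.1):
    """Pair each synopsis sentence with its most similar subtitle segment.

    subtitle_segments: list of segments, each a list of utterance strings.
    synopsis_sentences: list of narrative sentences.
    Returns (dialogue, narrative) pairs. Sentences whose best score falls
    below the (hypothetical) threshold are dropped as unalignable.
    """
    seg_bags = [Counter(tokens(" ".join(seg))) for seg in subtitle_segments]
    if not seg_bags:
        return []
    pairs = []
    for sent in synopsis_sentences:
        bag = Counter(tokens(sent))
        scores = [cosine(bag, seg_bag) for seg_bag in seg_bags]
        best = max(range(len(scores)), key=scores.__getitem__)
        if scores[best] >= threshold:
            pairs.append((subtitle_segments[best], sent))
    return pairs
```

Under these assumptions, each resulting pair would serve as one training example for a sequence-to-sequence model such as BART, with the dialogue segment as the input and the narrative sentence as the target. A practical pipeline would likely need a stronger similarity measure, since synopses paraphrase rather than repeat the dialogue.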