Multi-View Sequence-to-Sequence Models with Conversational Structure for Abstractive Dialogue Summarization
This work addresses the problem of summarizing conversations, which is under-investigated compared with structured text, for applications in human-human and human-machine interaction.
The paper tackles abstractive dialogue summarization by proposing a multi-view sequence-to-sequence model that extracts conversational structures from different views to represent conversations and uses a multi-view decoder to generate summaries. Experiments on a large-scale corpus showed that it significantly outperformed previous state-of-the-art models in both automatic evaluations and human judgment.
Text summarization is one of the most challenging and interesting problems in NLP. Although much attention has been paid to summarizing structured text like news reports or encyclopedia articles, summarizing conversations, an essential part of human-human/machine interaction in which the most important pieces of information are scattered across utterances from different speakers, remains relatively under-investigated. This work proposes a multi-view sequence-to-sequence model that first extracts conversational structures of unstructured daily chats from different views to represent conversations and then uses a multi-view decoder to incorporate the different views when generating dialogue summaries. Experiments on a large-scale dialogue summarization corpus demonstrate that our methods significantly outperform previous state-of-the-art models in both automatic evaluations and human judgment. We also discuss specific challenges that current approaches face on this task. We have publicly released our code at https://github.com/GT-SALT/Multi-View-Seq2Seq.
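The abstract does not spell out how the multi-view decoder incorporates the different views, so the following is only an illustrative sketch and not the released implementation. It assumes that each view has already been encoded into a context vector and that the decoder weights the views by a softmax over dot-product scores against its current state. The function name `combine_views`, the view labels `view_a` and `view_b`, and the dot-product scoring are all hypothetical choices made for this example.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot(a, b):
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def combine_views(decoder_state, view_contexts):
    """Blend per-view context vectors into one vector (hypothetical scheme).

    decoder_state: list of floats, the decoder's current hidden state.
    view_contexts: dict mapping a view name to its context vector.
    Returns (weights, combined): per-view weights and the weighted sum.
    """
    names = list(view_contexts)
    # Assumption: score each view by similarity to the decoder state.
    scores = [dot(decoder_state, view_contexts[name]) for name in names]
    weights = softmax(scores)
    dim = len(decoder_state)
    combined = [
        sum(w * view_contexts[name][i] for w, name in zip(weights, names))
        for i in range(dim)
    ]
    return dict(zip(names, weights)), combined


if __name__ == "__main__":
    state = [1.0, 0.0]
    contexts = {"view_a": [2.0, 0.0], "view_b": [0.0, 2.0]}
    weights, combined = combine_views(state, contexts)
    print(weights)
    print(combined)
```

In this toy setting the view whose context vector aligns better with the decoder state receives the larger weight, which is the intuition behind letting a decoder draw on several conversational structures at each generation step.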