Abstractive Summarization Improved by WordNet-based Extractive Sentences
This work addresses the problem of generating more semantically relevant summaries for natural language processing applications, but it is incremental as it builds on existing seq2seq and extractive methods.
The paper improves abstractive summarization by integrating WordNet-based extractive sentences to enhance semantic relevance. It reports competitive ROUGE scores on the CNN/Daily Mail dataset and high semantic relevance in human evaluations.
Recently, seq2seq abstractive summarization models have achieved good results on the CNN/Daily Mail dataset. Still, improving abstractive methods with extractive methods remains a promising research direction, since extractive methods can exploit a variety of efficient features to identify the important sentences in a text. In this paper, to improve the semantic relevance of abstractive summaries, we adopt a WordNet-based sentence ranking algorithm to extract the sentences that are most semantically relevant to the text. We then design a dual attentional seq2seq framework that generates summaries while taking the extracted information into account. In addition, we combine pointer-generator and coverage mechanisms to address two problems of abstractive models: out-of-vocabulary (OOV) words and duplicated words. Experiments on the CNN/Daily Mail dataset show that our models achieve ROUGE scores competitive with the state of the art. Human evaluations also show that the summaries generated by our models have high semantic relevance to the original text.
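
To make the pointer-generator and coverage mechanisms mentioned above concrete, the following is a minimal sketch of how they are commonly formulated: the final word distribution mixes a vocabulary distribution with attention-weighted copying from the source, and a coverage vector accumulates past attention to penalize re-attending to the same positions. The abstract does not give the exact equations used in this work, so the function names, argument names, and the plain-dictionary representation are illustrative assumptions rather than the authors' implementation.

```python
from collections import defaultdict


def final_distribution(p_gen, vocab_dist, attention, source_tokens):
    """Mix generation and copy probabilities (hypothetical helper).

    p_gen         -- probability of generating from the vocabulary, in [0, 1]
    vocab_dist    -- dict mapping in-vocabulary words to probabilities
    attention     -- attention weights over source positions (sums to 1)
    source_tokens -- source words, one per attention weight; may be OOV
    """
    if len(attention) != len(source_tokens):
        raise ValueError("attention and source_tokens must have equal length")
    final = defaultdict(float)
    # Generation part: scale the vocabulary distribution by p_gen.
    for word, prob in vocab_dist.items():
        final[word] += p_gen * prob
    # Copy part: add attention mass to each source word, including OOV words.
    for token, weight in zip(source_tokens, attention):
        final[token] += (1.0 - p_gen) * weight
    return dict(final)


def coverage_step(coverage, attention):
    """Return the updated coverage vector and the coverage loss for one step.

    The loss sums min(attention_i, coverage_i), so it grows when the decoder
    attends again to positions it has already attended to.
    """
    loss = sum(min(a, c) for a, c in zip(attention, coverage))
    new_coverage = [c + a for c, a in zip(coverage, attention)]
    return new_coverage, loss
```

In this sketch, an OOV source word receives probability only through the copy term, which is how a pointer-style model can emit words outside its fixed vocabulary, while the coverage loss is zero on the first decoding step and increases when attention repeats.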