Abstractive Multi-Document Summarization via Phrase Selection and Merging
This work addresses the problem of creating concise summaries from multiple documents for users who need quick insights, though its contribution is an incremental improvement to phrase-based methods.
The paper tackles multi-document summarization by generating new sentences from selected and merged phrases, and it outperforms state-of-the-art models on the TAC 2011 benchmark under automated pyramid evaluation.
We propose an abstraction-based multi-document summarization framework that can construct new sentences by exploring syntactic units more fine-grained than sentences, namely noun and verb phrases. Unlike existing abstraction-based approaches, our method first constructs a pool of concepts and facts represented by phrases from the input documents. New sentences are then generated by selecting and merging informative phrases to maximize the salience of the phrases while satisfying the sentence construction constraints. We employ integer linear optimization to perform phrase selection and merging simultaneously in order to achieve a globally optimal solution for a summary. Experimental results on the benchmark data set TAC 2011 show that our framework outperforms the state-of-the-art models under the automated pyramid evaluation metric and achieves reasonably good results on manual linguistic quality evaluation.
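The joint selection-and-merging idea can be illustrated with a toy sketch. This is not the paper's formulation: the phrases, salience scores, compatibility pairs, and the function names `select_and_merge` and `render` are all hypothetical, and exhaustive enumeration stands in for an integer linear programming solver, which is only feasible at this tiny scale. The sketch chooses compatible noun-phrase/verb-phrase pairs that maximize total salience under a word budget, with each verb phrase attached to exactly one subject.

```python
from itertools import combinations

# Hypothetical toy data (not from the paper): phrase -> (salience, length in words).
noun_phrases = {"the committee": (3.0, 2), "the mayor": (2.0, 2)}
verb_phrases = {
    "approved the budget": (4.0, 3),
    "rejected the appeal": (2.5, 3),
    "visited the school": (1.0, 3),
}
# Assumed compatibility relation: (NP, VP) pairs that may be merged into a sentence.
compatible = [
    ("the committee", "approved the budget"),
    ("the committee", "rejected the appeal"),
    ("the mayor", "visited the school"),
    ("the mayor", "approved the budget"),
]


def select_and_merge(max_words):
    """Pick the set of (NP, VP) pairs with maximal salience within a word budget.

    Brute force over all subsets of compatible pairs; an ILP solver would
    replace this loop for realistic input sizes.
    """
    best_score, best_pairs = 0.0, ()
    for k in range(1, len(compatible) + 1):
        for pairs in combinations(compatible, k):
            vps = [vp for _, vp in pairs]
            if len(set(vps)) != len(vps):
                continue  # constraint: a verb phrase gets only one subject
            nps = {np for np, _ in pairs}
            # Each selected phrase is counted once, even if an NP has several VPs.
            length = sum(noun_phrases[n][1] for n in nps)
            length += sum(verb_phrases[v][1] for v in vps)
            if length > max_words:
                continue  # constraint: summary length budget
            score = sum(noun_phrases[n][0] for n in nps)
            score += sum(verb_phrases[v][0] for v in vps)
            if score > best_score:
                best_score, best_pairs = score, pairs
    return best_score, best_pairs


def render(pairs):
    """Merge verb phrases that share a subject into one sentence per noun phrase."""
    grouped = {}
    for np, vp in pairs:
        grouped.setdefault(np, []).append(vp)
    return [f"{np} {' and '.join(vps)}." for np, vps in grouped.items()]


score, pairs = select_and_merge(max_words=8)
summary = render(pairs)
```

With an eight-word budget, the two verb phrases sharing "the committee" fit together because the subject is counted once, which is the kind of merging the abstract describes.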