CL AI

Mar 12, 2023

Compressed Heterogeneous Graph for Abstractive Multi-Document Summarization

arXiv:2303.06565v10.914 citations

h-index: 36

Has Code

Originality: Incremental advance

AI Analysis

The paper addresses the problem of generating a summary from multiple documents for users who need concise information. The advance is incremental: it builds on existing encoder-decoder architectures and adds graph-based enhancements.

The authors tackle multi-document summarization by proposing HGSUM, a model that represents the semantic units of the documents in a compressed heterogeneous graph. In their experiments it outperforms state-of-the-art models on MULTI-NEWS, WCEP-100, and ARXIV.

Multi-document summarization (MDS) aims to generate a summary for a number of related documents. We propose HGSUM, an MDS model that extends an encoder-decoder architecture, to incorporate a heterogeneous graph to represent different semantic units (e.g., words and sentences) of the documents. This contrasts with existing MDS models which do not consider different edge types of graphs and as such do not capture the diversity of relationships in the documents. To preserve only key information and relationships of the documents in the heterogeneous graph, HGSUM uses graph pooling to compress the input graph. And to guide HGSUM to learn compression, we introduce an additional objective that maximizes the similarity between the compressed graph and the graph constructed from the ground-truth summary during training. HGSUM is trained end-to-end with graph similarity and standard cross-entropy objectives. Experimental results over MULTI-NEWS, WCEP-100, and ARXIV show that HGSUM outperforms state-of-the-art MDS models. The code for our model and experiments is available at: https://github.com/oaimli/HGSum.
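The training signal described in the abstract, graph compression by pooling plus a graph-similarity objective alongside cross-entropy, can be illustrated with a minimal sketch. Everything specific here is an assumption for illustration and is not taken from the HGSum code: top-k selection stands in for the graph pooling, mean-pooled node vectors with cosine similarity stand in for the graph similarity measure, the function names and the `weight` parameter are hypothetical, and node and edge types are omitted entirely.

```python
import math

def top_k_pool(node_vecs, scores, k):
    """Keep the k highest-scoring nodes (an assumed stand-in for graph pooling)."""
    keep = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)[:k]
    keep.sort()  # preserve the original node order
    return [node_vecs[i] for i in keep], keep

def mean_vec(vecs):
    """Average a list of equal-length vectors into one graph-level vector."""
    n = len(vecs)
    return [sum(col) / n for col in zip(*vecs)]

def cosine(a, b):
    """Cosine similarity between two non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def cross_entropy(token_probs):
    """Mean negative log-likelihood of the reference summary tokens."""
    return -sum(math.log(p) for p in token_probs) / len(token_probs)

def combined_loss(token_probs, compressed_vecs, summary_vecs, weight=1.0):
    """Cross-entropy plus a penalty that shrinks as the graphs become more similar."""
    sim = cosine(mean_vec(compressed_vecs), mean_vec(summary_vecs))
    return cross_entropy(token_probs) + weight * (1.0 - sim)

# Toy input graph: four nodes with 2-d features and (assumed) learned importance scores.
nodes = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [0.5, 0.5]]
scores = [0.9, 0.1, 0.8, 0.3]
pooled, kept = top_k_pool(nodes, scores, k=2)

# Toy graph built from the ground-truth summary, and decoder probabilities
# assigned to the two reference tokens.
summary_nodes = [[2.0, 1.0]]
loss = combined_loss([0.5, 0.5], pooled, summary_nodes)
```

In this toy case the pooled graph points in the same direction as the summary graph, so the similarity penalty is close to zero and the loss is dominated by the cross-entropy term. Minimizing `1 - similarity` is one simple way to express "maximize similarity" as a loss; the paper's actual formulation may differ.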
