CL · AI · Aug 16, 2022

Parallel Hierarchical Transformer with Attention Alignment for Abstractive Multi-Document Summarization

arXiv:2208.07845v1 · 1 citation · h-index: 12
Originality: Incremental advance
AI Analysis

This work addresses the challenge of generating high-quality summaries from multiple lengthy documents. The advance is incremental, as it builds on existing Transformer methods.

The paper tackles abstractive multi-document summarization by proposing a Parallel Hierarchical Transformer with attention alignment. Compared to Transformer-based baselines, the model improves summary quality, as shown by higher ROUGE scores and human evaluations, at relatively low computational cost.

In comparison to single-document summarization, abstractive Multi-Document Summarization (MDS) brings challenges on the representation and coverage of its lengthy and linked sources. This study develops a Parallel Hierarchical Transformer (PHT) with attention alignment for MDS. By incorporating word- and paragraph-level multi-head attentions, the hierarchical architecture of PHT allows better processing of dependencies at both token and document levels. To guide the decoding towards a better coverage of the source documents, the attention-alignment mechanism is then introduced to calibrate beam search with predicted optimal attention distributions. Based on the WikiSum data, a comprehensive evaluation is conducted to test improvements on MDS by the proposed architecture. By better handling the inner- and cross-document information, results in both ROUGE and human evaluation suggest that our hierarchical model generates summaries of higher quality relative to other Transformer-based baselines at relatively low computational cost.
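The idea of calibrating beam search with a predicted attention distribution can be illustrated with a minimal sketch. The abstract does not give the paper's scoring formula, so everything below is an assumption made for illustration: the function names, the use of total variation distance as the mismatch measure, and the `weight` parameter are hypothetical and are not taken from the paper. The sketch only shows the general shape of the technique: a hypothesis whose paragraph-level attention stays close to the predicted distribution is preferred over one that covers the sources poorly.

```python
# Illustrative sketch only: not the paper's actual formulation.
# Assumption: each beam hypothesis carries its accumulated attention
# mass per source paragraph, and a separate model has predicted the
# "optimal" attention distribution over the same paragraphs.

def normalize(weights):
    """Scale non-negative weights so that they sum to 1."""
    total = sum(weights)
    if total <= 0:
        raise ValueError("weights must have a positive sum")
    return [w / total for w in weights]


def alignment_penalty(observed, predicted):
    """Total variation distance between two distributions over paragraphs.

    Returns 0.0 for identical distributions and 1.0 for disjoint ones.
    (Hypothetical choice of distance; other measures would also work.)
    """
    if len(observed) != len(predicted):
        raise ValueError("distributions must cover the same paragraphs")
    return 0.5 * sum(abs(o - p) for o, p in zip(observed, predicted))


def rescore(log_prob, paragraph_attention, predicted_attention, weight=1.0):
    """Subtract a weighted attention-mismatch penalty from the log-probability."""
    observed = normalize(paragraph_attention)
    predicted = normalize(predicted_attention)
    return log_prob - weight * alignment_penalty(observed, predicted)


def rank_beam(hypotheses, predicted_attention, weight=1.0):
    """Sort (text, log_prob, paragraph_attention) tuples, best score first."""
    return sorted(
        hypotheses,
        key=lambda h: rescore(h[1], h[2], predicted_attention, weight),
        reverse=True,
    )


if __name__ == "__main__":
    predicted = [0.4, 0.3, 0.3]
    beam = [
        ("A", -1.0, [0.9, 0.1, 0.0]),  # more likely, but attends to one paragraph
        ("B", -1.2, [0.4, 0.3, 0.3]),  # slightly less likely, matches predicted coverage
    ]
    for text, log_prob, attention in rank_beam(beam, predicted):
        print(text, rescore(log_prob, attention, predicted))
```

With `weight=0.0` the ranking falls back to plain log-probability, so the penalty weight controls how strongly coverage of the source paragraphs influences the choice among candidates.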

