CL · Sep 12, 2018

Semantic WordRank: Generating Finer Single-Document Summarizations

arXiv:1809.04649v1 · 2 citations
Originality: Highly original
AI Analysis

This work addresses the need for more accurate and diverse single-document summarization, offering a novel approach that achieves competitive results against both automated and human benchmarks.

The authors tackled the problem of generating extractive single-document summaries by introducing Semantic WordRank (SWR), an unsupervised method that outperformed state-of-the-art algorithms on DUC-02 and surpassed individual human annotators on SummBank under ROUGE measures.

We present Semantic WordRank (SWR), an unsupervised method for generating an extractive summary of a single document. Built on a weighted word graph with semantic and co-occurrence edges, SWR scores sentences using an article-structure-biased PageRank algorithm with a Softplus function adjustment, and promotes topic diversity using spectral subtopic clustering under the Word Mover's Distance metric. We evaluate SWR on the DUC-02 and SummBank datasets and show that SWR produces better summaries than the state-of-the-art algorithms over DUC-02 under common ROUGE measures. We then show that, under the same measures over SummBank, SWR outperforms each of the three human annotators (a.k.a. judges) and compares favorably with the combined performance of all judges.
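To make the graph-ranking step of the abstract concrete, here is a minimal sketch of a biased PageRank over a word co-occurrence graph, followed by sentence scoring. It is not the authors' implementation. The function names, the window size, the position-based bias (earlier sentences weigh more), and the way Softplus is applied to the summed word ranks are all assumptions made for illustration. Semantic edges and the spectral clustering under Word Mover's Distance are left out entirely.

```python
import math
import re
from collections import defaultdict


def tokenize(sentence):
    """Lowercase a sentence and split it into alphabetic tokens."""
    return re.findall(r"[a-z]+", sentence.lower())


def build_cooccurrence_graph(sentences, window=2):
    """Undirected weighted graph; edge weight = co-occurrence count
    of two distinct words within `window` positions (assumed value)."""
    graph = defaultdict(lambda: defaultdict(float))
    for sentence in sentences:
        words = tokenize(sentence)
        for i, w in enumerate(words):
            for v in words[i + 1 : i + 1 + window]:
                if v != w:
                    graph[w][v] += 1.0
                    graph[v][w] += 1.0
    return graph


def position_bias(sentences):
    """Hypothetical article-structure bias: a word gets weight
    1 / (1 + index of the earliest sentence that contains it)."""
    bias = {}
    for idx, sentence in enumerate(sentences):
        for w in tokenize(sentence):
            bias[w] = max(bias.get(w, 0.0), 1.0 / (1 + idx))
    return bias


def biased_pagerank(graph, bias, damping=0.85, iterations=50):
    """Power iteration where the teleport distribution follows `bias`."""
    nodes = list(graph)
    total = sum(bias.get(n, 0.0) for n in nodes)
    teleport = {n: bias.get(n, 0.0) / total for n in nodes}
    rank = dict(teleport)
    for _ in range(iterations):
        new_rank = {n: (1 - damping) * teleport[n] for n in nodes}
        for u in nodes:
            out_weight = sum(graph[u].values())
            for v, w in graph[u].items():
                new_rank[v] += damping * rank[u] * w / out_weight
        rank = new_rank
    return rank


def softplus(x):
    """Softplus: log(1 + e^x)."""
    return math.log1p(math.exp(x))


def top_sentences(sentences, k=1):
    """Return the k highest-scoring sentences in document order.
    Scoring rule (assumed): Softplus of the summed ranks of a
    sentence's distinct words."""
    graph = build_cooccurrence_graph(sentences)
    rank = biased_pagerank(graph, position_bias(sentences))
    scores = [
        softplus(sum(rank.get(w, 0.0) for w in set(tokenize(s))))
        for s in sentences
    ]
    best = sorted(range(len(sentences)), key=lambda i: -scores[i])[:k]
    return [sentences[i] for i in sorted(best)]
```

Because Softplus is monotone, applying it to a single summed score does not change the ordering of sentences. It appears here only to show where such an adjustment could sit; the abstract does not say how SWR applies it.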

