An Unsupervised Semantic Sentence Ranking Scheme for Text Documents
This work addresses the need for efficient sentence ranking in text summarization, offering an incremental improvement over existing methods.
The paper tackles the problem of automatically ranking sentences by importance in a single document, presenting Semantic SentenceRank (SSR), an unsupervised scheme that outperforms individual judge rankings on SummBank benchmarks.
This paper presents Semantic SentenceRank (SSR), an unsupervised scheme for automatically ranking sentences in a single document according to their relative importance. In particular, SSR extracts essential words and phrases from a text document, and uses semantic measures to construct, respectively, a semantic phrase graph over phrases and words, and a semantic sentence graph over sentences. It applies two variants of article-structure-biased PageRank to score phrases and words on the first graph and sentences on the second graph. It then combines these scores to generate the final score for each sentence. Finally, SSR solves a multi-objective optimization problem for ranking sentences based on their final scores and topic diversity through semantic subtopic clustering. An implementation of SSR that runs in quadratic time is presented. On the SummBank benchmarks, it outperforms each individual judge's ranking and compares favorably with the combined ranking of all judges.
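To illustrate the graph-scoring step described above, the following is a minimal sketch of a biased (personalized) PageRank over a sentence graph. It is not the SSR implementation: the abstract does not specify the semantic measures or the exact form of the article-structure bias, so this sketch assumes word-overlap (Jaccard) similarity as a stand-in for a semantic measure and a simple 1/(position) prior as a stand-in for article structure. The function names `jaccard`, `sentence_graph`, `position_bias`, and `biased_pagerank` are hypothetical, and the phrase graph, score combination, and diversity optimization steps are omitted.

```python
from typing import List


def jaccard(a: str, b: str) -> float:
    """Word-overlap similarity; an assumed stand-in for a semantic measure."""
    sa, sb = set(a.lower().split()), set(b.lower().split())
    union = sa | sb
    return len(sa & sb) / len(union) if union else 0.0


def sentence_graph(sentences: List[str]) -> List[List[float]]:
    """Dense weighted graph over sentences, with no self-loops."""
    n = len(sentences)
    return [[0.0 if i == j else jaccard(sentences[i], sentences[j])
             for j in range(n)] for i in range(n)]


def position_bias(n: int) -> List[float]:
    """Assumed structural prior: earlier sentences get more weight."""
    return [1.0 / (i + 1) for i in range(n)]


def biased_pagerank(weights: List[List[float]], bias: List[float],
                    damping: float = 0.85, iters: int = 100,
                    tol: float = 1e-10) -> List[float]:
    """Power iteration for PageRank with a non-uniform teleport prior."""
    n = len(weights)
    total = sum(bias)
    prior = [b / total for b in bias]          # normalize bias to sum to 1
    out_sum = [sum(row) for row in weights]    # total outgoing weight per node
    scores = prior[:]
    for _ in range(iters):
        new = [(1.0 - damping) * p for p in prior]
        for j in range(n):
            for i in range(n):
                if out_sum[j] == 0.0:
                    # Dangling node: spread its mass according to the prior.
                    new[i] += damping * scores[j] * prior[i]
                else:
                    new[i] += damping * scores[j] * weights[j][i] / out_sum[j]
        delta = sum(abs(x - y) for x, y in zip(new, scores))
        scores = new
        if delta < tol:
            break
    return scores


if __name__ == "__main__":
    doc = [
        "SSR ranks sentences in a single document.",
        "It builds a graph over sentences.",
        "PageRank scores sentences on the graph.",
    ]
    ranks = biased_pagerank(sentence_graph(doc), position_bias(len(doc)))
    for score, sentence in sorted(zip(ranks, doc), reverse=True):
        print(f"{score:.3f}  {sentence}")
```

Each power-iteration step in this dense form costs time quadratic in the number of sentences. The scores always sum to 1, so they can be combined with other normalized scores by a weighted sum, though how SSR combines its phrase and sentence scores is not stated here.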