IR · AI · May 8, 2012

Document summarization using positive pointwise mutual information

arXiv:1205.1638v1 · 30 citations
Originality: Synthesis-oriented
AI Analysis

This is an incremental improvement for document summarization tasks, potentially aiding in processing large documents more effectively.

The paper tackles document summarization by using Positive Pointwise Mutual Information to weight a Term-Sentence-Matrix for identifying significant sentences, and reports that this method outperforms most existing methods for large documents.

The success of document summarization depends on how well the method identifies the significant sentences in a document. The collection of unique words characterizes the major signature of the document and forms the basis for the Term-Sentence-Matrix (TSM). Positive Pointwise Mutual Information, which works well for measuring semantic similarity in the Term-Sentence-Matrix, is used in our method to assign a weight to each entry in the TSM. The Sentence-Rank-Matrix generated from this weighted TSM is then used to extract a summary from the document. Our experiments show that this method outperforms most existing methods in producing summaries from large documents.
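The pipeline described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. The abstract does not say how the Sentence-Rank-Matrix is derived, so the scoring step here (summing each sentence's PPMI weights) is an assumption, as are the tokenizer and the names `tokenize`, `build_tsm`, `ppmi_weight`, and `summarize`. The PPMI step uses the standard definition, max(0, log2(p(t, s) / (p(t) p(s)))), estimated from raw term counts.

```python
import math
import re
from collections import Counter


def tokenize(sentence):
    """Lowercase a sentence and return its alphabetic word tokens (assumed tokenizer)."""
    return re.findall(r"[a-z]+", sentence.lower())


def build_tsm(sentences):
    """Build a Term-Sentence-Matrix: counts[i][j] = occurrences of term i in sentence j."""
    per_sentence = [Counter(tokenize(s)) for s in sentences]
    vocab = sorted(set().union(*per_sentence))
    counts = [[c[term] for c in per_sentence] for term in vocab]
    return vocab, counts


def ppmi_weight(counts):
    """Replace each count with its Positive Pointwise Mutual Information."""
    total = sum(sum(row) for row in counts)
    row_sums = [sum(row) for row in counts]        # term totals
    col_sums = [sum(col) for col in zip(*counts)]  # sentence totals
    weighted = []
    for i, row in enumerate(counts):
        new_row = []
        for j, n in enumerate(row):
            if n == 0:
                new_row.append(0.0)  # PMI is undefined for zero counts; PPMI uses 0
            else:
                pmi = math.log2(n * total / (row_sums[i] * col_sums[j]))
                new_row.append(max(0.0, pmi))
        weighted.append(new_row)
    return weighted


def summarize(sentences, k=2):
    """Return the k highest-scoring sentences in their original order.

    Assumption: a sentence's score is the sum of its PPMI weights. The paper's
    Sentence-Rank-Matrix may be computed differently.
    """
    _, counts = build_tsm(sentences)
    weights = ppmi_weight(counts)
    scores = [sum(col) for col in zip(*weights)]
    top = sorted(range(len(scores)), key=lambda j: scores[j], reverse=True)[:k]
    return [sentences[j] for j in sorted(top)]


if __name__ == "__main__":
    doc = [
        "The cat sat on the mat.",
        "Dogs bark loudly at night.",
        "The cat sat again.",
    ]
    print(summarize(doc, k=2))
```

Summing PPMI weights tends to favor sentences that contain many terms that are rare elsewhere in the document. A length normalization or a different ranking step could be substituted without changing the weighting stage.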

