Learning-Based Single-Document Summarization with Compression and Anaphoricity Constraints
This work addresses the problem of generating coherent and concise summaries of single documents, and it represents an incremental improvement over existing methods.
The paper tackles single-document summarization by developing a discriminative model that integrates compression and anaphoricity constraints, yielding a system that outperforms prior work on ROUGE scores and on human judgments of linguistic quality.
We present a discriminative model for single-document summarization that integrally combines compression and anaphoricity constraints. Our model selects textual units to include in the summary based on a rich set of sparse features whose weights are learned on a large corpus. We allow for the deletion of content within a sentence when that deletion is licensed by compression rules; in our framework, these are implemented as dependencies between subsentential units of text. Anaphoricity constraints then improve cross-sentence coherence by guaranteeing that, for each pronoun included in the summary, the pronoun's antecedent is included as well or the pronoun is rewritten as a full mention. When trained end-to-end, our final system outperforms prior work on both ROUGE and human judgments of linguistic quality.
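To make the interaction of the two constraint types concrete, the following is a minimal sketch of constrained unit selection, not the system described above. It assumes hand-set unit scores in place of learned feature weights, a word budget on summary length, and exhaustive enumeration in place of whatever inference procedure the full model uses. The function `select_units` and its parameters (`parents`, `pronouns`, `budget`) are hypothetical names introduced only for illustration.

```python
from itertools import product


def select_units(scores, lengths, budget, parents, pronouns):
    """Pick the highest-scoring set of textual units within a word budget.

    All arguments are illustrative assumptions, not the paper's interface:
      scores   -- score of each unit (stand-in for learned feature weights)
      lengths  -- length of each unit in words
      budget   -- maximum summary length in words
      parents  -- (child, parent) pairs: child may be kept only if parent is kept
      pronouns -- (unit, antecedent_unit, extra_words) triples: if `unit` is
                  kept without `antecedent_unit`, the pronoun is rewritten as a
                  full mention, which costs `extra_words` additional words
    """
    n = len(scores)
    best_value, best_choice = 0.0, ()
    # Exhaustive search over all subsets; only feasible for tiny inputs.
    for mask in product([False, True], repeat=n):
        # Compression constraint: a dependent unit needs the unit it depends on.
        if any(mask[child] and not mask[parent] for child, parent in parents):
            continue
        length = sum(lengths[i] for i in range(n) if mask[i])
        # Anaphoricity constraint: include the antecedent or pay for a rewrite.
        for unit, antecedent_unit, extra_words in pronouns:
            if mask[unit] and not mask[antecedent_unit]:
                length += extra_words
        if length > budget:
            continue
        value = sum(scores[i] for i in range(n) if mask[i])
        if value > best_value:
            best_value = value
            best_choice = tuple(i for i in range(n) if mask[i])
    return best_choice, best_value


# Toy document: unit 1 is an optional modifier attached to unit 0, and
# unit 2 contains a pronoun whose antecedent appears in unit 0.
scores = [3.0, 1.0, 2.0, 0.5]
lengths = [5, 3, 4, 2]
parents = [(1, 0)]
pronouns = [(2, 0, 2)]

print(select_units(scores, lengths, 9, parents, pronouns))
```

In this toy setup, keeping unit 2 alone is allowed but charges two extra words for spelling out the pronoun, so the budget decides whether including the antecedent or rewriting the pronoun is the cheaper way to stay coherent.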