CL · IR · LG · Dec 21, 2014

Extraction of Salient Sentences from Labelled Documents

arXiv:1412.6815v2 · 145 citations
Originality: Incremental advance
AI Analysis

This work addresses the need for efficient and automated sentence extraction in document analysis, though it appears incremental as it builds on existing methods.

The paper tackles the problem of extracting topic-relevant sentences from labeled documents by developing a hierarchical convolutional document model that enables introspection of document structure, and it introduces a scalable evaluation technique to avoid human annotation.

We present a hierarchical convolutional document model with an architecture designed to support introspection of the document structure. Using this model, we show how to use visualisation techniques from the computer vision literature to identify and extract topic-relevant sentences. We also introduce a new scalable evaluation technique for automatic sentence extraction systems that avoids the need for time consuming human annotation of validation data.
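The idea of scoring sentences by how strongly they influence a document-level prediction can be illustrated with a minimal gradient-saliency sketch. This is not the paper's architecture: the toy model below replaces the hierarchical convolutional layers with a single tanh projection per sentence followed by mean pooling and a linear class score. The names `sentence_saliency`, `extract_salient`, `W`, and `v` are hypothetical and chosen for illustration only.

```python
import numpy as np

def sentence_saliency(sentences, W, v):
    """Score each sentence by the gradient norm of the document score.

    Toy model (an assumption, not the paper's network):
        h_i   = tanh(W @ s_i)     per-sentence hidden vector
        doc   = mean_i h_i        pooled document vector
        score = v @ doc           score for the labelled class

    sentences: (n, d) array of sentence embeddings
    W: (h, d) projection matrix, v: (h,) class weight vector
    """
    n = sentences.shape[0]
    H = np.tanh(sentences @ W.T)              # (n, h)
    score = float(v @ H.mean(axis=0))
    # Analytic gradient: d score / d s_i = W^T (v * (1 - h_i^2)) / n
    grads = ((1.0 - H ** 2) * v) @ W / n      # (n, d)
    saliency = np.linalg.norm(grads, axis=1)  # one value per sentence
    return score, saliency

def extract_salient(sentences, W, v, k=2):
    """Return the indices of the k most salient sentences, in document order."""
    _, saliency = sentence_saliency(sentences, W, v)
    top = np.argsort(-saliency)[:k]
    return sorted(top.tolist())

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    S = rng.normal(size=(6, 8))   # 6 sentences with 8-dimensional embeddings
    W = rng.normal(size=(4, 8))
    v = rng.normal(size=4)
    print(extract_salient(S, W, v, k=2))
```

In a real system the weights would come from a classifier trained on the document labels, and the gradient would be obtained by backpropagation through the full model rather than from a closed-form expression.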

Code Implementations: 2 repos
