CL, LG, NE | Sep 27, 2017

KeyVec: Key-semantics Preserving Document Representations

arXiv:1709.09749v1
Originality: Incremental advance
AI Analysis

The paper addresses the need for better document representations in NLP tasks, but it appears incremental because it builds on existing word embedding methods.

The paper tackles the problem of learning distributed representations of text documents that preserve key semantics. It proposes KeyVec, a neural network model whose representations retain the topics and important information needed by downstream NLP tasks, and it reports superior quality on two document understanding tasks.

Previous studies have demonstrated the empirical success of word embeddings in various applications. In this paper, we investigate the problem of learning distributed representations for text documents which many machine learning algorithms take as input for a number of NLP tasks. We propose a neural network model, KeyVec, which learns document representations with the goal of preserving key semantics of the input text. It enables the learned low-dimensional vectors to retain the topics and important information from the documents that will flow to downstream tasks. Our empirical evaluations show the superior quality of KeyVec representations in two different document understanding tasks.
