CL · Jul 18, 2019

Evaluating the Utility of Document Embedding Vector Difference for Relation Learning

arXiv:1907.08184v1 · 4 citations
Originality: Synthesis-oriented
AI Analysis

This work addresses document relation learning for NLP applications. It is incremental: it extends word-level vector-offset ideas to the document level, with limited success.

The paper tests whether differences between document embedding vectors can be used for relation learning. It finds that they are useful for document-level similarity tasks such as duplicate detection, but perform less well in multi-relational classification.

Recent work has demonstrated that vector offsets obtained by subtracting pretrained word embedding vectors can be used to predict lexical relations with surprising accuracy. Inspired by this finding, in this paper, we extend the idea to the document level, in generating document-level embeddings, calculating the distance between them, and using a linear classifier to classify the relation between the documents. In the context of duplicate detection and dialogue act tagging tasks, we show that document-level difference vectors have utility in assessing document-level similarity, but perform less well in multi-relational classification.
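The pipeline described in the abstract (embed each document, take the difference of the two embeddings, feed it to a linear classifier) can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: `embed` is a hypothetical hashed bag-of-words stand-in for a pretrained document embedding, the toy sentence pairs are invented, the classifier is a small hand-written logistic regression, and the element-wise absolute difference is used (an assumption of this sketch, chosen so that a symmetric relation such as duplication is linearly separable on toy data).

```python
import math

DIM = 16  # toy embedding size (assumption; real document embeddings are much larger)

def embed(text, dim=DIM):
    """Hypothetical stand-in for a pretrained document embedding:
    a hashed bag of words, L2-normalized."""
    vec = [0.0] * dim
    for token in text.lower().split():
        # Deterministic hash (Python's built-in hash() is salted per process).
        idx = sum(ord(ch) * (i + 1) for i, ch in enumerate(token)) % dim
        vec[idx] += 1.0
    norm = math.sqrt(sum(v * v for v in vec)) or 1.0
    return [v / norm for v in vec]

def difference_vector(a, b, absolute=True):
    """Element-wise difference of two embeddings.
    absolute=True is this sketch's assumption, not taken from the paper."""
    diff = [x - y for x, y in zip(a, b)]
    return [abs(d) for d in diff] if absolute else diff

def train_logistic(features, labels, epochs=500, lr=0.5):
    """Binary logistic regression (a linear classifier) trained with plain SGD."""
    weights = [0.0] * len(features[0])
    bias = 0.0
    for _ in range(epochs):
        for x, y in zip(features, labels):
            z = sum(w * xi for w, xi in zip(weights, x)) + bias
            z = max(-30.0, min(30.0, z))  # clamp to keep exp() well-behaved
            grad = 1.0 / (1.0 + math.exp(-z)) - y
            weights = [w - lr * grad * xi for w, xi in zip(weights, x)]
            bias -= lr * grad
    return weights, bias

def predict(weights, bias, x):
    """Return 1 (duplicate) if the linear score is positive, else 0."""
    return 1 if sum(w * xi for w, xi in zip(weights, x)) + bias > 0 else 0

# Invented toy pairs: label 1 = duplicate, 0 = not a duplicate.
train_pairs = [
    ("how do i reset my password", "reset my password how do i", 1),
    ("install numpy on windows", "on windows install numpy", 1),
    ("how do i reset my password", "install numpy on windows", 0),
    ("best pizza in town", "parse json in python", 0),
]
features = [difference_vector(embed(a), embed(b)) for a, b, _ in train_pairs]
labels = [label for _, _, label in train_pairs]
weights, bias = train_logistic(features, labels)

query = difference_vector(embed("parse json in python"), embed("python in json parse"))
print(predict(weights, bias, query))
```

With a real embedding model in place of `embed`, the same three steps apply; for a multi-relational task such as dialogue act tagging, the binary classifier would be replaced by a multi-class linear model over the same difference vectors.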

