CL · Jan 6, 2017

Real Multi-Sense or Pseudo Multi-Sense: An Approach to Improve Word Representation

arXiv:1701.01574v1 · 19 citations
Originality: Incremental advance
AI Analysis

This paper addresses a specific issue in natural language processing, namely improving the accuracy of word representations, but it is incremental because it builds on existing multi-sense embedding methods.

The paper tackles the problem of pseudo multi-sense in word embeddings, where multiple vectors for a polysemous word may represent the same meaning, and proposes an algorithm to detect and refine embeddings to eliminate this influence, showing that this improves quality on similarity and analogy tasks.

Previous research has shown that learning multiple representations for polysemous words can improve the performance of word embeddings on many tasks. However, this leads to another problem: several vectors of a word may actually point to the same meaning, a phenomenon we call pseudo multi-sense. In this paper, we introduce the concept of pseudo multi-sense and propose an algorithm to detect such cases. Taking the detected cases into account, we refine existing word embeddings to eliminate the influence of pseudo multi-sense. We apply our algorithm to previously released multi-sense word embeddings and test it on artificial word similarity tasks and the analogy task. The experimental results show that diminishing pseudo multi-sense can improve the quality of word representations. Thus, our method is an efficient way to reduce linguistic complexity.
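
The abstract does not spell out the detection algorithm, so the following is only an illustrative sketch of the general idea, not the paper's method. It assumes, purely for illustration, that two sense vectors of the same word count as pseudo multi-sense when their cosine similarity reaches a threshold, and that such senses are merged by averaging. The function names `find_pseudo_pairs` and `merge_senses` and the threshold value 0.9 are hypothetical.

```python
import numpy as np


def cosine(u, v):
    """Cosine similarity between two non-zero vectors."""
    return float(np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v)))


def find_pseudo_pairs(sense_vectors, threshold=0.9):
    """Return index pairs (i, j), i < j, whose similarity is >= threshold.

    ASSUMPTION: a cosine threshold stands in for the paper's detection
    criterion, which the abstract does not describe.
    """
    pairs = []
    n = len(sense_vectors)
    for i in range(n):
        for j in range(i + 1, n):
            if cosine(sense_vectors[i], sense_vectors[j]) >= threshold:
                pairs.append((i, j))
    return pairs


def merge_senses(sense_vectors, pairs):
    """Group senses linked by the detected pairs and average each group.

    ASSUMPTION: averaging is a simple stand-in for the refinement step.
    """
    n = len(sense_vectors)
    parent = list(range(n))

    def find(x):
        # Union-find lookup with path halving.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for i, j in pairs:
        ri, rj = find(i), find(j)
        if ri != rj:
            # Keep the smaller index as the root so group order is stable.
            parent[max(ri, rj)] = min(ri, rj)

    groups = {}
    for i in range(n):
        groups.setdefault(find(i), []).append(i)
    return [
        np.mean([sense_vectors[i] for i in members], axis=0)
        for members in groups.values()
    ]


# Toy example: the first two "senses" point in nearly the same direction.
senses = [np.array([1.0, 0.0]), np.array([0.99, 0.1]), np.array([0.0, 1.0])]
pairs = find_pseudo_pairs(senses, threshold=0.9)
merged = merge_senses(senses, pairs)
```

In this toy example the first two vectors are flagged as one pair and collapsed into a single averaged vector, leaving two senses. A real pipeline would need a criterion and threshold validated on the similarity and analogy tasks the abstract mentions.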

