CL, May 1, 2017

Learning Topic-Sensitive Word Representations

arXiv:1705.00441v1, 21 citations
Originality: Incremental advance
AI Analysis

This paper addresses a limitation of single word representations in NLP: they fall short on tasks that require disambiguation. The advance is incremental, since it builds on existing distributed representation methods.

The paper tackles ambiguous word meanings in NLP by learning multiple topic-sensitive representations per word with a Hierarchical Dirichlet Process, which yields statistically significant improvements on the lexical substitution task.

Distributed word representations are widely used for modeling words in NLP tasks. Most existing models generate one representation per word and do not consider the different meanings of a word. We present two approaches to learn multiple topic-sensitive representations per word by using a Hierarchical Dirichlet Process. We observe that by modeling topics and integrating topic distributions for each document, we obtain representations that are able to distinguish between different meanings of a given word. Our models yield statistically significant improvements for the lexical substitution task, indicating that commonly used single word representations, even when combined with contextual information, are insufficient for this task.

Code Implementations: 1 repo
