CL | AI | Jan 20, 2016

Semantic Word Clusters Using Signed Normalized Graph Cuts

arXiv:1601.05403v1 | 9 citations
Originality: Incremental advance
AI Analysis

This work addresses a specific problem in natural language processing: word vector spaces tend to place antonyms as close together as synonyms, which hurts word similarity and sentiment analysis. It represents an incremental advance.

The authors introduce a signed spectral normalized graph cut algorithm that uses thesauri to assign negative weights to antonym pairs. The resulting clusters align better with human similarity judgments and improve sentiment prediction.

Vector space representations of words capture many aspects of word similarity, but such methods tend to make vector spaces in which antonyms (as well as synonyms) are close to each other. We present a new signed spectral normalized graph cut algorithm, signed clustering, that overlays existing thesauri upon distributionally derived vector representations of words, so that antonym relationships between word pairs are represented by negative weights. Our signed clustering algorithm produces clusters of words which simultaneously capture distributional and synonym relations. We evaluate these clusters against the SimLex-999 dataset (Hill et al., 2014) of human judgments of word pair similarities, and also show the benefit of using our clusters to predict the sentiment of a given text.
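
To make the idea of negative antonym weights concrete, the following is a minimal two-cluster sketch, not the authors' implementation. It assumes a common construction for signed graphs: the degree of each node is the sum of the absolute values of its edge weights, and the partition is read from the signs of the eigenvector belonging to the smallest eigenvalue of the signed normalized Laplacian. The similarity values, the word list, and the function names are hypothetical and chosen only for illustration; the paper's algorithm handles more than two clusters.

```python
import numpy as np

def signed_normalized_laplacian(W):
    """Return D^{-1/2} (D - W) D^{-1/2}, where D holds the signed degrees.

    Assumption: the signed degree of node i is sum_j |W[i, j]|.
    """
    d = np.abs(W).sum(axis=1)
    d_inv_sqrt = 1.0 / np.sqrt(d)
    L = np.diag(d) - W
    return d_inv_sqrt[:, None] * L * d_inv_sqrt[None, :]

def signed_two_way_cut(W):
    """Split nodes into two clusters using the eigenvector of the
    smallest eigenvalue of the signed normalized Laplacian."""
    L_sym = signed_normalized_laplacian(W)
    # eigh returns eigenvalues in ascending order for symmetric matrices.
    _, eigvecs = np.linalg.eigh(L_sym)
    v = eigvecs[:, 0]
    # The overall sign of v is arbitrary, so cluster ids may be swapped.
    return (v >= 0).astype(int)

# Hypothetical distributional similarities: antonyms look similar
# because they occur in similar contexts.
words = ["good", "great", "bad", "awful"]
S = np.array([
    [0.0, 0.8, 0.7, 0.6],
    [0.8, 0.0, 0.6, 0.7],
    [0.7, 0.6, 0.0, 0.8],
    [0.6, 0.7, 0.8, 0.0],
])

# Hypothetical thesaurus overlay: flip the sign of antonym pairs.
antonym_pairs = [(0, 2), (0, 3), (1, 2), (1, 3)]
W = S.copy()
for i, j in antonym_pairs:
    W[i, j] = W[j, i] = -S[i, j]

labels = signed_two_way_cut(W)
for word, label in zip(words, labels):
    print(word, label)
```

In this toy graph, "good" and "great" land in one cluster and "bad" and "awful" in the other, even though the unsigned similarities alone do not separate them. Which cluster receives id 0 or 1 depends on the arbitrary sign of the eigenvector.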

Code Implementations: 1 repo