CL · Apr 24, 2017

Watset: Automatic Induction of Synsets from a Graph of Synonyms

arXiv:1704.07157v1 · 46 citations
Originality: Incremental advance
AI Analysis

The paper addresses the problem of creating synsets for natural language processing tasks. Its method improves upon existing approaches, though the advance is incremental.

The paper tackles the problem of automatically inducing synsets from synonymy dictionaries and word embeddings in three steps: building a weighted graph of synonyms, applying word sense induction, and clustering the disambiguated graph. The method outperforms five state-of-the-art methods in terms of F-score on three gold-standard datasets for English and Russian.
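To make the evaluation criterion concrete, here is a minimal sketch of a pairwise F-score, which compares the synonym pairs implied by induced synsets with those implied by a gold standard. Whether the paper uses exactly this variant is an assumption; the function names and the toy synsets are hypothetical.

```python
from itertools import combinations


def synonym_pairs(synsets):
    """Return every unordered word pair that shares at least one synset."""
    return {
        frozenset(pair)
        for synset in synsets
        for pair in combinations(sorted(set(synset)), 2)
    }


def pairwise_f_score(induced, gold):
    """Harmonic mean of pairwise precision and recall (assumed variant)."""
    predicted = synonym_pairs(induced)
    expected = synonym_pairs(gold)
    true_positives = len(predicted & expected)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(expected) if expected else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Toy data, invented for illustration only.
induced = [["car", "auto", "bus"]]
gold = [["car", "auto"], ["bus", "coach"]]
score = pairwise_f_score(induced, gold)
print(round(score, 3))
```

In this toy case one of the three induced pairs is correct and one of the two gold pairs is recovered, so precision is 1/3 and recall is 1/2.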

This paper presents a new graph-based approach that induces synsets using synonymy dictionaries and word embeddings. First, we build a weighted graph of synonyms extracted from commonly available resources, such as Wiktionary. Second, we apply word sense induction to deal with ambiguous words. Finally, we cluster the disambiguated version of the ambiguous input graph into synsets. Our meta-clustering approach lets us use an efficient hard clustering algorithm to perform a fuzzy clustering of the graph. Despite its simplicity, our approach shows excellent results, outperforming five competitive state-of-the-art methods in terms of F-score on three gold standard datasets for English and Russian derived from large-scale manually constructed lexical resources.
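The steps described in the abstract can be sketched roughly as follows. This is a simplified illustration, not the authors' implementation: connected components stand in for the "efficient hard clustering algorithm", a summed edge-weight overlap stands in for whatever context similarity the paper uses, and all function names, words, and weights are hypothetical.

```python
from collections import defaultdict


def connected_components(nodes, edges):
    """Hard clustering stand-in: connected components of an undirected graph."""
    adjacency = defaultdict(set)
    for u, v in edges:
        adjacency[u].add(v)
        adjacency[v].add(u)
    seen, clusters = set(), []
    for start in nodes:
        if start in seen:
            continue
        seen.add(start)
        stack, component = [start], set()
        while stack:
            node = stack.pop()
            component.add(node)
            for neighbour in adjacency[node] - seen:
                seen.add(neighbour)
                stack.append(neighbour)
        clusters.append(component)
    return clusters


def induce_synsets(weighted_edges):
    # Step 1: build the weighted, undirected synonym graph.
    graph = defaultdict(dict)
    for u, v, weight in weighted_edges:
        graph[u][v] = weight
        graph[v][u] = weight

    # Step 2: word sense induction. Cluster each word's neighbourhood
    # (without the word itself); each cluster is one sense context.
    senses = {}
    for word, neighbours in graph.items():
        ego_edges = [(a, b) for a in neighbours for b in graph[a]
                     if b in neighbours and a < b]
        senses[word] = connected_components(sorted(neighbours), ego_edges)

    # Step 3: disambiguation. Link each sense to the best-matching sense of
    # every word in its context, scored by summed edge weight of the overlap.
    def best_sense(word, target):
        scores = [sum(graph[word][x] for x in context & target)
                  for context in senses[word]]
        return scores.index(max(scores))

    sense_edges = []
    for word, contexts in senses.items():
        for i, context in enumerate(contexts):
            for neighbour in context:
                j = best_sense(neighbour, context | {word})
                sense_edges.append(((word, i), (neighbour, j)))

    # Step 4: hard-cluster the sense graph. Mapping senses back to words
    # lets one word appear in several synsets (a fuzzy clustering).
    sense_nodes = sorted((w, i) for w, cs in senses.items()
                         for i in range(len(cs)))
    clusters = connected_components(sense_nodes, sense_edges)
    return sorted(sorted({word for word, _ in cluster}) for cluster in clusters)


# Toy synonym graph, invented for illustration; weights are placeholders.
edges = [
    ("bank", "shore", 1.0), ("bank", "riverside", 1.0), ("shore", "riverside", 1.0),
    ("bank", "lender", 1.0), ("bank", "depository", 1.0), ("lender", "depository", 1.0),
]
synsets = induce_synsets(edges)
print(synsets)
# [['bank', 'depository', 'lender'], ['bank', 'riverside', 'shore']]
```

The ambiguous word "bank" ends up in two synsets even though every clustering call is a hard partition, which is the point of the meta-clustering idea: the fuzziness comes from splitting words into senses before the final clustering.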

Code Implementations: 1 repo