CL, May 14, 2019

Sense Vocabulary Compression through the Semantic Knowledge of WordNet for Neural Word Sense Disambiguation

arXiv:1905.05677v3, 864 citations
Originality Incremental advance
AI Analysis

This work addresses the data scarcity problem in neural word sense disambiguation for natural language processing applications, offering an incremental improvement through vocabulary compression and enhanced performance.

The paper tackles the shortage of manually annotated corpora for word sense disambiguation (WSD) by compressing the sense vocabulary of Princeton WordNet using the semantic relationships between senses. This reduces model size and improves coverage without additional training data and without a loss in precision. The paper also presents a BERT-based system that is reported to significantly outperform the state of the art on all WSD evaluation tasks.

In this article, we tackle the issue of the limited quantity of manually sense-annotated corpora for the task of word sense disambiguation (WSD) by exploiting the semantic relationships between senses, such as synonymy, hypernymy and hyponymy, in order to compress the sense vocabulary of Princeton WordNet and thus reduce the number of different sense tags that must be observed to disambiguate all words of the lexical database. We propose two different methods that greatly reduce the size of neural WSD models, with the benefit of improving their coverage without additional training data and without impacting their precision. In addition to our methods, we present a WSD system that relies on pre-trained BERT word vectors to achieve results that significantly outperform the state of the art on all WSD evaluation tasks.
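To make the idea of hypernymy-based compression concrete, here is a minimal Python sketch. It is an illustration under stated assumptions, not the authors' implementation: the hierarchy is a hand-written toy tree rather than WordNet, every sense has a single hypernym, and the names `PARENT`, `WORD_SENSES`, `necessary_nodes` and `compress` are hypothetical. The rule it uses (keep only the nodes just below the point where two senses of the same word diverge, then map each sense to its closest kept ancestor) is one plausible reading of the abstract.

```python
from itertools import combinations

# Toy hypernym tree (child -> parent). Assumption: not real WordNet data.
PARENT = {
    "mouse.rodent": "rodent",
    "rat.rodent": "rodent",
    "rodent": "animal",
    "animal": "entity",
    "mouse.device": "electronic_device",
    "keyboard.device": "electronic_device",
    "electronic_device": "artifact",
    "artifact": "entity",
}

# Senses of each word in the toy lexicon (hypothetical identifiers).
WORD_SENSES = {
    "mouse": ["mouse.rodent", "mouse.device"],
    "rat": ["rat.rodent"],
    "keyboard": ["keyboard.device"],
}


def path_to_root(sense, parent):
    """Return the hypernym chain from a sense up to the root, sense first."""
    path = [sense]
    while path[-1] in parent:
        path.append(parent[path[-1]])
    return path


def necessary_nodes(word_senses, parent):
    """Collect the nodes needed to keep all senses of each word distinct."""
    necessary = set()
    for senses in word_senses.values():
        for a, b in combinations(senses, 2):
            pa = path_to_root(a, parent)[::-1]  # root first
            pb = path_to_root(b, parent)[::-1]
            i = 0
            while i < min(len(pa), len(pb)) and pa[i] == pb[i]:
                i += 1
            # pa[i] and pb[i] are the first nodes where the paths diverge.
            # If one sense is an ancestor of the other, keep that sense itself.
            necessary.add(pa[min(i, len(pa) - 1)])
            necessary.add(pb[min(i, len(pb) - 1)])
    return necessary


def compress(word_senses, parent):
    """Map each sense to its closest necessary ancestor (or itself)."""
    necessary = necessary_nodes(word_senses, parent)
    mapping = {}
    for senses in word_senses.values():
        for sense in senses:
            path = path_to_root(sense, parent)
            # Assumption: senses with no necessary ancestor fall back to the root.
            mapping[sense] = next((n for n in path if n in necessary), path[-1])
    return mapping


mapping = compress(WORD_SENSES, PARENT)
print(mapping)
print(len(mapping), "senses ->", len(set(mapping.values())), "tags")
```

In this toy hierarchy the four senses collapse to two tags, `animal` and `artifact`, while the two senses of "mouse" remain distinguishable. A model trained on an example of "rat" therefore also learns a tag that covers the rodent sense of "mouse", which is the coverage benefit the abstract describes.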

Code Implementations: 2 repos