CL · May 6, 2020

Moving Down the Long Tail of Word Sense Disambiguation with Gloss-Informed Biencoders

arXiv:2005.02590v2 · 184 citations
AI Analysis

This paper addresses the long-tail distribution of word senses in WSD, a practical problem for NLP applications, though its contribution is an incremental improvement focused on rare senses.

The paper tackles poor performance on rare or unseen word senses in Word Sense Disambiguation by proposing a bi-encoder model that embeds the target word's context and each sense definition independently, in a jointly optimized representation space, yielding a 31.1% error reduction on less frequent senses over prior work.

A major obstacle in Word Sense Disambiguation (WSD) is that word senses are not uniformly distributed, causing existing models to generally perform poorly on senses that are either rare or unseen during training. We propose a bi-encoder model that independently embeds (1) the target word with its surrounding context and (2) the dictionary definition, or gloss, of each sense. The encoders are jointly optimized in the same representation space, so that sense disambiguation can be performed by finding the nearest sense embedding for each target word embedding. Our system outperforms previous state-of-the-art models on English all-words WSD; these gains predominantly come from improved performance on rare senses, leading to a 31.1% error reduction on less frequent senses over prior work. This demonstrates that rare senses can be more effectively disambiguated by modeling their definitions.
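The nearest-sense lookup described in the abstract can be sketched as follows. This is a minimal illustration, not the authors' implementation: the bag-of-words `encode` function is a hypothetical stand-in for the trained context and gloss encoders, the dot product is assumed as the similarity measure, the whole context is embedded rather than a single target-word representation, and the glosses are invented for the example rather than taken from a dictionary.

```python
import numpy as np

# Hypothetical glosses for two senses of "bank" (made up for illustration).
GLOSSES = {
    "bank_river": "sloping land beside a river or lake",
    "bank_finance": "an institution that accepts deposits and lends money",
}


def build_vocab(texts):
    """Map every lowercase whitespace token in `texts` to a column index."""
    vocab = {}
    for text in texts:
        for token in text.lower().split():
            vocab.setdefault(token, len(vocab))
    return vocab


def encode(text, vocab):
    """Toy encoder (assumption): L2-normalised bag-of-words vector.

    A real system would use a learned encoder here; this stand-in only
    makes the shared-space idea concrete.
    """
    vec = np.zeros(len(vocab))
    for token in text.lower().split():
        if token in vocab:
            vec[vocab[token]] += 1.0
    norm = np.linalg.norm(vec)
    return vec / norm if norm > 0 else vec


def disambiguate(context, glosses, vocab):
    """Return the sense whose gloss embedding is nearest to the context."""
    context_vec = encode(context, vocab)
    senses = list(glosses)
    sense_matrix = np.stack([encode(glosses[s], vocab) for s in senses])
    scores = sense_matrix @ context_vec  # one dot product per candidate sense
    best = senses[int(np.argmax(scores))]
    return best, dict(zip(senses, scores.tolist()))


CONTEXTS = [
    "she sat on the bank of the river",
    "he deposits money at the bank",
]
VOCAB = build_vocab(list(GLOSSES.values()) + CONTEXTS)

if __name__ == "__main__":
    for context in CONTEXTS:
        sense, scores = disambiguate(context, GLOSSES, VOCAB)
        print(context, "->", sense, scores)
```

Because every sense receives an embedding from its gloss alone, a sense with no training examples can still be scored against a context, which is the property the abstract credits for the gains on rare senses.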

Code Implementations: 1 repo