CL, Sep 20, 2021

BERT Has Uncommon Sense: Similarity Ranking for Word Sense BERTology

arXiv:2109.09780v1661 citations
Originality Synthesis-oriented
AI Analysis

This paper addresses a problem relevant to NLP researchers, namely evaluating how well language models represent word senses, but it is incremental: it builds on existing work without introducing a new method.

The study investigated how well contextualized word embedding models like BERT represent uncommon word senses by evaluating them on a nearest neighbor retrieval task over two English sense-annotated corpora. It found that the models outperform a random baseline even for rare senses, but that performance varies considerably from model to model, especially on rare senses.

An important question concerning contextualized word embedding (CWE) models like BERT is how well they can represent different word senses, especially those in the long tail of uncommon senses. Rather than build a WSD system as in previous work, we investigate contextualized embedding neighborhoods directly, formulating a query-by-example nearest neighbor retrieval task and examining ranking performance for words and senses in different frequency bands. In an evaluation on two English sense-annotated corpora, we find that several popular CWE models all outperform a random baseline even for proportionally rare senses, without explicit sense supervision. However, performance varies considerably even among models with similar architectures and pretraining regimes, with especially large differences for rare word senses, revealing that CWE models are not all created equal when it comes to approximating word senses in their native representations.
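The query-by-example retrieval task described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the toy two-dimensional vectors, the sense labels, the helper names `rank_by_cosine` and `average_precision`, and the choice of cosine similarity and average precision as the ranking and scoring functions are all assumptions made for illustration. In practice the vectors would come from a CWE model such as BERT.

```python
import numpy as np


def rank_by_cosine(query, candidates):
    """Return candidate row indices ordered from most to least similar to the query."""
    q = query / np.linalg.norm(query)
    c = candidates / np.linalg.norm(candidates, axis=1, keepdims=True)
    sims = c @ q  # cosine similarity of each candidate with the query
    return np.argsort(-sims, kind="stable")


def average_precision(ranked_labels, target):
    """Average precision of a ranked list of labels with respect to one target label."""
    hits = 0
    precisions = []
    for rank, label in enumerate(ranked_labels, start=1):
        if label == target:
            hits += 1
            precisions.append(hits / rank)
    return sum(precisions) / len(precisions) if precisions else 0.0


# Hypothetical toy "contextualized embeddings" for five occurrences of one word.
# Real embeddings would have hundreds of dimensions; these are made up.
embeddings = np.array([
    [1.0, 0.1],  # sense: finance
    [0.9, 0.2],  # sense: finance
    [0.1, 1.0],  # sense: river (the rarer sense in this toy set)
    [0.2, 0.9],  # sense: river
    [0.8, 0.3],  # sense: finance
])
senses = ["finance", "finance", "river", "river", "finance"]

# Use one occurrence as the query and rank all the others against it.
query_idx = 2
candidate_idx = [i for i in range(len(senses)) if i != query_idx]
candidate_senses = [senses[i] for i in candidate_idx]

order = rank_by_cosine(embeddings[query_idx], embeddings[candidate_idx])
ranked_senses = [candidate_senses[i] for i in order]

# A high score means same-sense occurrences sit near the top of the ranking.
ap = average_precision(ranked_senses, senses[query_idx])
print(ranked_senses, ap)
```

Averaging such scores over many queries, grouped by how frequent the word and the sense are, would give the kind of per-frequency-band comparison the abstract describes. The grouping scheme itself is not specified here and is left out of the sketch.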

Code Implementations: 1 repo
Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.
