CL · Apr 30, 2020

UiO-UvA at SemEval-2020 Task 1: Contextualised Embeddings for Lexical Semantic Change Detection

arXiv:2005.00050v3 · 1008 citations
AI Analysis

This work addresses a problem of interest to natural language processing researchers: tracking how word meanings evolve over time. It is incremental, however, because it builds on existing contextualized embedding methods.

The paper tackles lexical semantic change detection by applying contextualized word embeddings (BERT and ELMo) to rank words by their degree of semantic drift over time. Its most effective algorithms, which rely on the cosine similarity between averaged token embeddings and on the pairwise distances between token embeddings, outperform strong baselines by a large margin and give the best Subtask 2 submission for SemEval-2020 Task 1 in the post-evaluation phase.

We apply contextualised word embeddings to lexical semantic change detection in the SemEval-2020 Shared Task 1. This paper focuses on Subtask 2, ranking words by the degree of their semantic drift over time. We analyse the performance of two contextualising architectures (BERT and ELMo) and three change detection algorithms. We find that the most effective algorithms rely on the cosine similarity between averaged token embeddings and the pairwise distances between token embeddings. They outperform strong baselines by a large margin (in the post-evaluation phase, we have the best Subtask 2 submission for SemEval-2020 Task 1), but interestingly, the choice of a particular algorithm depends on the distribution of gold scores in the test set.
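
The two most effective scoring ideas named in the abstract can be sketched as follows. This is a minimal illustration, not the authors' implementation: it assumes the token embeddings for a target word have already been extracted from BERT or ELMo as one NumPy array per time period, and it takes "one minus cosine similarity" as the change score, which may differ from the exact formulation in the paper. The function names and the toy data are hypothetical.

```python
import numpy as np


def cosine_change(emb_t1, emb_t2):
    """Change score from averaged token embeddings.

    Each argument is an array of shape (n_occurrences, dim) holding the
    contextualised embeddings of one word in one time period.
    Assumption: the score is 1 - cosine similarity of the two averages.
    """
    p1 = emb_t1.mean(axis=0)
    p2 = emb_t2.mean(axis=0)
    cos = float(p1 @ p2 / (np.linalg.norm(p1) * np.linalg.norm(p2)))
    return 1.0 - cos


def average_pairwise_distance(emb_t1, emb_t2):
    """Mean cosine distance over all cross-period pairs of token embeddings."""
    a = emb_t1 / np.linalg.norm(emb_t1, axis=1, keepdims=True)
    b = emb_t2 / np.linalg.norm(emb_t2, axis=1, keepdims=True)
    return float(np.mean(1.0 - a @ b.T))


def rank_by_change(embeddings, score_fn):
    """Rank words from most to least changed under the given score function."""
    scores = {w: score_fn(e1, e2) for w, (e1, e2) in embeddings.items()}
    return sorted(scores, key=scores.get, reverse=True)


# Toy data (hypothetical): "stable" is used the same way in both periods,
# while "drifted" points in a nearly orthogonal direction in the second one.
toy = {
    "stable": (np.array([[1.0, 0.0], [1.0, 0.1]]),
               np.array([[1.0, 0.0], [1.0, 0.1]])),
    "drifted": (np.array([[1.0, 0.0], [1.0, 0.1]]),
                np.array([[0.0, 1.0], [0.1, 1.0]])),
}

ranking_by_cosine = rank_by_change(toy, cosine_change)
ranking_by_apd = rank_by_change(toy, average_pairwise_distance)
```

Both functions return higher values for words whose usage differs more between the two periods, so sorting in descending order gives the kind of ranking Subtask 2 asks for. The averaged variant summarises each period in a single vector, whereas the pairwise variant compares individual occurrences across periods.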

Code Implementations: 1 repo