Unsupervised Embedding-based Detection of Lexical Semantic Changes
This work addresses the challenge of tracking shifts in word meaning for computational linguistics and NLP applications. It is an incremental improvement: a novel method applied to a known bottleneck.
The paper tackles the unsupervised detection of lexical-semantic change over time using embedding-based methods. Its system placed second in the binary detection of semantic changes in SemEval-2020 Task 1, with results reported for English, German, Swedish, and Latin.
This paper describes EmbLexChange, a system introduced by the "Life-Language" team for SemEval-2020 Task 1 on the unsupervised detection of lexical-semantic changes. EmbLexChange quantifies semantic change as the divergence between the embedding-based profiles of a word w, calculated with respect to a set of reference words, in the source and the target domains (which can simply be two time frames t1 and t2). The underlying assumption is that a lexical-semantic change of word w affects its co-occurring words and subsequently alters its neighborhood in the embedding spaces. We show that, using a resampling framework for the selection of reference words, we can reliably detect lexical-semantic changes in English, German, Swedish, and Latin. EmbLexChange achieved second place in the binary detection of semantic changes in SemEval-2020 Task 1.
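The idea of comparing reference-word profiles across two domains can be illustrated with a minimal sketch. This is not the authors' implementation, and several details are assumptions made only for illustration: the profile is taken to be a softmax over cosine similarities to the reference words, the divergence is taken to be KL divergence, and resampling is taken to mean averaging the score over random subsets of the reference words. All function and parameter names (`profile`, `kl_divergence`, `change_score`, `n_rounds`, `sample_size`) are hypothetical.

```python
import numpy as np


def profile(word_vec, ref_vecs):
    """Profile of a word relative to reference words.

    Assumption: cosine similarities turned into a probability
    distribution with a softmax. The source does not specify this.
    """
    word_unit = word_vec / np.linalg.norm(word_vec)
    ref_unit = ref_vecs / np.linalg.norm(ref_vecs, axis=1, keepdims=True)
    sims = ref_unit @ word_unit
    exp = np.exp(sims - sims.max())  # subtract max for numerical stability
    return exp / exp.sum()


def kl_divergence(p, q, eps=1e-12):
    """KL(p || q), assumed here as the divergence between profiles."""
    p = np.clip(p, eps, None)
    q = np.clip(q, eps, None)
    return float(np.sum(p * np.log(p / q)))


def change_score(word, emb_src, emb_tgt, ref_words,
                 n_rounds=10, sample_size=5, seed=0):
    """Average profile divergence over resampled reference-word subsets.

    emb_src and emb_tgt map words to vectors in the source and target
    domains (for example, time frames t1 and t2).
    """
    rng = np.random.default_rng(seed)
    ref_words = list(ref_words)
    scores = []
    for _ in range(n_rounds):
        # Draw a random subset of reference words without replacement.
        idx = rng.choice(len(ref_words), size=sample_size, replace=False)
        sample = [ref_words[i] for i in idx]
        p = profile(emb_src[word], np.stack([emb_src[r] for r in sample]))
        q = profile(emb_tgt[word], np.stack([emb_tgt[r] for r in sample]))
        scores.append(kl_divergence(p, q))
    return float(np.mean(scores))
```

In this sketch each profile is computed inside its own embedding space, so the source and target vectors are never compared directly. A binary decision could then be obtained by thresholding the score, although the threshold choice is outside what this sketch covers.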