Quantifying Lexical Semantic Shift via Unbalanced Optimal Transport
This work targets a gap in lexical semantic change detection: existing methods offer limited insight into semantic change at the level of individual usage instances. It offers researchers in computational linguistics and NLP a unified method for this problem, though the contribution is incremental in that it builds on existing embedding-based approaches.
The paper proposes Sense Usage Shift (SUS), a measure based on Unbalanced Optimal Transport applied to contextualized word embeddings, which quantifies changes in the usage frequency of a word sense at the instance level. SUS is then used for word-level tasks such as measuring the magnitude of semantic change and detecting the broadening or narrowing of meaning.
Lexical semantic change detection aims to identify shifts in word meanings over time. While existing methods using embeddings from a diachronic corpus pair estimate the degree of change for target words, they offer limited insight into changes at the level of individual usage instances. To address this, we apply Unbalanced Optimal Transport (UOT) to sets of contextualized word embeddings, capturing semantic change through the excess and deficit in the alignment between usage instances. In particular, we propose Sense Usage Shift (SUS), a measure that quantifies changes in the usage frequency of a word sense at each usage instance. By leveraging SUS, we demonstrate that several challenges in semantic change detection can be addressed in a unified manner, including quantifying instance-level semantic change and word-level tasks such as measuring the magnitude of semantic change and the broadening or narrowing of meaning.
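To make the idea of "excess and deficit in the alignment" concrete, the following is a minimal sketch and not the paper's implementation. It assumes uniform instance weights, a cosine-distance cost, and an entropic UOT solver with KL-relaxed marginals (the standard unbalanced Sinkhorn scaling iteration). The function names, the toy two-dimensional "embeddings", and the parameters `eps` and `rho` are all hypothetical. The exact definition of SUS is not given in this section, so the sketch reports only the raw per-instance difference between transported mass and assigned weight as a stand-in.

```python
import numpy as np

def cosine_cost(X, Y):
    """Pairwise cosine distance between rows of X and rows of Y."""
    Xn = X / np.linalg.norm(X, axis=1, keepdims=True)
    Yn = Y / np.linalg.norm(Y, axis=1, keepdims=True)
    return 1.0 - Xn @ Yn.T

def unbalanced_sinkhorn(a, b, C, eps=0.1, rho=1.0, n_iter=500):
    """Entropic UOT with KL marginal penalties (assumed solver, hypothetical defaults)."""
    K = np.exp(-C / eps)          # Gibbs kernel
    p = rho / (rho + eps)         # damping exponent from the KL relaxation
    u = np.ones_like(a)
    v = np.ones_like(b)
    for _ in range(n_iter):
        u = (a / (K @ v)) ** p
        v = (b / (K.T @ u)) ** p
    return u[:, None] * K * v[None, :]   # transport plan

def marginal_discrepancy(a, b, plan):
    """Transported mass minus assigned weight, per instance (stand-in, not SUS itself)."""
    return plan.sum(axis=1) - a, plan.sum(axis=0) - b

# Toy data: the old period uses one sense; the new period adds a second sense.
old = np.array([[1.0, 0.0], [0.95, 0.05], [0.9, 0.1], [1.0, 0.05]])
new = np.array([[1.0, 0.0], [0.95, 0.1], [0.0, 1.0], [0.05, 1.0]])
a = np.full(len(old), 1.0 / len(old))    # uniform weights (assumption)
b = np.full(len(new), 1.0 / len(new))

plan = unbalanced_sinkhorn(a, b, cosine_cost(old, new))
old_shift, new_shift = marginal_discrepancy(a, b, plan)
print("old-period discrepancy:", np.round(old_shift, 3))
print("new-period discrepancy:", np.round(new_shift, 3))
```

In this toy setting, the new-period instances of the sense that has no counterpart in the old period receive less mass than their weight, while the new-period instances of the shared sense receive more. Because the marginal constraints are relaxed rather than enforced, these per-instance discrepancies are not forced to zero as they would be in balanced optimal transport, which is what allows them to carry a signal about sense usage.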