CL · Feb 15, 2021

How COVID-19 Is Changing Our Language : Detecting Semantic Shift in Twitter Word Embeddings

Yanzhu Guo, Christos Xypolopoulos, Michalis Vazirgiannis

arXiv:2102.07836v11.610 citations

Originality: Synthesis-oriented

AI Analysis

This work tracks how major events such as COVID-19 affect word meaning, a problem relevant to researchers in computational linguistics and social media analysis. Its contribution is incremental: it applies existing alignment methods to new data.

The research detects semantic shifts in Twitter language caused by COVID-19. It trains word embeddings on Twitter data from different time periods and compares them with an alignment-based approach. Case studies verify the method's validity, and a stability measure quantifies the global shift.

Words are malleable objects, influenced by events that are reflected in written texts. Situated in the global outbreak of COVID-19, our research aims at detecting semantic shifts in social media language triggered by the health crisis. With COVID-19-related big data extracted from Twitter, we train separate word embedding models for different time periods after the outbreak. We employ an alignment-based approach to compare these embeddings with a general-purpose Twitter embedding unrelated to COVID-19. We also compare our trained embeddings with one another to observe diachronic evolution. Carrying out case studies on a set of words chosen by topic detection, we verify that our alignment approach is valid. Finally, we quantify the size of global semantic shift by a stability measure based on back-and-forth rotational alignment.
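To make the alignment idea concrete, here is a minimal sketch in Python. The abstract does not say which alignment algorithm is used, so this assumes orthogonal Procrustes, a common choice for rotating one embedding space onto another. The function names and the synthetic data are hypothetical, and the per-word cosine score is a simple stand-in, not the paper's stability measure.

```python
import numpy as np


def procrustes_rotation(source, target):
    """Orthogonal matrix R minimizing ||source @ R - target||_F.

    Rows of `source` and `target` are vectors for the same words, in the
    same order. This is the classic orthogonal Procrustes solution.
    """
    u, _, vt = np.linalg.svd(source.T @ target)
    return u @ vt


def cosine_after_alignment(source, target):
    """Cosine similarity of each word's vector after rotating source onto target.

    Low values flag words whose neighbourhood changed between the two
    spaces (an illustrative shift score, assumed for this sketch).
    """
    aligned = source @ procrustes_rotation(source, target)
    dots = np.sum(aligned * target, axis=1)
    norms = np.linalg.norm(aligned, axis=1) * np.linalg.norm(target, axis=1)
    return dots / norms


def make_toy_embeddings(n_words=200, dim=8, shifted_word=0, seed=0):
    """Synthetic pair of spaces: the second is a rotated copy of the first,
    except for one word whose vector is flipped to simulate a shift."""
    rng = np.random.default_rng(seed)
    base = rng.normal(size=(n_words, dim))
    rotation, _ = np.linalg.qr(rng.normal(size=(dim, dim)))
    other = base @ rotation
    other[shifted_word] = -other[shifted_word]
    return base, other


base, other = make_toy_embeddings()
similarities = cosine_after_alignment(base, other)
most_shifted = int(np.argmin(similarities))  # index of the lowest-scoring word
```

With real data, `source` and `target` would hold the vectors of the shared vocabulary from two separately trained models, for instance a COVID-19 period model and a general-purpose Twitter model.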
