CL | Sep 23, 2022

Temporal Analysis on Topics Using Word2Vec

arXiv:2209.11717v2 | h-index: 43
Originality: Synthesis-oriented
AI Analysis

This work addresses the need for more nuanced trend analysis in text data beyond simple word frequency counts, though its contribution is incremental: it combines existing techniques.

The study tackled the problem of detecting and visualizing topic trends over time by modeling topic movement using Word2Vec, k-means clustering, and cosine similarity, showing how topics converge or diverge in subtopics.

The present study proposes a novel method of trend detection and visualization: specifically, modeling the change in a topic over time. Where current models for identifying and visualizing trends convey only the popularity of a single word, based on stochastic counting of usage, the approach in the present study illustrates both a topic's popularity and the direction in which it is moving. The direction in this case is a distinct subtopic within the selected corpus. Such trends are generated by modeling the movement of a topic, using k-means clustering and cosine similarity to group the distances between clusters over time. In a convergent scenario, it can be inferred that the topics as a whole are meshing (tokens become interchangeable between topics). Conversely, a divergent scenario implies that each topic's respective tokens are no longer found in the same contexts (the words become increasingly different from one another). The methodology was tested on a group of articles from various media houses present in the 20 Newsgroups dataset.
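The convergence and divergence idea described above can be illustrated with a minimal sketch. It is not the paper's implementation. It assumes that token embeddings have already been produced (for example by a Word2Vec model trained on each time slice) and already grouped into clusters (for example by k-means). The toy 2-D vectors, the topic labels "sports" and "politics", and the helper names `centroid` and `topic_trend` are all hypothetical, chosen only for illustration.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length, non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def centroid(vectors):
    """Component-wise mean of a non-empty list of vectors."""
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]


def topic_trend(slices, topic_a, topic_b):
    """Compare two topic clusters across time-ordered slices.

    `slices` is a list of dicts mapping a cluster label to the list of
    token vectors assigned to that cluster in that time slice.
    Returns the per-slice centroid similarities and a trend label based
    on the change between the first and last slice.
    """
    sims = [
        cosine_similarity(centroid(s[topic_a]), centroid(s[topic_b]))
        for s in slices
    ]
    change = sims[-1] - sims[0]
    if change > 0:
        label = "convergent"   # centroids moved closer together
    elif change < 0:
        label = "divergent"    # centroids moved further apart
    else:
        label = "stable"
    return sims, label


# Hypothetical toy data: 2-D stand-ins for Word2Vec embeddings,
# with cluster membership assumed to come from a prior k-means step.
slices = [
    {"sports": [[1.0, 0.0], [0.9, 0.1]], "politics": [[0.0, 1.0], [0.1, 0.9]]},
    {"sports": [[0.8, 0.3], [0.7, 0.4]], "politics": [[0.4, 0.7], [0.3, 0.8]]},
]

sims, label = topic_trend(slices, "sports", "politics")
print(label)
```

In this toy data the similarity between the two centroids rises from the first slice to the second, so the pair is labeled convergent. A real pipeline would also need to make embeddings comparable across time slices, a step this sketch leaves out.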

