IR, CL, LG, ML | Jul 8, 2022

Twitmo: A Twitter Data Topic Modeling and Visualization Package for R

arXiv:2207.11236v1 | 3 citations | h-index: 50
Originality: Synthesis-oriented
AI Analysis

Twitmo gives researchers a user-friendly tool for analyzing public discourse, for example on politics or persons of interest, across space and time. The contribution is incremental, since it builds on existing topic modeling methods.

The authors tackled the challenge of analyzing geo-tagged Twitter data by developing Twitmo, an R package that integrates data collection, preprocessing, topic modeling (LDA, CTM, STM), and visualization, with innovations like automatic pooling of tweets for better topic coherence.

We present Twitmo, a package that provides a broad range of methods to collect, pre-process, analyze and visualize geo-tagged Twitter data. Twitmo enables the user to collect geo-tagged Tweets from Twitter and provides a comprehensive and user-friendly toolbox to generate topic distributions from Latent Dirichlet Allocation (LDA), correlated topic models (CTM) and structural topic models (STM). Functions are included for text pre-processing, model building and prediction. In addition, one of the innovations of the package is the automatic pooling of Tweets into longer pseudo-documents using hashtags and cosine similarities for better topic coherence. The package also comes with functionality to visualize collected data sets and fitted models in static as well as interactive ways, and offers built-in support for model visualizations via LDAvis, which is convenient for researchers in this area. The Twitmo package is an innovative toolbox that can be used to analyze public discourse on various topics, political parties or persons of interest in space and time.
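The pooling step mentioned in the abstract can be illustrated with a minimal sketch. This is not Twitmo's implementation or API (Twitmo is an R package); it is a standalone Python illustration written under stated assumptions: tweets sharing a hashtag are grouped into one pool, each tweet without a hashtag is attached to the pool whose term-frequency vector is most similar to it by cosine similarity, and tweets below a similarity threshold are left out. The function names `tokenize`, `cosine` and `pool_tweets` and the default `threshold` value are hypothetical choices made for this example.

```python
import math
import re
from collections import Counter, defaultdict

HASHTAG = re.compile(r"#(\w+)")
TOKEN = re.compile(r"[a-z0-9]+")


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return TOKEN.findall(text.lower())


def cosine(a, b):
    """Cosine similarity between two term-frequency Counters."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def pool_tweets(tweets, threshold=0.3):
    """Pool tweets into pseudo-documents keyed by hashtag.

    Assumption for this sketch: `threshold` is an arbitrary cutoff,
    not a value taken from the paper.
    """
    pools = defaultdict(list)
    unlabeled = []
    for tweet in tweets:
        tags = {t.lower() for t in HASHTAG.findall(tweet)}
        if tags:
            # A tweet with several hashtags joins every matching pool.
            for tag in tags:
                pools[tag].append(tweet)
        else:
            unlabeled.append(tweet)

    # One term-frequency vector per pool, built from hashtagged tweets only.
    vectors = {
        tag: Counter(tok for tw in docs for tok in tokenize(tw))
        for tag, docs in pools.items()
    }

    for tweet in unlabeled:
        vec = Counter(tokenize(tweet))
        best_tag, best_sim = None, 0.0
        for tag in sorted(vectors):
            sim = cosine(vec, vectors[tag])
            if sim > best_sim:
                best_tag, best_sim = tag, sim
        # Tweets that resemble no pool closely enough are dropped.
        if best_tag is not None and best_sim >= threshold:
            pools[best_tag].append(tweet)

    # Each pool becomes one longer pseudo-document.
    return {tag: " ".join(docs) for tag, docs in pools.items()}


if __name__ == "__main__":
    sample = [
        "Polls open early today #election",
        "Long lines at the polls #election",
        "Heavy rain floods downtown streets #weather",
        "polls are busy today",
    ]
    for tag, doc in pool_tweets(sample).items():
        print(tag, "->", doc)
```

The resulting pseudo-documents, rather than the individual short tweets, would then be passed to a topic model such as LDA. Longer documents give the model more word co-occurrence evidence per document, which is the motivation for pooling that the abstract describes.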

Code Implementations: 1 repo
