Short Text Topic Modeling: Application to tweets about Bitcoin
This work addresses the challenge of topic modeling for short texts such as social media posts, specifically for Bitcoin-related analysis, but its contribution is incremental: it applies existing methods to new data.
The authors tackled topic modeling on short texts by applying three existing topic models to a corpus of Bitcoin tweets, evaluating them with multiple scores and demonstrating a concrete application of the extracted topics.
Understanding the semantics of a collection of texts is a challenging task. Topic models are probabilistic models that aim to extract "topics" from a corpus of documents. This task is particularly difficult when the corpus is composed of short texts, such as posts on social networks. Following several previous research papers, we explore a set of collected tweets about Bitcoin. We train three topic models and evaluate their output with several scores. We also propose a concrete application of the extracted topics.
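To make the idea of scoring a topic model's output concrete, the following is a minimal sketch of one commonly used score, UMass topic coherence. The text above does not name the scores actually used, so the choice of UMass coherence is an assumption, and the function name `umass_coherence`, the toy tweets, and the two example topics are hypothetical and for illustration only.

```python
import math
from itertools import combinations


def umass_coherence(top_words, documents):
    """UMass coherence of one topic (illustrative choice of score).

    Sums log((D(w_m, w_l) + 1) / D(w_l)) over all pairs of top words,
    where w_l is ranked above w_m, D(w) is the number of documents
    containing w, and D(w_m, w_l) the number containing both.
    """
    # Whitespace tokenization is a simplification for this sketch.
    doc_sets = [set(doc.lower().split()) for doc in documents]

    def doc_freq(*words):
        return sum(all(w in d for w in words) for d in doc_sets)

    score = 0.0
    for l, m in combinations(range(len(top_words)), 2):
        w_l, w_m = top_words[l], top_words[m]
        d_l = doc_freq(w_l)
        if d_l == 0:
            continue  # skip words that never occur in the corpus
        score += math.log((doc_freq(w_m, w_l) + 1) / d_l)
    return score


# Hypothetical toy corpus of short texts (not data from the paper).
tweets = [
    "bitcoin price hits new high",
    "bitcoin price drops today",
    "price of bitcoin rising",
    "mining uses energy",
    "mining hardware energy cost",
    "energy debate around mining",
]

# Top words of two hypothetical topics, ordered by probability.
coherent = umass_coherence(["bitcoin", "price"], tweets)
incoherent = umass_coherence(["bitcoin", "energy"], tweets)
print(coherent > incoherent)
```

In this toy corpus the words "bitcoin" and "price" always co-occur, while "bitcoin" and "energy" never do, so the first topic receives the higher score. Short texts make such co-occurrence counts sparse, which is one reason evaluation on tweets is harder than on longer documents.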