CY CL LG SI | Jul 20, 2023

What Twitter Data Tell Us about the Future?

Alina Landowska, Marek Robak, Maciej Skorski

arXiv:2308.02035v1 | 4.33 citations | h-index: 10 | Has Code

Originality: Synthesis-oriented

AI Analysis

This work provides insights into the futures anticipated by Twitter's futurists and contributes to topic modeling research, but it is incremental because it applies existing NLP methods to new social media data.

This study analyzed over 1 million tweets from futurists on Twitter to identify what futures they anticipate and how language cues influence anticipatory thinking, finding 15 topics from LDA and 100 distinct topics from BERTopic modeling.

Anticipation is a fundamental human cognitive ability that involves thinking about and living towards the future. While language markers reflect anticipatory thinking, research on anticipation from the perspective of natural language processing is limited. This study investigates the futures projected by futurists on Twitter and explores the impact of language cues on anticipatory thinking among social media users. We address two research questions: what futures do Twitter's futurists anticipate and share, and how can these anticipated futures be modeled from social data? To investigate this, we review related work on anticipation, discuss the influence of language markers and prestigious individuals on anticipatory thinking, and present a taxonomy that categorizes futures into "present futures" and "future present". This research presents a compiled dataset of over 1 million publicly shared tweets by future influencers and develops a scalable NLP pipeline using state-of-the-art (SOTA) models. The study identifies 15 topics with the LDA approach and 100 distinct topics with the BERTopic approach within the futurists' tweets. These findings contribute to research on topic modeling and provide insights into the futures anticipated by Twitter's futurists. The research demonstrates that the futurists' language cues signal futures-in-the-making, which help social media users anticipate their own scenarios and respond to them in the present. The fully open-sourced dataset, interactive analysis, and reproducible source code are available for further exploration.
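
To make the LDA side of the abstract concrete, the following is a minimal sketch of topic extraction with collapsed Gibbs sampling on a tiny invented corpus. It is not the authors' pipeline: the function name `lda_gibbs`, the hyperparameters `alpha` and `beta`, the iteration count, and the example tweets are all assumptions made for illustration. A real analysis would more likely rely on an established topic modeling library and the published dataset.

```python
import random
from collections import Counter


def lda_gibbs(docs, num_topics, iterations=200, alpha=0.1, beta=0.01, seed=0):
    """Fit LDA by collapsed Gibbs sampling (illustrative, hypothetical helper).

    docs is a list of token lists. Returns per-document topic proportions
    and the most frequent words assigned to each topic.
    """
    rng = random.Random(seed)
    vocab_size = len({word for doc in docs for word in doc})

    # Count tables: topics per document, words per topic, tokens per topic.
    doc_topic = [[0] * num_topics for _ in docs]
    topic_word = [Counter() for _ in range(num_topics)]
    topic_total = [0] * num_topics

    # Start from a random topic assignment for every token.
    assignments = []
    for d, doc in enumerate(docs):
        doc_assign = []
        for word in doc:
            k = rng.randrange(num_topics)
            doc_assign.append(k)
            doc_topic[d][k] += 1
            topic_word[k][word] += 1
            topic_total[k] += 1
        assignments.append(doc_assign)

    for _ in range(iterations):
        for d, doc in enumerate(docs):
            for i, word in enumerate(doc):
                # Remove the token's current assignment from the counts.
                k = assignments[d][i]
                doc_topic[d][k] -= 1
                topic_word[k][word] -= 1
                topic_total[k] -= 1

                # Unnormalised conditional probability of each topic.
                weights = [
                    (doc_topic[d][t] + alpha)
                    * (topic_word[t][word] + beta)
                    / (topic_total[t] + beta * vocab_size)
                    for t in range(num_topics)
                ]
                k = rng.choices(range(num_topics), weights=weights)[0]

                # Record the newly sampled assignment.
                assignments[d][i] = k
                doc_topic[d][k] += 1
                topic_word[k][word] += 1
                topic_total[k] += 1

    # Smooth and normalise the document-topic counts into proportions.
    doc_topic_dist = [
        [(count + alpha) / (len(doc) + alpha * num_topics) for count in counts]
        for counts, doc in zip(doc_topic, docs)
    ]
    # Keep up to three words per topic, skipping words with zero count.
    top_words = [
        [word for word, count in tw.most_common(3) if count > 0]
        for tw in topic_word
    ]
    return doc_topic_dist, top_words


# Invented toy "tweets"; the real dataset has over 1 million tweets.
tweets = [
    "ai robots automation jobs ai",
    "climate energy solar climate future",
    "robots ai jobs automation work",
    "solar energy climate carbon energy",
]
docs = [tweet.split() for tweet in tweets]
doc_topic_dist, top_words = lda_gibbs(docs, num_topics=2)

for topic_id, words in enumerate(top_words):
    print(f"topic {topic_id}: {words}")
```

The number of topics is fixed in advance here, as it is for LDA in general, whereas BERTopic derives topics by clustering document embeddings. That difference is one plausible reason the two approaches report such different topic counts.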
