Probabilistic Model of Narratives Over Topical Trends in Social Media: A Discrete Time Model
This work addresses the need for systematic narrative extraction from social media, which helps communicate the main events in domains such as disinformation. The contribution is incremental: it builds on existing topic modeling and summarization techniques.
The authors extract narrative summaries from social media data by combining a probabilistic topic model that uses a categorical time distribution with extractive summarization. Applied to more than one million tweets on disinformation campaigns against the White Helmets of Syria, the framework identified topical trends and extracted narrative summaries.
Online social media platforms are becoming the prime source of news and narratives about worldwide events. However, a systematic summarization-based narrative extraction method that can facilitate communicating the main underlying events is lacking. To address this issue, we propose a novel event-based narrative summary extraction framework. Our proposed framework is designed as a probabilistic topic model with a categorical time distribution, followed by extractive text summarization. Our topic model identifies the recurrence of topics over time at a varying time resolution. The framework not only captures the topic distributions in the data, but also approximates the fluctuations of user activity over time. Furthermore, we define the significance-dispersity trade-off (SDT) as a comparison measure to identify the topic with the highest lifetime attractiveness in a timestamped corpus. We evaluate our model on a large corpus of Twitter data, comprising more than one million tweets in the domain of the disinformation campaigns conducted against the White Helmets of Syria. Our results indicate that the proposed framework is effective in identifying topical trends, as well as in extracting narrative summaries from timestamped text corpora.
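
To make the idea of a categorical time distribution concrete, the following minimal sketch discretizes timestamps into fixed-width bins and estimates, for each topic, a smoothed categorical distribution over those bins. It is only an illustration under stated assumptions, not the inference procedure of the proposed model: it assumes that topic assignments are already available, and the names `time_bin`, `topic_time_distribution`, `bin_days`, and `alpha` are hypothetical choices made for this example.

```python
from collections import Counter
from datetime import datetime


def time_bin(ts: datetime, origin: datetime, bin_days: int) -> int:
    """Map a timestamp to a discrete bin index (bin width given in days)."""
    return (ts - origin).days // bin_days


def topic_time_distribution(docs, origin, bin_days, n_bins, alpha=1.0):
    """Estimate a categorical distribution over time bins for each topic.

    docs   : iterable of (topic_id, timestamp) pairs; the topic assignments
             are assumed to be given (hypothetical input for this sketch).
    alpha  : symmetric additive smoothing constant (an assumption here).
    Returns a dict mapping topic_id to a list of n_bins probabilities.
    """
    counts = {}
    for topic, ts in docs:
        b = time_bin(ts, origin, bin_days)
        if not 0 <= b < n_bins:
            raise ValueError(f"timestamp {ts} falls outside the binned range")
        counts.setdefault(topic, Counter())[b] += 1

    dist = {}
    for topic, c in counts.items():
        total = sum(c.values()) + alpha * n_bins
        dist[topic] = [(c[b] + alpha) / total for b in range(n_bins)]
    return dist


if __name__ == "__main__":
    origin = datetime(2018, 1, 1)
    docs = [
        (0, datetime(2018, 1, 2)),
        (0, datetime(2018, 1, 5)),
        (0, datetime(2018, 1, 16)),
        (1, datetime(2018, 1, 9)),
    ]
    # Weekly bins over three weeks.
    for topic, probs in topic_time_distribution(docs, origin, 7, 3).items():
        print(topic, probs)
```

Changing `bin_days` changes the time resolution of the sketch; a topic whose probability mass concentrates in a few bins corresponds to a short-lived burst, while mass spread over many bins corresponds to a recurring topic.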