SI AI IR · Feb 4, 2021

High-level Approaches to Detect Malicious Political Activity on Twitter

arXiv:2102.04293v11.22 citations

Originality: Incremental advance

AI Analysis

This work addresses the problem of detecting malicious political activity on Twitter, which is a significant challenge for platform integrity and democratic processes, specifically focusing on the Portuguese context.

This paper investigates malicious political activity on Twitter, particularly within the Portuguese Twittersphere, by analyzing a dataset of 5 million accounts and over 120 million tweets from August 2019 to May 2020. The authors found that Twitter's suspension patterns are inadequate for the political trolling and fake news identified, and that malicious accounts are very similar to each other but distinct from regular accounts in both content and interaction.

Our work represents another step towards the detection and prevention of these ever more present political manipulation efforts. We therefore start by focusing on understanding what the state-of-the-art approaches lack -- since the problem remains, this is a fair assumption. We find concerning issues within the current literature and follow a diverging path, notably by placing emphasis on data features that are less susceptible to malicious manipulation and by looking for high-level approaches that avoid a granularity level biased towards easy-to-spot, low-impact cases. We designed and implemented a framework -- Twitter Watch -- that performs structured Twitter data collection, and applied it to the Portuguese Twittersphere. We investigate a data snapshot taken in May 2020, with around 5 million accounts and over 120 million tweets (this value has since increased to over 175 million). The analyzed time period stretches from August 2019 to May 2020, with a focus on the Portuguese elections of October 6th, 2019. However, the Covid-19 pandemic showed itself in our data, and we also delve into how it affected typical Twitter behavior. We pursued three main approaches: content-oriented, metadata-oriented, and network interaction-oriented. We learn that Twitter's suspension patterns are not adequate for the type of political trolling found in the Portuguese Twittersphere -- identified by this work and by an independent peer -- nor for accounts posting fake news. We also found, through two distinct analyses, that the different types of malicious accounts we independently gathered are very similar to each other in terms of both content and interaction, and at the same time very distinct from regular accounts.
