Retweet-BERT: Political Leaning Detection Using Language Features and Information Diffusion on Social Networks
This paper addresses the challenge of detecting political leaning in social media analysis, though the contribution is incremental: it builds on existing methods using network and language features.
The paper tackles the problem of estimating the political leanings of Twitter users by introducing Retweet-BERT, a model that uses the retweet network and profile language and achieves 96%-97% macro-F1 on a COVID-19 dataset and a 2020 U.S. presidential election dataset.
Estimating the political leanings of social media users is a challenging and ever more pressing problem given the increase in social media consumption. We introduce Retweet-BERT, a simple and scalable model to estimate the political leanings of Twitter users. Retweet-BERT leverages the retweet network structure and the language used in users' profile descriptions. Our assumptions stem from patterns of network and linguistic homophily among people who share similar ideologies. Retweet-BERT demonstrates competitive performance against other state-of-the-art baselines, achieving 96%-97% macro-F1 on two recent Twitter datasets (a COVID-19 dataset and a 2020 United States presidential election dataset). We also perform manual validation to assess the performance of Retweet-BERT on users not in the training data. Finally, in a case study of COVID-19, we illustrate the presence of political echo chambers on Twitter and show that they exist primarily among right-leaning users. Our code is open-sourced and our data is publicly available.
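For readers unfamiliar with the reported metric, the following is a minimal sketch of how macro-F1 is computed: the F1 score is calculated separately for each class and then averaged with equal weight, so a minority class counts as much as a majority class. The function name `macro_f1` and the "left"/"right" labels are illustrative assumptions, not taken from the authors' released code.

```python
def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1 scores (illustrative helper)."""
    labels = sorted(set(y_true) | set(y_pred))
    f1_scores = []
    for label in labels:
        pairs = list(zip(y_true, y_pred))
        tp = sum(1 for t, p in pairs if t == label and p == label)
        fp = sum(1 for t, p in pairs if t != label and p == label)
        fn = sum(1 for t, p in pairs if t == label and p != label)
        # Guard against empty denominators when a class is never predicted
        # or never present in the ground truth.
        precision = tp / (tp + fp) if (tp + fp) else 0.0
        recall = tp / (tp + fn) if (tp + fn) else 0.0
        if precision + recall:
            f1 = 2 * precision * recall / (precision + recall)
        else:
            f1 = 0.0
        f1_scores.append(f1)
    return sum(f1_scores) / len(f1_scores)


# Hypothetical labels for four users, for illustration only.
y_true = ["left", "left", "right", "right"]
y_pred = ["left", "right", "right", "right"]
score = macro_f1(y_true, y_pred)
```

In this toy example the "left" class has F1 = 2/3 and the "right" class has F1 = 0.8, so the macro-F1 is their plain average.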