Breaking the Communities: Characterizing community changing users using text mining and graph machine learning on Twitter
This work addresses political polarization and echo chambers on social media, offering a method to identify users who bridge ideological divides, though it is incremental in applying existing NLP and graph ML techniques to this domain.
The paper tackles the problem of ideological isolation on social media by characterizing users who switch between polarized communities on Twitter. It finds that these 'community breakers' have low PageRank values, suggesting that their messages receive little response in their original communities.
Although the Internet and social media have increased the amount of news and information people can consume, most users are exposed only to content that reinforces their positions and isolates them from other ideological communities. This environment has real, high-impact consequences, including severe political polarization, the easy spread of fake news, political extremism, hate groups, and a lack of enriching debate. Encouraging conversations between different groups of users and breaking closed communities is therefore important for healthy societies. In this paper, we characterize and study users who break out of their community on Twitter, using natural language processing techniques and graph machine learning algorithms. In particular, we collected 9 million Twitter messages from 1.5 million users and constructed the retweet networks. We identified their communities and the topics of discussion associated with them. With this data, we present a machine learning framework for classifying social media users that detects "community breakers", i.e., users who swing from their closed community to another one. A feature importance analysis on three polarized political Twitter datasets showed that these users have low PageRank values, suggesting that they change communities because their messages receive no response in their own. This methodology also allowed us to identify their specific topics of interest, providing a full characterization of this kind of user.
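To make the PageRank signal concrete, the following is a minimal sketch rather than the authors' pipeline. It assumes a retweet edge (u, v) means "u retweeted v", that community labels for two time windows are already available (the abstract does not say how communities are detected or how a switch is defined), and that a community breaker is simply a user whose label differs between the windows. The user names, labels, and the helper names `pagerank` and `community_breakers` are hypothetical.

```python
from collections import defaultdict


def pagerank(edges, damping=0.85, iterations=100):
    """Power-iteration PageRank on a directed graph given as (source, target) pairs.

    Assumption: an edge (u, v) means "u retweeted v", so rank flows toward
    users whose messages get retweeted.
    """
    nodes = sorted({node for edge in edges for node in edge})
    out_links = defaultdict(list)
    for src, dst in edges:
        out_links[src].append(dst)

    n = len(nodes)
    rank = {node: 1.0 / n for node in nodes}
    for _ in range(iterations):
        # Rank held by users with no outgoing edges is spread uniformly.
        dangling = sum(rank[node] for node in nodes if not out_links[node])
        base = (1.0 - damping) / n + damping * dangling / n
        new_rank = {node: base for node in nodes}
        for src in nodes:
            targets = out_links[src]
            for dst in targets:
                new_rank[dst] += damping * rank[src] / len(targets)
        rank = new_rank
    return rank


def community_breakers(before, after):
    """Users present in both windows whose community label changed (assumed definition)."""
    return {user for user in before if user in after and before[user] != after[user]}


# Toy retweet network (hypothetical users): dave retweets alice but nobody retweets dave.
edges = [("bob", "alice"), ("carol", "alice"), ("dave", "alice"), ("frank", "erin")]

# Hypothetical community labels in two consecutive time windows.
before = {"alice": "A", "bob": "A", "carol": "A", "dave": "A", "erin": "B", "frank": "B"}
after = {"alice": "A", "bob": "A", "carol": "A", "dave": "B", "erin": "B", "frank": "B"}

ranks = pagerank(edges)
breakers = community_breakers(before, after)

for user in sorted(ranks, key=ranks.get, reverse=True):
    flag = " (community breaker)" if user in breakers else ""
    print(f"{user}: {ranks[user]:.3f}{flag}")
```

In this toy network the only breaker, dave, is never retweeted and so sits at the low end of the PageRank ordering, which mirrors the pattern the abstract reports. The paper's framework goes further by feeding such features to a classifier and analyzing feature importance, which this sketch does not attempt.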