CLOct 19, 2022
How Hate Speech Varies by Target Identity: A Computational AnalysisMichael Miller Yoder, Lynnette Hui Xian Ng, David West Brown et al. · cmu
This paper investigates how hate speech varies in systematic ways according to the identities it targets. Across multiple hate speech datasets annotated for targeted identities, we find that classifiers trained on hate speech targeting specific identity groups struggle to generalize to other targeted identities. This provides empirical evidence for differences in hate speech by target identity; we then investigate which patterns structure this variation. We find that the targeted demographic category (e.g. gender/sexuality or race/ethnicity) appears to have a greater effect on the language of hate speech than does the relative social power of the targeted identity group. We also find that words associated with hate speech targeting specific identities often relate to stereotypes, histories of oppression, current social movements, and other social contexts specific to identities. These experiments suggest the importance of considering targeted identity, as well as the social contexts associated with these identities, in automated hate speech classification.
CLJun 27, 2023
Identity Construction in a Misogynist Incels ForumMichael Miller Yoder, Chloe Perry, David West Brown et al. · cmu
Online communities of involuntary celibates (incels) are a prominent source of misogynist hate speech. In this paper, we use quantitative text and network analysis approaches to examine how identity groups are discussed on incels-dot-is, the largest black-pilled incels forum. We find that this community produces a wide range of novel identity terms and, while terms for women are most common, mentions of other minoritized identities are increasing. An analysis of the associations made with identity groups suggests an essentialist ideology where physical appearance, as well as gender and racial hierarchies, determine human value. We discuss implications for research into automated misogynist hate speech detection.
CLJun 27, 2023
A Weakly Supervised Classifier and Dataset of White Supremacist LanguageMichael Miller Yoder, Ahmad Diab, David West Brown et al. · cmu
We present a dataset and classifier for detecting the language of white supremacist extremism, a growing issue in online hate speech. Our weakly supervised classifier is trained on large datasets of text from explicitly white supremacist domains paired with neutral and anti-racist data from similar domains. We demonstrate that this approach improves generalization performance to new domains. Incorporating anti-racist texts as counterexamples to white supremacist language mitigates bias.
SIJun 1
The Structural Influence of Low-Credibility Narratives During the COVID-19 Vaccine RolloutLynnette Hui Xian Ng, Wenqi Zhou, Kathleen M. Carley
This work examines the structural influence of low-credibility narratives and the comparative role of automated accounts (bots) versus human users on social media platforms. To more accurately quantify the structural influence of a narrative on social media, this study proposes two novel metrics: (1) Appeal, which measures the network-weighted popularity of a message; and (2) Scope, which measures an author's message popularity-weighted network penetration. Applying these metrics, this study analyzes 5.8 million messages from X that contain low-credibility narratives regarding COVID-19 vaccine across three distinct temporal stages: Pre-Vaccine, Vaccine Launch, and Post-Launch. The results demonstrate that across all timeframes, human-distributed low-credibility narratives achieved higher structural influence compared to those generated by automated accounts. Furthermore, statistical analysis reveals a significant conditional temporal effect: human-driven low-credibility narratives attained their highest Appeal and Scope during the focal Vaccine Launch week, whereas automated accounts maximized their Appeal and Scope during the highly uncertain Pre-Vaccine period. These findings highlight the distinct operational capacities of automated and organic accounts, illustrating how the Appeal and Scope of low-credibility narratives is moderated by the lifecycle stages of critical public events.
SIMay 5
Automated versus Human Engagement: Mapping Cognitive Bias Triggers in Online DiscourseLynnette Hui Xian Ng, Wenqi Zhou, Kathleen M. Carley
In the digital environment, human attention is frequently guided by cognitive heuristics rather than deliberate evaluation. Since low-credibility narratives often lack substantive factual evidence, their diffusion disproportionally relies on activating these mental shortcut to simulate credibility and capture attention. This study presents a computational framework designed to detect computational triggers through observable data proxies for eight distinct cognitive biases across 3.5 million posts of contested COVID-19 narratives. We demonstrate that automated accounts (bots) embed these triggers more frequently than human users, yielding distinctly source-dependent associations with audience interaction. In bot-authored posts, affective and cognitive dissonance (stance-shifting) triggers are strongly associated with higher engagement, while the deployment of authority and availability (repetition) cues correlates with reduced audience interaction. Furthermore, we identify limits to heuristic compounding: positive engagement correlations with bot-authored content declines when multiple biases are stacked within a single post, whereas human-authored communication remains structurally resilient to high trigger density. By operationalizing psychological heuristics into scalable, measurable data, this work bridges computational social science and cognitive psychology to reveal how source identity (bot/human) shapes the mechanics of information diffusion in digital networks.
AIJan 15
Generative AI collective behavior needs an interactionist paradigmLaura Ferrarotti, Gian Maria Campedelli, Roberto Dessì et al.
In this article, we argue that understanding the collective behavior of agents based on large language models (LLMs) is an essential area of inquiry, with important implications in terms of risks and benefits, impacting us as a society at many levels. We claim that the distinctive nature of LLMs--namely, their initialization with extensive pre-trained knowledge and implicit social priors, together with their capability of adaptation through in-context learning--motivates the need for an interactionist paradigm consisting of alternative theoretical foundations, methodologies, and analytical tools, in order to systematically examine how prior knowledge and embedded values interact with social context to shape emergent phenomena in multi-agent generative AI systems. We propose and discuss four directions that we consider crucial for the development and deployment of LLM-based collectives, focusing on theory, methods, and trans-disciplinary dialogue.
LGNov 14, 2023
Purpose in the Machine: Do Traffic Simulators Produce Distributionally Equivalent Outcomes for Reinforcement Learning Applications?Rex Chen, Kathleen M. Carley, Fei Fang et al.
Traffic simulators are used to generate data for learning in intelligent transportation systems (ITSs). A key question is to what extent their modelling assumptions affect the capabilities of ITSs to adapt to various scenarios when deployed in the real world. This work focuses on two simulators commonly used to train reinforcement learning (RL) agents for traffic applications, CityFlow and SUMO. A controlled virtual experiment varying driver behavior and simulation scale finds evidence against distributional equivalence in RL-relevant measures from these simulators, with the root mean squared error and KL divergence being significantly greater than 0 for all assessed measures. While granular real-world validation generally remains infeasible, these findings suggest that traffic simulators are not a deus ex machina for RL training: understanding the impacts of inter-simulator differences is necessary to train and deploy RL-based ITSs.
MAMay 8
Social Theory Should Be a Structural Prior for Agentic AI: A Formal Framework for Multi-Agent Social SystemsLynnette Hui Xian Ng, Iain J. Cruickshank, Adrian Xuan Wei Lim et al.
Agentic AI systems are increasingly deployed not in isolation, but inside social environments populated by other agents and humans, such as in social media platforms, multi-agent LLM pipelines or autonomous robotics fleets. In these settings, system behavior emerges not from individual agents alone, but from the multi-agent interactions over time. Emergent dynamics of individuals in a social group have been long studied by social scientists in human contexts. \textbf{This position paper argues that agentic AI systems must be modeled with social theory as a structural prior, and formalizes a Multi-Agent Social Systems (MASS) framework for how agents interact and influence to generate system-level outcomes.} We represent MASS as a class of dynamical system of information generation, local influence and interaction structure, formulated by four structural priors anchored in social theory: strategic heterogeneity, networked-constrained dependence, co-evolution and distributional instability. We demonstrate the importance of each structural prior through formal propositions, and articulate a research agenda for how MASS should be modeled, evaluated and governed.
SIAug 1, 2025
Are LLM-Powered Social Media Bots Realistic?Lynnette Hui Xian Ng, Kathleen M. Carley
As Large Language Models (LLMs) become more sophisticated, there is a possibility to harness LLMs to power social media bots. This work investigates the realism of generating LLM-Powered social media bot networks. Through a combination of manual effort, network science and LLMs, we create synthetic bot agent personas, their tweets and their interactions, thereby simulating social media networks. We compare the generated networks against empirical bot/human data, observing that both network and linguistic properties of LLM-Powered Bots differ from Wild Bots/Humans. This has implications towards the detection and effectiveness of LLM-Powered Bots.
CYJan 1, 2025
What is a Social Media Bot? A Global Comparison of Bot and Human CharacteristicsLynnette Hui Xian Ng, Kathleen M. Carley
Chatter on social media is 20% bots and 80% humans. Chatter by bots and humans is consistently different: bots tend to use linguistic cues that can be easily automated while humans use cues that require dialogue understanding. Bots use words that match the identities they choose to present, while humans may send messages that are not related to the identities they present. Bots and humans differ in their communication structure: sampled bots have a star interaction structure, while sampled humans have a hierarchical structure. These conclusions are based on a large-scale analysis of social media tweets across ~200mil users across 7 events. Social media bots took the world by storm when social-cybersecurity researchers realized that social media users not only consisted of humans but also of artificial agents called bots. These bots wreck havoc online by spreading disinformation and manipulating narratives. Most research on bots are based on special-purposed definitions, mostly predicated on the event studied. This article first begins by asking, "What is a bot?", and we study the underlying principles of how bots are different from humans. We develop a first-principle definition of a social media bot. With this definition as a premise, we systematically compare characteristics between bots and humans across global events, and reflect on how the software-programmed bot is an Artificial Intelligent algorithm, and its potential for evolution as technology advances. Based on our results, we provide recommendations for the use and regulation of bots. Finally, we discuss open challenges and future directions: Detect, to systematically identify these automated and potentially evolving bots; Differentiate, to evaluate the goodness of the bot in terms of their content postings and relationship interactions; Disrupt, to moderate the impact of malicious bots.
LGFeb 19, 2025
Quantifying Memorization and Parametric Response Rates in Retrieval-Augmented Vision-Language ModelsPeter Carragher, Abhinand Jha, R Raghav et al.
Large Language Models (LLMs) demonstrate remarkable capabilities in question answering (QA), but metrics for assessing their reliance on memorization versus retrieval remain underdeveloped. Moreover, while finetuned models are state-of-the-art on closed-domain tasks, general-purpose models like GPT-4o exhibit strong zero-shot performance. This raises questions about the trade-offs between memorization, generalization, and retrieval. In this work, we analyze the extent to which multimodal retrieval-augmented VLMs memorize training data compared to baseline VLMs. Using the WebQA benchmark, we contrast finetuned models with baseline VLMs on multihop retrieval and question answering, examining the impact of finetuning on data memorization. To quantify memorization in end-to-end retrieval and QA systems, we propose several proxy metrics by investigating instances where QA succeeds despite retrieval failing. In line with existing work, we find that finetuned models rely more heavily on memorization than retrieval-augmented VLMs, and achieve higher accuracy as a result (72% vs 52% on WebQA test set). Finally, we present the first empirical comparison of the parametric effect between text and visual modalities. Here, we find that image-based questions have parametric response rates that are consistently 15-25% higher than for text-based questions in the WebQA dataset. As such, our measures pose a challenge for future work, both to account for differences in model memorization across different modalities and more generally to reconcile memorization and generalization in joint Retrieval-QA tasks.
CVFeb 19, 2025
SegSub: Evaluating Robustness to Knowledge Conflicts and Hallucinations in Vision-Language ModelsPeter Carragher, Nikitha Rao, Abhinand Jha et al.
Vision language models (VLM) demonstrate sophisticated multimodal reasoning yet are prone to hallucination when confronted with knowledge conflicts, impeding their deployment in information-sensitive contexts. While existing research addresses robustness in unimodal models, the multimodal domain lacks systematic investigation of cross-modal knowledge conflicts. This research introduces \segsub, a framework for applying targeted image perturbations to investigate VLM resilience against knowledge conflicts. Our analysis reveals distinct vulnerability patterns: while VLMs are robust to parametric conflicts (20% adherence rates), they exhibit significant weaknesses in identifying counterfactual conditions (<30% accuracy) and resolving source conflicts (<1% accuracy). Correlations between contextual richness and hallucination rate (r = -0.368, p = 0.003) reveal the kinds of images that are likely to cause hallucinations. Through targeted fine-tuning on our benchmark dataset, we demonstrate improvements in VLM knowledge conflict detection, establishing a foundation for developing hallucination-resilient multimodal systems in information-sensitive environments.
SIJun 17, 2024
Bridging Social Media and Search Engines: Dredge Words and the Detection of Unreliable DomainsEvan M. Williams, Peter Carragher, Kathleen M. Carley
Proactive content moderation requires platforms to rapidly and continuously evaluate the credibility of websites. Leveraging the direct and indirect paths users follow to unreliable websites, we develop a website credibility classification and discovery system that integrates both webgraph and large-scale social media contexts. We additionally introduce the concept of dredge words, terms or phrases for which unreliable domains rank highly on search engines, and provide the first exploration of their usage on social media. Our graph neural networks that combine webgraph and social media contexts generate to state-of-the-art results in website credibility classification and significantly improves the top-k identification of unreliable domains. Additionally, we release a novel dataset of dredge words, highlighting their strong connections to both social media and online commerce platforms.
CVMay 10, 2024
Multimodal LLMs Struggle with Basic Visual Network Analysis: a VNA BenchmarkEvan M. Williams, Kathleen M. Carley
We evaluate the zero-shot ability of GPT-4 and LLaVa to perform simple Visual Network Analysis (VNA) tasks on small-scale graphs. We evaluate the Vision Language Models (VLMs) on 5 tasks related to three foundational network science concepts: identifying nodes of maximal degree on a rendered graph, identifying whether signed triads are balanced or unbalanced, and counting components. The tasks are structured to be easy for a human who understands the underlying graph theoretic concepts, and can all be solved by counting the appropriate elements in graphs. We find that while GPT-4 consistently outperforms LLaVa, both models struggle with every visual network analysis task we propose. We publicly release the first benchmark for the evaluation of VLMs on foundational VNA tasks.
SIDec 15, 2021
Multi-modal Networks Reveal Patterns of Operational Similarity of Terrorist OrganizationsGian Maria Campedelli, Iain J. Cruickshank, Kathleen M. Carley
Capturing dynamics of operational similarity among terrorist groups is critical to provide actionable insights for counter-terrorism and intelligence monitoring. Yet, in spite of its theoretical and practical relevance, research addressing this problem is currently lacking. We tackle this problem proposing a novel computational framework for detecting clusters of terrorist groups sharing similar behaviors, focusing on groups' yearly repertoire of deployed tactics, attacked targets, and utilized weapons. Specifically considering those organizations that have plotted at least 50 attacks from 1997 to 2018, accounting for a total of 105 groups responsible for more than 42,000 events worldwide, we offer three sets of results. First, we show that over the years global terrorism has been characterized by increasing operational cohesiveness. Second, we highlight that year-to-year stability in co-clustering among groups has been particularly high from 2009 to 2018, indicating temporal consistency of similarity patterns in the last decade. Third, we demonstrate that operational similarity between two organizations is driven by three factors: (a) their overall activity; (b) the difference in the diversity of their operational repertoires; (c) the difference in a combined measure of diversity and activity. Groups' operational preferences, geographical homophily and ideological affinity have no consistent role in determining operational similarity.
CLNov 12, 2021
RATE: Overcoming Noise and Sparsity of Textual Features in Real-Time Location EstimationYu Zhang, Wei Wei, Binxuan Huang et al.
Real-time location inference of social media users is the fundamental of some spatial applications such as localized search and event detection. While tweet text is the most commonly used feature in location estimation, most of the prior works suffer from either the noise or the sparsity of textual features. In this paper, we aim to tackle these two problems. We use topic modeling as a building block to characterize the geographic topic variation and lexical variation so that "one-hot" encoding vectors will no longer be directly used. We also incorporate other features which can be extracted through the Twitter streaming API to overcome the noise problem. Experimental results show that our RATE algorithm outperforms several benchmark methods, both in the precision of region classification and the mean distance error of latitude and longitude regression.
SISep 2, 2021
Coordinating Narratives and the Capitol Riots on ParlerLynnette Hui Xian Ng, Iain Cruickshank, Kathleen M. Carley
Coordinated disinformation campaigns are used to influence social media users, potentially leading to offline violence. In this study, we introduce a general methodology to uncover coordinated messaging through analysis of user parleys on Parler. The proposed method constructs a user-to-user coordination network graph induced by a user-to-text graph and a text-to-text similarity graph. The text-to-text graph is constructed based on the textual similarity of Parler posts. We study three influential groups of users in the 6 January 2020 Capitol riots and detect networks of coordinated user clusters that are all posting similar textual content in support of different disinformation narratives related to the U.S. 2020 elections.
LGApr 21, 2021
Learning future terrorist targets through temporal meta-graphsGian Maria Campedelli, Mihovil Bartulovic, Kathleen M. Carley
In the last 20 years, terrorism has led to hundreds of thousands of deaths and massive economic, political, and humanitarian crises in several regions of the world. Using real-world data on attacks occurred in Afghanistan and Iraq from 2001 to 2018, we propose the use of temporal meta-graphs and deep learning to forecast future terrorist targets. Focusing on three event dimensions, i.e., employed weapons, deployed tactics and chosen targets, meta-graphs map the connections among temporally close attacks, capturing their operational similarities and dependencies. From these temporal meta-graphs, we derive 2-day-based time series that measure the centrality of each feature within each dimension over time. Formulating the problem in the context of the strategic behavior of terrorist actors, these multivariate temporal sequences are then utilized to learn what target types are at the highest risk of being chosen. The paper makes two contributions. First, it demonstrates that engineering the feature space via temporal meta-graphs produces richer knowledge than shallow time-series that only rely on frequency of feature occurrences. Second, the performed experiments reveal that bi-directional LSTM networks achieve superior forecasting performance compared to other algorithms, calling for future research aiming at fully discovering the potential of artificial intelligence to counter terrorist violence.
SIApr 2, 2021
The Coronavirus is a Bioweapon: Analysing Coronavirus Fact-Checked StoriesLynnette Hui Xian Ng, Kathleen M. Carley
The 2020 coronavirus pandemic has heightened the need to flag coronavirus-related misinformation, and fact-checking groups have taken to verifying misinformation on the Internet. We explore stories reported by fact-checking groups PolitiFact, Poynter and Snopes from January to June 2020, characterising them into six story clusters before then analyse time-series and story validity trends and the level of agreement across sites. We further break down the story clusters into more granular story types by proposing a unique automated method with a BERT classifier, which can be used to classify diverse story sources, in both fact-checked stories and tweets.
CLMar 12, 2021
A Weakly Supervised Approach for Classifying Stance in Twitter RepliesSumeet Kumar, Ramon Villa Cox, Matthew Babcock et al.
Conversations on social media (SM) are increasingly being used to investigate social issues on the web, such as online harassment and rumor spread. For such issues, a common thread of research uses adversarial reactions, e.g., replies pointing out factual inaccuracies in rumors. Though adversarial reactions are prevalent in online conversations, inferring those adverse views (or stance) from the text in replies is difficult and requires complex natural language processing (NLP) models. Moreover, conventional NLP models for stance mining need labeled data for supervised learning. Getting labeled conversations can itself be challenging as conversations can be on any topic, and topics change over time. These challenges make learning the stance a difficult NLP problem. In this research, we first create a new stance dataset comprised of three different topics by labeling both users' opinions on the topics (as in pro/con) and users' stance while replying to others' posts (as in favor/oppose). As we find limitations with supervised approaches, we propose a weakly-supervised approach to predict the stance in Twitter replies. Our novel method allows using a smaller number of hashtags to generate weak labels for Twitter replies. Compared to supervised learning, our method improves the mean F1-macro by 8\% on the hand-labeled dataset without using any hand-labeled examples in the training set. We further show the applicability of our proposed method on COVID 19 related conversations on Twitter.
SIAug 3, 2020
Characterizing Communities of Hashtag Usage on Twitter During the 2020 COVID-19 Pandemic by Multi-view ClusteringIain J. Cruickshank, Kathleen M. Carley
The COVID-19 pandemic has produced a flurry of online activity on social media sites. As such, analysis of social media data during the COVID-19 pandemic can produce unique insights into discussion topics and how those topics evolve over the course of the pandemic. In this study, we propose analyzing discussion topics on Twitter by clustering hashtags. In order to obtain high-quality clusters of the Twitter hashtags, we also propose a novel multi-view clustering technique that incorporates multiple different data types that can be used to describe how users interact with hashtags. The results of our multi-view clustering show that there are distinct temporal and topical trends present within COVID-19 twitter discussion. In particular, we find that some topical clusters of hashtags shift over the course of the pandemic, while others are persistent throughout, and that there are distinct temporal trends in hashtag usage. This study is the first to use multi-view clustering to analyze hashtags and the first analysis of the greater trends of discussion occurring online during the COVID-19 pandemic.
SIAug 3, 2020
Characterizing COVID-19 Misinformation Communities Using a Novel Twitter DatasetShahan Ali Memon, Kathleen M. Carley
From conspiracy theories to fake cures and fake treatments, COVID-19 has become a hot-bed for the spread of misinformation online. It is more important than ever to identify methods to debunk and correct false information online. In this paper, we present a methodology and analyses to characterize the two competing COVID-19 misinformation communities online: (i) misinformed users or users who are actively posting misinformation, and (ii) informed users or users who are actively spreading true information, or calling out misinformation. The goals of this study are two-fold: (i) collecting a diverse set of annotated COVID-19 Twitter dataset that can be used by the research community to conduct meaningful analysis; and (ii) characterizing the two target communities in terms of their network structure, linguistic patterns, and their membership in other communities. Our analyses show that COVID-19 misinformed communities are denser, and more organized than informed communities, with a possibility of a high volume of the misinformation being part of disinformation campaigns. Our analyses also suggest that a large majority of misinformed users may be anti-vaxxers. Finally, our sociolinguistic analyses suggest that COVID-19 informed users tend to use more narratives than misinformed users.
SIJul 15, 2020
Bot-Match: Social Bot Detection with Recursive Nearest Neighbors SearchDavid M. Beskow, Kathleen M. Carley
Social bots have emerged over the last decade, initially creating a nuisance while more recently used to intimidate journalists, sway electoral events, and aggravate existing social fissures. This social threat has spawned a bot detection algorithms race in which detection algorithms evolve in an attempt to keep up with increasingly sophisticated bot accounts. This cat and mouse cycle has illuminated the limitations of supervised machine learning algorithms, where researchers attempt to use yesterday's data to predict tomorrow's bots. This gap means that researchers, journalists, and analysts daily identify malicious bot accounts that are undetected by state of the art supervised bot detection algorithms. These analysts often desire to find similar bot accounts without labeling/training a new model, where similarity can be defined by content, network position, or both. A similarity based algorithm could complement existing supervised and unsupervised methods and fill this gap. To this end, we present the Bot-Match methodology in which we evaluate social media embeddings that enable a semi-supervised recursive nearest neighbors search to map an emerging social cybersecurity threat given one or more seed accounts.
SIJun 8, 2020
Characterizing Sociolinguistic Variation in the Competing Vaccination CommunitiesShahan Ali Memon, Aman Tyagi, David R. Mortensen et al.
Public health practitioners and policy makers grapple with the challenge of devising effective message-based interventions for debunking public health misinformation in cyber communities. "Framing" and "personalization" of the message is one of the key features for devising a persuasive messaging strategy. For an effective health communication, it is imperative to focus on "preference-based framing" where the preferences of the target sub-community are taken into consideration. To achieve that, it is important to understand and hence characterize the target sub-communities in terms of their social interactions. In the context of health-related misinformation, vaccination remains to be the most prevalent topic of discord. Hence, in this paper, we conduct a sociolinguistic analysis of the two competing vaccination communities on Twitter: "pro-vaxxers" or individuals who believe in the effectiveness of vaccinations, and "anti-vaxxers" or individuals who are opposed to vaccinations. Our data analysis show significant linguistic variation between the two communities in terms of their usage of linguistic intensifiers, pronouns, and uncertainty words. Our network-level analysis show significant differences between the two communities in terms of their network density, echo-chamberness, and the EI index. We hypothesize that these sociolinguistic differences can be used as proxies to characterize and understand these communities to devise better message interventions.
CLJun 1, 2020
Stance in Replies and Quotes (SRQ): A New Dataset For Learning Stance in Twitter ConversationsRamon Villa-Cox, Sumeet Kumar, Matthew Babcock et al.
Automated ways to extract stance (denying vs. supporting opinions) from conversations on social media are essential to advance opinion mining research. Recently, there is a renewed excitement in the field as we see new models attempting to improve the state-of-the-art. However, for training and evaluating the models, the datasets used are often small. Additionally, these small datasets have uneven class distributions, i.e., only a tiny fraction of the examples in the dataset have favoring or denying stances, and most other examples have no clear stance. Moreover, the existing datasets do not distinguish between the different types of conversations on social media (e.g., replying vs. quoting on Twitter). Because of this, models trained on one event do not generalize to other events. In the presented work, we create a new dataset by labeling stance in responses to posts on Twitter (both replies and quotes) on controversial issues. To the best of our knowledge, this is currently the largest human-labeled stance dataset for Twitter conversations with over 5200 stance labels. More importantly, we designed a tweet collection methodology that favors the selection of denial-type responses. This class is expected to be more useful in the identification of rumors and determining antagonistic relationships between users. Moreover, we include many baseline models for learning the stance in conversations and compare the performance of various models. We show that combining data from replies and quotes decreases the accuracy of models indicating that the two modalities behave differently when it comes to stance learning.
SIMay 20, 2020
A Computational Analysis of Polarization on Indian and Pakistani Social MediaAman Tyagi, Anjalie Field, Priyank Lathwal et al.
Between February 14, 2019 and March 4, 2019, a terrorist attack in Pulwama, Kashmir followed by retaliatory airstrikes led to rising tensions between India and Pakistan, two nuclear-armed countries. In this work, we examine polarizing messaging on Twitter during these events, particularly focusing on the positions of Indian and Pakistani politicians. We use a label propagation technique focused on hashtag co-occurrences to find polarizing tweets and users. Our analysis reveals that politicians in the ruling political party in India (BJP) used polarized hashtags and called for escalation of conflict more so than politicians from other parties. Our work offers the first analysis of how escalating tensions between India and Pakistan manifest on Twitter and provides a framework for studying polarizing messages.
SIMar 3, 2020
Discover Your Social Identity from What You Tweet: a Content Based ApproachBinxuan Huang, Kathleen M. Carley
An identity denotes the role an individual or a group plays in highly differentiated contemporary societies. In this paper, our goal is to classify Twitter users based on their role identities. We first collect a coarse-grained public figure dataset automatically, then manually label a more fine-grained identity dataset. We propose a hierarchical self-attention neural network for Twitter user role identity classification. Our experiments demonstrate that the proposed model significantly outperforms multiple baselines. We further propose a transfer learning scheme that improves our model's performance by a large margin. Such transfer learning also greatly reduces the need for a large amount of human labeled data.
SIOct 28, 2019
A Hierarchical Location Prediction Neural Network for Twitter User GeolocationBinxuan Huang, Kathleen M. Carley
Accurate estimation of user location is important for many online services. Previous neural network based methods largely ignore the hierarchical structure among locations. In this paper, we propose a hierarchical location prediction neural network for Twitter user geolocation. Our model first predicts the home country for a user, then uses the country result to guide the city-level prediction. In addition, we employ a character-aware word embedding layer to overcome the noisy information in tweets. With the feature fusion layer, our model can accommodate various feature combinations and achieves state-of-the-art results over three commonly used benchmarks under different feature settings. It not only improves the prediction accuracy but also greatly reduces the mean error distance.
LGOct 2, 2019
Graph-Hist: Graph Classification from Latent Feature Histograms With Application to Bot DetectionThomas Magelinski, David Beskow, Kathleen M. Carley
Neural networks are increasingly used for graph classification in a variety of contexts. Social media is a critical application area in this space, however the characteristics of social media graphs differ from those seen in most popular benchmark datasets. Social networks tend to be large and sparse, while benchmarks are small and dense. Classically, large and sparse networks are analyzed by studying the distribution of local properties. Inspired by this, we introduce Graph-Hist: an end-to-end architecture that extracts a graph's latent local features, bins nodes together along 1-D cross sections of the feature space, and classifies the graph based on this multi-channel histogram. We show that Graph-Hist improves state of the art performance on true social media benchmark datasets, while still performing well on other benchmarks. Finally, we demonstrate Graph-Hist's performance by conducting bot detection in social media. While sophisticated bot and cyborg accounts increasingly evade traditional detection methods, they leave artificial artifacts in their conversational graph that are detected through graph classification. We apply Graph-Hist to classify these conversational graphs. In the process, we confirm that social media graphs are different than most baselines and that Graph-Hist outperforms existing bot-detection models.
CLSep 13, 2019
Parameterized Convolutional Neural Networks for Aspect Level Sentiment ClassificationBinxuan Huang, Kathleen M. Carley
We introduce a novel parameterized convolutional neural network for aspect level sentiment classification. Using parameterized filters and parameterized gates, we incorporate aspect information into convolutional neural networks (CNN). Experiments demonstrate that our parameterized filters and parameterized gates effectively capture the aspect-specific features, and our CNN-based models achieve excellent results on SemEval 2014 datasets.
CLSep 5, 2019
Syntax-Aware Aspect Level Sentiment Classification with Graph Attention NetworksBinxuan Huang, Kathleen M. Carley
Aspect level sentiment classification aims to identify the sentiment expressed towards an aspect given a context sentence. Previous neural network based methods largely ignore the syntax structure in one sentence. In this paper, we propose a novel target-dependent graph attention network (TD-GAT) for aspect level sentiment classification, which explicitly utilizes the dependency relationship among words. Using the dependency graph, it propagates sentiment features directly from the syntactic context of an aspect target. In our experiments, we show our method outperforms multiple baselines with GloVe embeddings. We also demonstrate that using BERT representations further substantially boosts the performance.
SIAug 28, 2019
A Large-Scale Empirical Study of Geotagging Behavior on TwitterBinxuan Huang, Kathleen M. Carley
Geotagging on social media has become an important proxy for understanding people's mobility and social events. Research that uses geotags to infer public opinions relies on several key assumptions about the behavior of geotagged and non-geotagged users. However, these assumptions have not been fully validated. Lack of understanding the geotagging behavior prohibits people further utilizing it. In this paper, we present an empirical study of geotagging behavior on Twitter based on more than 40 billion tweets collected from 20 million users. There are three main findings that may challenge these common assumptions. Firstly, different groups of users have different geotagging preferences. For example, less than 3% of users speaking in Korean are geotagged, while more than 40% of users speaking in Indonesian use geotags. Secondly, users who report their locations in profiles are more likely to use geotags, which may affects the generability of those location prediction systems on non-geotagged users. Thirdly, strong homophily effect exists in users' geotagging behavior, that users tend to connect to friends with similar geotagging preferences.
LGApr 17, 2019
Residual or Gate? Towards Deeper Graph Neural Networks for Inductive Graph Representation LearningBinxuan Huang, Kathleen M. Carley
In this paper, we study the problem of node representation learning with graph neural networks. We present a graph neural network class named recurrent graph neural network (RGNN), that address the shortcomings of prior methods. By using recurrent units to capture the long-term dependency across layers, our methods can successfully identify important information during recursive neighborhood expansion. In our experiments, we show that our model class achieves state-of-the-art results on three benchmarks: the Pubmed, Reddit, and PPI network datasets. Our in-depth analyses also demonstrate that incorporating recurrent units is a simple yet effective method to prevent noisy information in graphs, which enables a deeper graph neural network.
CLApr 18, 2018
Aspect Level Sentiment Classification with Attention-over-Attention Neural NetworksBinxuan Huang, Yanglan Ou, Kathleen M. Carley
Aspect-level sentiment classification aims to identify the sentiment expressed towards some aspects given context sentences. In this paper, we introduce an attention-over-attention (AOA) neural network for aspect level sentiment classification. Our approach models aspects and sentences in a joint way and explicitly captures the interaction between aspects and context sentences. With the AOA module, our model jointly learns the representations for aspects and sentences, and automatically focuses on the important parts in sentences. Our experiments on laptop and restaurant datasets demonstrate our approach outperforms previous LSTM-based architectures.
AIFeb 23, 2017
A Probabilistic Framework for Location Inference from Social MediaYujie Qian, Jie Tang, Zhilin Yang et al.
We study the extent to which we can infer users' geographical locations from social media. Location inference from social media can benefit many applications, such as disaster management, targeted advertising, and news content tailoring. The challenges, however, lie in the limited amount of labeled data and the large scale of social networks. In this paper, we formalize the problem of inferring location from social media into a semi-supervised factor graph model (SSFGM). The model provides a probabilistic framework in which various sources of information (e.g., content and social network) can be combined together. We design a two-layer neural network to learn feature representations, and incorporate the learned latent features into SSFGM. To deal with the large-scale problem, we propose a Two-Chain Sampling (TCS) algorithm to learn SSFGM. The algorithm achieves a good trade-off between accuracy and efficiency. Experiments on Twitter and Weibo show that the proposed TCS algorithm for SSFGM can substantially improve the inference accuracy over several state-of-the-art methods. More importantly, TCS achieves over 100x speedup comparing with traditional propagation-based methods (e.g., loopy belief propagation).