MM Aug 16, 2024
Detecting Misinformation in Multimedia Content through Cross-Modal Entity Consistency: A Dual Learning Approach
Zhe Fu, Kanlun Wang, Wangjiaxuan Xin et al.
Social media content has expanded from text to multimodal formats, and this shift makes misinformation considerably harder to combat. Previous research has primarily focused on single modalities or text-image combinations, leaving a gap in detecting multimodal misinformation. While the concept of entity consistency holds promise in detecting multimodal misinformation, simplifying the representation to a scalar value overlooks the inherent complexities of high-dimensional representations across different modalities. To address these limitations, we propose a Multimedia Misinformation Detection (MultiMD) framework for detecting misinformation from video content by leveraging cross-modal entity consistency. The proposed dual learning approach not only enhances misinformation detection performance but also improves representation learning of entity consistency across different modalities. Our results demonstrate that MultiMD outperforms state-of-the-art baseline models and underscore the importance of each modality in misinformation detection. Our research provides novel methodological and technical insights into multimodal misinformation detection.
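The point about scalar consistency scores can be illustrated with a toy comparison. The sketch below is not the MultiMD model: the entity names, the fixed vocabulary, and the `agreement_vector` encoding are all hypothetical, chosen only to show that two different kinds of cross-modal mismatch can collapse to the same scalar score while a per-entity representation keeps them apart.

```python
from typing import List, Set

def jaccard(a: Set[str], b: Set[str]) -> float:
    """Scalar consistency: overlap of the entity sets from two modalities."""
    union = a | b
    return len(a & b) / len(union) if union else 1.0

def agreement_vector(a: Set[str], b: Set[str], vocab: List[str]) -> List[int]:
    """Per-entity code: 1 = in both modalities, -1 = in exactly one, 0 = in neither."""
    out = []
    for entity in vocab:
        in_a, in_b = entity in a, entity in b
        out.append(1 if in_a and in_b else -1 if in_a != in_b else 0)
    return out

# Hypothetical entity vocabulary and modality outputs (text vs. visual).
vocab = ["person_A", "person_B", "city_X", "city_Y"]
# Case 1: the modalities agree on the person but disagree on the place.
case1 = ({"person_A", "city_X"}, {"person_A", "city_Y"})
# Case 2: the modalities agree on the place but disagree on the person.
case2 = ({"person_A", "city_X"}, {"person_B", "city_X"})

for text_entities, visual_entities in (case1, case2):
    print(jaccard(text_entities, visual_entities),
          agreement_vector(text_entities, visual_entities, vocab))
```

Both cases yield the same Jaccard score, yet their vectors differ, which is the kind of information a scalar summary discards.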
NA Apr 21
Toward Practical Forecasts of Public Sentiments via Convexification for Mean Field Games: Evidence from Real World COVID-19 Discussion Data
Shi Chen, Michael V. Klibanov, Kevin McGoff et al.
We apply a convexification-based numerical method to forecast public sentiment dynamics using Mean Field Games (MFGs). The theoretical foundation for the convexification approach, established in our prior work, guarantees global convergence to the unique solution of the MFG system. The present work demonstrates the practical potential of this framework using real-world sentiment data extracted from public discussion on social media during the COVID-19 pandemic. The results show that the MFG model with appropriate parameters and convexification yields sentiment density predictions that align closely with observed data and satisfy the governing equations. While current parameter selection relies on manual calibration, our findings establish the first proof-of-concept evidence that MFG models can capture complex temporal patterns in public sentiment, laying the groundwork for future work on systematic parameter identification methods, i.e., solutions of coefficient inverse problems for the MFG system.
SI Aug 21, 2024
Let Community Rules Be Reflected in Online Content Moderation
Wangjiaxuan Xin, Kanlun Wang, Zhe Fu et al.
Content moderation is a widely used strategy to prevent the dissemination of irregular information on social media platforms. Despite extensive research on developing automated models to support decision-making in content moderation, few studies integrate the rules of online communities into content moderation. This study addresses this gap by proposing a community rule-based content moderation framework that directly integrates community rules into the moderation of user-generated content. Our experimental results on datasets collected from two domains show that models based on the framework outperform baseline models across all evaluation metrics. In particular, incorporating community rules substantially enhances model performance in content moderation. These findings have significant research and practical implications for improving the effectiveness and generalizability of content moderation models in online communities.
SI Sep 15, 2025
Digital Voices of Survival: From Social Media Disclosures to Support Provisions for Domestic Violence Victims
Kanlun Wang, Zhe Fu, Wangjiaxuan Xin et al.
Domestic Violence (DV) is a pervasive public health problem characterized by patterns of coercive and abusive behavior within intimate relationships. With the rise of social media as a key outlet for DV victims to disclose their experiences, online self-disclosure has emerged as a critical yet underexplored avenue for support-seeking. In addition, existing research lacks a comprehensive and nuanced understanding of DV self-disclosure, support provisions, and their connections. To address these gaps, this study proposes a novel computational framework for modeling DV support-seeking behavior alongside community support mechanisms. The framework consists of four key components: self-disclosure detection, post clustering, topic summarization, and support extraction and mapping. We implement and evaluate the framework with data collected from relevant social media communities. Our findings not only advance existing knowledge on DV self-disclosure and online support provisions but also enable victim-centered digital interventions.
CL Nov 24, 2025
Empathetic Cascading Networks: A Multi-Stage Prompting Technique for Reducing Social Biases in Large Language Models
Wangjiaxuan Xin
This report presents the Empathetic Cascading Networks (ECN) framework, a multi-stage prompting method designed to enhance the empathetic and inclusive capabilities of large language models. ECN employs four stages (Perspective Adoption, Emotional Resonance, Reflective Understanding, and Integrative Synthesis) to guide models toward generating emotionally resonant and contextually aware responses. Experimental results demonstrate that ECN achieves the highest Empathy Quotient (EQ) scores across GPT-3.5-turbo and GPT-4, while maintaining competitive Regard and Perplexity metrics. These findings emphasize ECN's potential for applications requiring empathy and inclusivity in conversational AI.
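A multi-stage prompt cascade of this kind might be wired up as in the sketch below. Only the four stage names come from the abstract. The instruction wording, the prompt layout, the `run_cascade` helper, and the `stub_llm` placeholder are assumptions made for illustration, not the report's actual prompts or code. The model is passed in as a plain callable so the sketch runs without any external API.

```python
from typing import Callable, List, Tuple

# Stage names come from the abstract; the instruction wording is hypothetical.
STAGES: List[Tuple[str, str]] = [
    ("Perspective Adoption", "Describe the situation from the user's point of view."),
    ("Emotional Resonance", "Name the emotions the user is likely feeling."),
    ("Reflective Understanding", "Reflect on what the user needs, given the notes so far."),
    ("Integrative Synthesis", "Write a final reply that draws on all the notes above."),
]

def run_cascade(user_message: str, llm: Callable[[str], str]) -> List[Tuple[str, str]]:
    """Run the stages in order, showing each stage the outputs of earlier stages."""
    transcript: List[Tuple[str, str]] = []
    for name, instruction in STAGES:
        notes = "\n".join(f"[{n}] {out}" for n, out in transcript)
        prompt = (
            f"User message: {user_message}\n"
            f"{notes}\n"
            f"Stage: {name}\n"
            f"Task: {instruction}"
        )
        transcript.append((name, llm(prompt)))
    return transcript

def stub_llm(prompt: str) -> str:
    """Placeholder model: reports how many earlier stage notes it was shown."""
    return f"saw {prompt.count('[')} prior notes"

results = run_cascade("I just lost my job.", stub_llm)
for stage_name, output in results:
    print(f"{stage_name}: {output}")
```

Replacing `stub_llm` with a function that calls a real chat model would turn the last stage's output into the final response. Whether earlier notes are passed verbatim, as here, or summarized first is a design choice the abstract does not specify.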
CL Oct 21, 2025
Improving Topic Modeling of Social Media Short Texts with Rephrasing: A Case Study of COVID-19 Related Tweets
Wangjiaxuan Xin, Shuhua Yin, Shi Chen et al.
Social media platforms such as Twitter (now X) provide rich data for analyzing public discourse, especially during crises such as the COVID-19 pandemic. However, the brevity, informality, and noise of social media short texts often hinder the effectiveness of traditional topic modeling, producing incoherent or redundant topics that are difficult to interpret. To address these challenges, we have developed TM-Rephrase, a model-agnostic framework that leverages large language models (LLMs) to rephrase raw tweets into more standardized and formal language prior to topic modeling. Using a dataset of 25,027 COVID-19-related Twitter posts, we investigate the effects of two rephrasing strategies, general rephrasing and colloquial-to-formal rephrasing, on multiple topic modeling methods. Results demonstrate that TM-Rephrase improves three metrics measuring topic modeling performance (i.e., topic coherence, topic uniqueness, and topic diversity) while reducing topic redundancy for most topic modeling algorithms. The colloquial-to-formal strategy yields the greatest performance gains, especially for the Latent Dirichlet Allocation (LDA) algorithm. This study contributes a model-agnostic approach to enhancing topic modeling in public health related social media analysis, with broad implications for improved understanding of public discourse in health crises as well as other important domains.
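The rephrase-then-model idea can be sketched as a small pipeline in which both the rephraser and the topic model are swappable callables. Everything named here is an assumption for illustration: `rephrase_then_model`, the dictionary-based `toy_rephrase` (a stand-in for an LLM call), and the hand-written topic lists. The `topic_diversity` function uses one common definition, the share of unique words among all topics' top words. The abstract does not give the paper's exact metric formulas, so this may differ from the one the authors used.

```python
from typing import Callable, List, Sequence

def rephrase_then_model(
    tweets: Sequence[str],
    rephrase: Callable[[str], str],
    fit_topics: Callable[[List[str]], List[List[str]]],
) -> List[List[str]]:
    """Rephrase every text first, then hand the cleaned texts to any topic model."""
    cleaned = [rephrase(t) for t in tweets]
    return fit_topics(cleaned)

def topic_diversity(topics: Sequence[Sequence[str]]) -> float:
    """Share of unique words among the top words of all topics (1.0 = no overlap)."""
    words = [w for topic in topics for w in topic]
    return len(set(words)) / len(words) if words else 0.0

# Hypothetical stand-in for an LLM rephraser: expands a few colloquial tokens.
SLANG = {"u": "you", "ppl": "people", "vax": "vaccine"}

def toy_rephrase(text: str) -> str:
    return " ".join(SLANG.get(tok, tok) for tok in text.lower().split())

print(toy_rephrase("ppl need the vax"))

# Made-up top-word lists, used only to exercise the metric.
overlapping = [["covid", "vaccine", "mask"], ["covid", "vaccine", "lockdown"]]
distinct = [["covid", "vaccine", "mask"], ["school", "lockdown", "remote"]]
print(topic_diversity(overlapping), topic_diversity(distinct))
```

Because `fit_topics` is just a callable that maps texts to top-word lists, the same wrapper can sit in front of LDA or any other topic model, which is what model-agnostic means in this setting.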