Minh Duc Chu

SI
h-index: 17
7 papers
45 citations
Novelty: 54%
AI Score: 48

7 Papers

90.3 HC Jun 3
When Chatbots Accommodate: What AI Companions Optimize for in Vulnerable Conversations

Minh Duc Chu, Yifan Wu, Zhiyi Chen et al.

Millions turn to AI companion chatbots during loneliness, grief, and personal crises. How these companion platforms respond in such moments can shape the trajectory of a user's vulnerable state. Yet we lack tools to characterize what each platform actually does when users open up. Existing audits score reactions to pre-defined crisis prompts and miss the underlying decision policy that governs sustained interaction. We address these gaps with two key contributions. First, we introduce the AI Companion Vulnerability-Response Taxonomy, a paired taxonomy of user vulnerability and chatbot response designed for analyzing extended companion chatbot interactions. Second, we infer the response policy each platform follows across distinct vulnerability scenarios by applying Inverse Reinforcement Learning to ~48k turns of real-world user conversations with GPT-4.1, Character.AI, and Replika. Our findings reveal what AI companions prioritize in conversations with vulnerable users: GPT-4.1 reaches for advice, Character.AI spreads its response across different strategies without a dominant mode, and Replika consistently asks questions and stays present. Each, however, downweights the responses that introduce corrective friction: GPT-4.1 probes less as conversations continue and when interacting with psychologically high-risk users; Replika advises bonded users more and challenges them less; Character.AI shows no committed engagement strategy on internal distress. Estimated policies are invisible to output-level audits, providing a new lens for auditing chatbots in the wild and enabling more realistic safety evaluation.

CL Aug 18, 2024
Improving and Assessing the Fidelity of Large Language Models Alignment to Online Communities

Minh Duc Chu, Zihao He, Rebecca Dorn et al.

Large language models (LLMs) have shown promise in representing individuals and communities, offering new ways to study complex social dynamics. However, effectively aligning LLMs with specific human groups and systematically assessing the fidelity of the alignment remain a challenge. This paper presents a robust framework for aligning LLMs with online communities via instruction-tuning and comprehensively evaluating alignment across various aspects of language, including authenticity, emotional tone, toxicity, and harm. We demonstrate the utility of our approach by applying it to online communities centered on dieting and body image. We administer an eating disorder psychometric test to the aligned LLMs to reveal unhealthy beliefs and successfully differentiate communities with varying levels of eating disorder risk. Our results highlight the potential of LLMs in automated moderation and broader applications in public health and social science research.

SI Jul 4, 2024
Leveraging Machine Learning to Identify Gendered Stereotypes and Body Image Concerns on Diet and Fitness Online Forums

Minh Duc Chu, Cinthia Sánchez, Zihao He et al.

The pervasive expectations about ideal body types in Western society can lead to body image concerns, dissatisfaction, and in extreme cases, eating disorders and other psychopathologies related to body image. While previous research has focused on online pro-anorexia communities glorifying the "thin ideal," less attention has been given to the broader spectrum of body image concerns or how emerging disorders like muscle dysmorphia ("bigorexia") present on online platforms. To address this gap, we analyze 46 Reddit forums related to diet, fitness, and mental health. We map these communities along gender and body ideal dimensions, revealing distinct patterns of emotional expression and community support. Feminine-oriented communities, especially those endorsing the thin ideal, express higher levels of negative emotions and receive caring comments in response. In contrast, muscular ideal communities display less negativity, regardless of gender orientation, but receive aggressive compliments in response, marked by admiration and toxicity. Mental health discussions align more with thin ideal, feminine-leaning spaces. By uncovering these gendered emotional dynamics, our findings can inform the development of moderation strategies that foster supportive interactions while reducing exposure to harmful content.

49.8 SI Mar 23
Tied In on TikTok: Tie Strength and Emotional Dynamics in Algorithmic Communities

Charles Bickham, Minh Duc Chu, Arianna Yuan et al.

Whether genuine communities can form on algorithmically-driven short-form video platforms like TikTok remains an open question, given that user interactions are often brief, dispersed, and difficult to trace. Building on theories of tie strength and online community formation, we examine whether eating disorder (ED) discourse on TikTok exhibits behavioral and emotional signatures of strong ties, including more frequent, reciprocal, and affectively intense interactions. In this paper, we analyze 43,040 ED-related TikTok videos and over 560,000 comments, alongside a Non-ED comparison dataset. We find that at the user-pair level, greater interaction frequency is associated with increasingly positive emotional expression, a pattern that is amplified in ED-related conversations. This trend is also reflected linguistically, with pairs that interact more frequently exhibiting a more positive tone. At the same time, how a relationship starts matters: pairs that begin with positive exchanges usually stay mostly positive as they continue interacting, while pairs that begin negatively may add some positive exchanges over time but rarely become mostly positive. To contextualize these dynamics, we classify ED videos into three content types (Pro-Recovery, Pro-ED, and ED Experiences) and find that each exhibits distinct emotional interaction patterns. These findings suggest that dense, emotionally structured relationships can emerge within ED discourse on TikTok. More broadly, our work provides one of the first empirical demonstrations of how community-like relational dynamics form and persist on algorithmically driven short-form video platforms.

CV Jul 30, 2025
BigTokDetect: A Clinically-Informed Vision-Language Modeling Framework for Detecting Pro-Bigorexia Videos on TikTok

Minh Duc Chu, Kshitij Pawar, Zihao He et al.

Social media platforms increasingly struggle to detect harmful content that promotes muscle dysmorphic behaviors, particularly pro-bigorexia content that disproportionately affects adolescent males. Unlike traditional eating disorder detection focused on the "thin ideal," pro-bigorexia material masquerades as legitimate fitness content through complex multimodal combinations of visual displays, coded language, and motivational messaging that evade text-based detection systems. We address this challenge by developing BigTokDetect, a clinically-informed detection framework for identifying pro-bigorexia content on TikTok. We introduce BigTok, the first expert-annotated multimodal dataset of over 2,200 TikTok videos labeled by clinical psychologists and psychiatrists across five primary categories spanning body image, nutrition, exercise, supplements, and masculinity. Through a comprehensive evaluation of state-of-the-art vision language models, we achieve 82.9% accuracy on primary category classification and 69.0% on subcategory detection via domain-specific finetuning. Our ablation studies demonstrate that multimodal fusion improves performance by 5-10% over text-only approaches, with video features providing the most discriminative signals. These findings establish new benchmarks for multimodal harmful content detection and provide both the computational tools and methodological framework needed for scalable content moderation in specialized mental health domains.

CL Jun 17, 2024
COMMUNITY-CROSS-INSTRUCT: Unsupervised Instruction Generation for Aligning Large Language Models to Online Communities

Zihao He, Minh Duc Chu, Rebecca Dorn et al.

Social scientists use surveys to probe the opinions and beliefs of populations, but these methods are slow, costly, and prone to biases. Recent advances in large language models (LLMs) enable the creation of computational representations or "digital twins" of populations that generate human-like responses mimicking the population's language, styles, and attitudes. We introduce Community-Cross-Instruct, an unsupervised framework for aligning LLMs to online communities to elicit their beliefs. Given a corpus of a community's online discussions, Community-Cross-Instruct uses an advanced LLM to automatically generate instruction-output pairs to (1) finetune a foundational LLM to faithfully represent that community, and (2) evaluate the alignment of the finetuned model to the community. We demonstrate the method's utility in accurately representing political and diet communities on Reddit. Unlike prior methods requiring human-authored instructions, Community-Cross-Instruct generates instructions in a fully unsupervised manner, enhancing scalability and generalization across domains. This work enables cost-effective and automated surveying of diverse online communities.

SI Jan 17, 2024
Large Language Models Help Reveal Unhealthy Diet and Body Concerns in Online Eating Disorders Communities

Minh Duc Chu, Zihao He, Rebecca Dorn et al.

Eating disorders (ED), severe mental health conditions with high rates of mortality and morbidity, affect millions of people globally, especially adolescents. The proliferation of online communities that promote and normalize ED has been linked to this public health crisis. However, identifying harmful communities is challenging due to the use of coded language and other obfuscations. To address this challenge, we propose a novel framework to surface implicit attitudes of online communities by adapting large language models (LLMs) to the language of the community. We describe an alignment method and evaluate results along multiple dimensions of semantics and affect. We then use the community-aligned LLM to respond to psychometric questionnaires designed to identify ED in individuals. We demonstrate that LLMs can effectively adopt community-specific perspectives and reveal significant variations in eating disorder risks in different online communities. These findings highlight the utility of LLMs to reveal implicit attitudes and collective mindsets of communities, offering new tools for mitigating harmful content on social media.