HC Aug 9, 2023
Comparing How a Chatbot References User Utterances from Previous Chatting Sessions: An Investigation of Users' Privacy Concerns and Perceptions
Samuel Rhys Cox, Yi-Chieh Lee, Wei Tsang Ooi
Chatbots are capable of remembering and referencing previous conversations, but does this enhance user engagement or infringe on privacy? To explore this trade-off, we investigated the format in which a chatbot references previous conversations with a user and its effects on the user's perceptions and privacy concerns. In a three-week longitudinal between-subjects study, 169 participants talked about their dental flossing habits to a chatbot that either (1-None) did not explicitly reference previous user utterances, (2-Verbatim) referenced previous utterances verbatim, or (3-Paraphrase) used paraphrases to reference previous utterances. Participants perceived the Verbatim and Paraphrase chatbots as more intelligent and engaging. However, the Verbatim chatbot also raised privacy concerns with participants. To gain insights into why people preferred certain conditions or had privacy concerns, we conducted semi-structured interviews with 15 participants. We discuss implications from our findings that can help designers choose an appropriate format to reference previous user utterances and inform the design of longitudinal dialogue scripting.
HC Jan 26, 2025
The Dark Side of AI Companionship: A Taxonomy of Harmful Algorithmic Behaviors in Human-AI Relationships
Renwen Zhang, Han Li, Han Meng et al.
As conversational AI systems increasingly permeate the socio-emotional realms of human life, they bring both benefits and risks to individuals and society. Despite extensive research on detecting and categorizing harms in AI systems, less is known about the harms that arise from social interactions with AI chatbots. Through a mixed-methods analysis of 35,390 conversation excerpts shared on r/replika, an online community for users of the AI companion Replika, we identified six categories of harmful behaviors exhibited by the chatbot: relational transgression, verbal abuse and hate, self-inflicted harm, harassment and violence, mis/disinformation, and privacy violations. The AI contributes to these harms through four distinct roles: perpetrator, instigator, facilitator, and enabler. Our findings highlight the relational harms of AI chatbots and the danger of algorithmic compliance, enhancing the understanding of AI harms in socio-emotional interactions. We also provide suggestions for designing ethical and responsible AI systems that prioritize user safety and well-being.
HC May 7
Designing with Tensions: Older Adults' Emotional Support-Seeking Under System-Level Constraints in Conversational AI
Mengqi Shi, Tianqi Song, Zicheng Zhu et al.
Older adults have increasingly turned to conversational AI as a source of emotional support. However, little is known about how emotionally supportive interactions are experienced in everyday use, particularly when AI systems limit, redirect, or intervene during these interactions. We interviewed 18 older adults about their experiences using conversational AI for emotional support, examining when they turn to AI, how they engage during emotionally vulnerable moments, and how they respond when support feels disrupted. Our findings show that older adults often rely on AI when other forms of social support feel inaccessible. However, current safety-related interventions can redirect interactions in ways that participants experience as interruptions to emotional engagement or as shifts in control away from them. Such disruptions can undermine older adults' ability to remain emotionally engaged and, in some cases, contribute to emotional distress. We discuss design implications for emotionally supportive conversational AI, emphasizing the need for safety interventions that are enacted within older adults' social contexts, align with users' emotional pacing, and preserve their sense of agency.
HC Apr 20
Alleviating Linguistic and Interactional Anxiety of Non-Native Speakers in Multilingual Communication
Peinuan Qin, Justin Peng, Zhengtao Xu et al.
Non-native speakers (NNSs) frequently encounter speaking difficulties in multilingual communication, where existing approaches have shown promise in facilitating NNSs' comprehension and participation in real-time communication. However, they often overlook providing direct speaking support, where anxiety stemming from linguistic inadequacy and uncertain communication dynamics are core issues. To address this, we introduce an AI tool with translation for real-time speaking support. It also builds a channel for mutual understanding with native speakers (NSs) to mitigate interactional anxiety. Through a within-subjects experiment involving 25 NNS-NS pairs (N = 50) on collaborative tasks, our findings suggest that the tool improved NNSs' speaking self-efficacy, reduced their interactional anxiety, and decreased their workload, particularly for NNSs with below-average language proficiency. Furthermore, NNSs reported a significant sense of support from their NS partners via the mutual understanding channel, and NSs also clearly perceived the NNSs' need for assistance and displayed a strong sense of communicative responsibility. This research underscores the potential of AI support in real-time NNS communication and the importance of promoting mutual understanding, culminating in actionable design insights for future work.
HC Apr 20
Leveraging AI for Direct Bystander Intervention Against Cyberbullying
Peinuan Qin, Jiting Cheng, Jungup Lee et al.
Cyberbullying is a pervasive problem in online environments, causing substantial psychological harm to victims. Although bystander intervention has proven effective in mitigating its impact, motivating bystanders to engage in direct intervention remains a persistent challenge. Studies have suggested that difficulties in intervention skills and defending self-efficacy hinder bystanders from initiating direct intervention. To address this challenge, we introduced EmojiGen, an AI intervention tool designed to empower bystanders for direct intervention. EmojiGen enabled users to simply select an emoji as an intention clue, which subsequently combined the cyberbullying context to generate responses. In a between-subjects experiment involving 90 participants on a custom-built social media platform, we found that EmojiGen significantly increased the frequency of direct bystander interventions, both in supporting victims and in confronting perpetrators, driven by different factors. EmojiGen also increased the sense of knowing how to help and defending self-efficacy, while reducing perceived workload and anxiety associated with initiating intervention. The study contributed to the CSCW community through offering an effective direct bystander intervention method and providing design implications for future cyberbullying interventions.
HC Apr 7
Navigating Marginalization: Toward Justice-Oriented Socio-Technical Design for Parent-Child Learning among Southeast Asian Immigrant Mothers in Taiwan
Ying-Yu Chen, Yang Hong, Yan-Rong Chen et al.
This study investigates how Southeast Asian (SEA) immigrant mothers in Taiwan participate in their children's home-based learning. Drawing on semi-structured interviews and diary studies, we explore how these mothers navigate sociocultural constraints while fostering engagement and transmitting cultural values. Despite facing diminished agency and structural marginalization, mothers engage creatively in their children's everyday learning interactions. Guided by a justice-oriented lens, we identify various harms and propose design implications for socio-technical systems that center recognition, reciprocity, and accountability in parent-child learning at the individual, familial, and societal levels. Our contribution lies in foregrounding the role of intersectional identity in parent-child learning and proposing justice-oriented design directions that support the flourishing of immigrant mothers within socio-technical systems.
HC Feb 6
Designing Computational Tools for Exploring Causal Relationships in Qualitative Data
Han Meng, Qiuyuan Lyu, Peinuan Qin et al.
Exploring causal relationships for qualitative data analysis in HCI and social science research enables the understanding of user needs and theory building. However, current computational tools primarily characterize and categorize qualitative data; the few systems that analyze causal relationships either inadequately consider context, lack credibility, or produce overly complex outputs. We first conducted a formative study with 15 participants interested in using computational tools for exploring causal relationships in qualitative data to understand their needs and derive design guidelines. Based on these findings, we designed and implemented QualCausal, a system that extracts and illustrates causal relationships through interactive causal network construction and multi-view visualization. A feedback study (n = 15) revealed that participants valued our system for reducing the analytical burden and providing cognitive scaffolding, yet navigated how such systems fit within their established research paradigms, practices, and habits. We discuss broader implications for designing computational tools that support qualitative data analysis.
HC Mar 12
ConvScale: Conversational Interviews for Scale-Aligned Measurement
Peinuan Qin, Jingzhu Chen, Yitian Yang et al.
Conversational interviews are commonly used to complement structured surveys by eliciting rich and contextualized responses, which are typically analyzed qualitatively. However, their potential contribution to quantitative measurement remains underexplored. In this paper, we introduce ConvScale, an AI-supported approach that transforms psychometric scales into natural conversational interviews while preserving the original measurement structure. Based on interview data, ConvScale predicts item-level scores and aggregates them to derive scale-based assessments. In a within-subjects study with 18 participants, our results show that ConvScale-derived scores align closely with participants' self-report scores at both the item and construct levels, while maintaining moderate internal reliability; however, the structural validity was inadequate. In light of this, we discussed the potential of supporting quantitative measurement through interviews and proposed implications for future designs.
CL May 19, 2025
What is Stigma Attributed to? A Theory-Grounded, Expert-Annotated Interview Corpus for Demystifying Mental-Health Stigma
Han Meng, Yancan Chen, Yunan Li et al.
Mental-health stigma remains a pervasive social problem that hampers treatment-seeking and recovery. Existing resources for training neural models to finely classify such stigma are limited, relying primarily on social-media or synthetic data without theoretical underpinnings. To remedy this gap, we present an expert-annotated, theory-informed corpus of human-chatbot interviews, comprising 4,141 snippets from 684 participants with documented socio-cultural backgrounds. Our experiments benchmark state-of-the-art neural models and empirically unpack the challenges of stigma detection. This dataset can facilitate research on computationally detecting, neutralizing, and counteracting mental-health stigma. Our corpus is openly available at https://github.com/HanMeng2004/Mental-Health-Stigma-Interview-Corpus.
HC May 8
From Standard English to Singlish: A Retrieval-Augmented Approach for Code-Switched Creole Generation in Large Language Models
Foong Ming Lai, Yujin Tan, Han Meng et al.
Code-switching in contact varieties like Singaporean English (Singlish) challenges natural language generation due to limited parallel data and rapid lexical evolution. We propose a retrieval-augmented generation (RAG) framework that externalizes dialectal knowledge into a curated lexicon, enabling controlled lexical code-switching without fine-tuning. Our approach retrieves candidate Singlish expressions and guides generation through sparse lexical substitution. Human evaluation with 164 Singaporean participants found RAG and zero-shot prompting equally natural and appropriate. Automatic analyses reveal different transformation regimes: zero-shot prompting induces extensive paraphrasing (median 23 token edits), whereas RAG performs minimal substitutions (median 1 edit) with higher semantic preservation (mean cosine similarity 0.978 vs. 0.926). Our results demonstrate that externalizing code-switching into lexical resources enables control and auditability without sacrificing perceived quality, offering practical advantages for rapidly evolving contact varieties.
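The two automatic measures named in the abstract above, token edits and cosine similarity, can be illustrated with a minimal sketch. It rests on assumptions that the abstract does not confirm: edits are counted as a token-level Levenshtein distance, and similarity is computed over simple bag-of-words counts rather than the sentence embeddings a real evaluation would more likely use. The function names and the example sentence pair are hypothetical.

```python
import math
from collections import Counter
from typing import Sequence


def token_edit_distance(a: Sequence[str], b: Sequence[str]) -> int:
    """Levenshtein distance counted over tokens (insert, delete, substitute)."""
    prev = list(range(len(b) + 1))
    for i, tok_a in enumerate(a, start=1):
        cur = [i]
        for j, tok_b in enumerate(b, start=1):
            cost = 0 if tok_a == tok_b else 1
            cur.append(min(prev[j] + 1,          # delete tok_a
                           cur[j - 1] + 1,       # insert tok_b
                           prev[j - 1] + cost))  # keep or substitute
        prev = cur
    return prev[-1]


def bow_cosine(a: Sequence[str], b: Sequence[str]) -> float:
    """Cosine similarity of bag-of-words count vectors (a stand-in for embeddings)."""
    ca, cb = Counter(a), Counter(b)
    dot = sum(ca[t] * cb[t] for t in ca)
    norm_a = math.sqrt(sum(v * v for v in ca.values()))
    norm_b = math.sqrt(sum(v * v for v in cb.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


# Hypothetical sentence pair: a single lexical substitution.
source = "this food is very good".split()
rewrite = "this food is very shiok".split()
print(token_edit_distance(source, rewrite))
print(bow_cosine(source, rewrite))
```

Under these assumptions, a sparse substitution changes few tokens and keeps the similarity high, while a heavy paraphrase raises the edit count and lowers the similarity.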
HC Feb 25, 2024
Understanding Public Perceptions of AI Conversational Agents: A Cross-Cultural Analysis
Zihan Liu, Han Li, Anfan Chen et al.
Conversational Agents (CAs) have increasingly been integrated into everyday life, sparking significant discussions on social media. While previous research has examined public perceptions of AI in general, there is a notable lack of research focused on CAs, with even fewer investigations into cultural variations in CA perceptions. To address this gap, this study used computational methods to analyze about one million social media discussions surrounding CAs and compared people's discourses and perceptions of CAs in the US and China. We find that Chinese participants tended to view CAs hedonically, perceived voice-based and physically embodied CAs as warmer and more competent, and generally expressed positive emotions. In contrast, US participants saw CAs more functionally, with an ambivalent attitude. Warm perception was a key driver of positive emotions toward CAs in both countries. We discuss practical implications for designing contextually sensitive and user-centric CAs to resonate with various users' preferences and needs.
AI Nov 7, 2024
Multi-Agents are Social Groups: Investigating Social Influence of Multiple Agents in Human-Agent Interactions
Tianqi Song, Yugin Tan, Zicheng Zhu et al.
Multi-agent systems - systems with multiple independent AI agents working together to achieve a common goal - are becoming increasingly prevalent in daily life. Drawing inspiration from the phenomenon of human group social influence, we investigate whether a group of AI agents can create social pressure on users to agree with them, potentially changing their stance on a topic. We conducted a study in which participants discussed social issues with either a single or multiple AI agents, and where the agents either agreed or disagreed with the user's stance on the topic. We found that conversing with multiple agents (holding conversation content constant) increased the social pressure felt by participants, and caused a greater shift in opinion towards the agents' stances on each topic. Our study shows the potential advantages of multi-agent systems over single-agent platforms in causing opinion change. We discuss design implications for possible multi-agent systems that promote social good, as well as the potential for malicious actors to use these systems to manipulate public opinion.
HC Feb 9, 2025
Deconstructing Depression Stigma: Integrating AI-driven Data Collection and Analysis with Causal Knowledge Graphs
Han Meng, Renwen Zhang, Ganyi Wang et al.
Mental-illness stigma is a persistent social problem, hampering both treatment-seeking and recovery. Accordingly, there is a pressing need to understand it more clearly, but analyzing the relevant data is highly labor-intensive. Therefore, we designed a chatbot to engage participants in conversations; coded those conversations qualitatively with AI assistance; and, based on those coding results, built causal knowledge graphs to decode stigma. The results we obtained from 1,002 participants demonstrate that conversation with our chatbot can elicit rich information about people's attitudes toward depression, while our AI-assisted coding was strongly consistent with human-expert coding. Our novel approach combining large language models (LLMs) and causal knowledge graphs uncovered patterns in individual responses and illustrated the interrelationships of psychological constructs in the dataset as a whole. The paper also discusses these findings' implications for HCI researchers in developing digital interventions, decomposing human psychological constructs, and fostering inclusive attitudes.
HC Jan 22, 2025
As Confidence Aligns: Exploring the Effect of AI Confidence on Human Self-confidence in Human-AI Decision Making
Jingshu Li, Yitian Yang, Q. Vera Liao et al.
Complementary collaboration between humans and AI is essential for human-AI decision making. One feasible approach to achieving it involves accounting for the calibrated confidence levels of both AI and users. However, this process would likely be made more difficult by the fact that AI confidence may influence users' self-confidence and its calibration. To explore these dynamics, we conducted a randomized behavioral experiment. Our results indicate that in human-AI decision-making, users' self-confidence aligns with AI confidence and such alignment can persist even after AI ceases to be involved. This alignment then affects users' self-confidence calibration. We also found the presence of real-time correctness feedback of decisions reduced the degree of alignment. These findings suggest that users' self-confidence is not independent of AI confidence, which practitioners aiming to achieve better human-AI collaboration need to be aware of. We call for research focusing on the alignment of human cognition and behavior with AI.
AI Feb 12, 2024
Understanding the Effects of Miscalibrated AI Confidence on User Trust, Reliance, and Decision Efficacy
Jingshu Li, Yitian Yang, Renwen Zhang et al.
Providing well-calibrated AI confidence can help promote users' appropriate trust in and reliance on AI, which are essential for AI-assisted decision-making. However, calibrating AI confidence -- providing a confidence score that accurately reflects the true likelihood of the AI being correct -- is known to be challenging. To understand the effects of AI confidence miscalibration, we conducted our first experiment. The results indicate that miscalibrated AI confidence impairs users' appropriate reliance and reduces AI-assisted decision-making efficacy, and that AI miscalibration is difficult for users to detect. Then, in our second experiment, we examined whether communicating AI confidence calibration levels could mitigate the above issues. We find that it helps users to detect AI miscalibration. Nevertheless, since such communication decreases users' trust in uncalibrated AI, leading to high under-reliance, it does not improve decision efficacy. We discuss design implications based on these findings and future directions to address risks and ethical concerns associated with AI miscalibration.
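The notion of calibration in the abstract above (a confidence score that reflects the true likelihood of being correct) can be made concrete with a minimal sketch. It uses expected calibration error (ECE) with equal-width bins, a common summary of miscalibration; this choice is an assumption for illustration, and the abstract does not say which measure the authors used. The function name and sample data are hypothetical.

```python
from typing import Sequence


def expected_calibration_error(confidences: Sequence[float],
                               correct: Sequence[bool],
                               n_bins: int = 10) -> float:
    """Weighted average gap between mean confidence and accuracy per bin."""
    if len(confidences) != len(correct):
        raise ValueError("confidences and correct must have the same length")
    n = len(confidences)
    if n == 0:
        raise ValueError("need at least one prediction")

    # Group predictions into equal-width confidence bins over [0, 1].
    bins = [[] for _ in range(n_bins)]
    for conf, ok in zip(confidences, correct):
        idx = min(int(conf * n_bins), n_bins - 1)  # conf == 1.0 goes in the last bin
        bins[idx].append((conf, ok))

    ece = 0.0
    for members in bins:
        if not members:
            continue
        mean_conf = sum(c for c, _ in members) / len(members)
        accuracy = sum(1 for _, ok in members if ok) / len(members)
        ece += (len(members) / n) * abs(mean_conf - accuracy)
    return ece


# Hypothetical data: one calibrated and one overconfident set of predictions.
calibrated = expected_calibration_error([0.75] * 4, [True, True, True, False])
overconfident = expected_calibration_error([0.95] * 4, [True, True, False, False])
print(calibrated, overconfident)
```

In this sketch a value near zero means stated confidence matches observed accuracy, while a larger value indicates over- or under-confidence.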
HC Nov 9, 2024
Wild Narratives: Exploring the Effects of Animal Chatbots on Empathy and Positive Attitudes toward Animals
Jingshu Li, Aaditya Patwari, Yi-Chieh Lee
Rises in the number of animal abuse cases are reported around the world. While chatbots have been effective in influencing their users' perceptions and behaviors, little if any research has hitherto explored the design of chatbots that embody animal identities for the purpose of eliciting empathy toward animals. We therefore conducted a mixed-methods experiment to investigate how specific design cues in such chatbots can shape their users' perceptions of both the chatbots' identities and the type of animal they represent. Our findings indicate that such chatbots can significantly increase empathy, improve attitudes, and promote prosocial behavioral intentions toward animals, particularly when they incorporate emotional verbal expressions and authentic details of such animals' lives. These results expand our understanding of chatbots with non-human identities and highlight their potential for use in conservation initiatives, suggesting a promising avenue whereby technology could foster a more informed and empathetic society.
HC Jan 19
AI-exhibited Personality Traits Can Shape Human Self-concept through Conversations
Jingshu Li, Tianqi Song, Nattapat Boonprakong et al.
Recent Large Language Model (LLM) based AI can exhibit recognizable and measurable personality traits during conversations to improve user experience. However, as human understandings of their personality traits can be affected by their interaction partners' traits, a potential risk is that AI traits may shape and bias users' self-concept of their own traits. To explore the possibility, we conducted a randomized behavioral experiment. Our results indicate that after conversations about personal topics with an LLM-based AI chatbot using GPT-4o default personality traits, users' self-concepts aligned with the AI's measured personality traits. The longer the conversation, the greater the alignment. This alignment led to increased homogeneity in self-concepts among users. We also observed that the degree of self-concept alignment was positively associated with users' conversation enjoyment. Our findings uncover how AI personality traits can shape users' self-concepts through human-AI conversation, highlighting both risks and opportunities. We provide important design implications for developing more responsible and ethical AI systems.
HC Jun 25, 2025
Exploring the Effects of Chatbot Anthropomorphism and Human Empathy on Human Prosocial Behavior Toward Chatbots
Jingshu Li, Zicheng Zhu, Renwen Zhang et al.
Chatbots are increasingly integrated into people's lives and are widely used to help people. Recently, there has also been growing interest in the reverse direction, in which humans help chatbots, due to a wide range of benefits including better chatbot performance, human well-being, and collaborative outcomes. However, little research has explored the factors that motivate people to help chatbots. To address this gap, we draw on the Computers Are Social Actors (CASA) framework to examine how chatbot anthropomorphism (including human-like identity, emotional expression, and non-verbal expression) influences human empathy toward chatbots and people's subsequent prosocial behaviors and intentions. We also explore people's own interpretations of their prosocial behaviors toward chatbots. We conducted an online experiment (N = 244) in which chatbots made mistakes in a collaborative image labeling task and explained the reasons to participants. We then measured participants' prosocial behaviors and intentions toward the chatbots. Our findings revealed that human identity and emotional expression of chatbots increased participants' prosocial behavior and intention toward chatbots, with empathy mediating these effects. Qualitative analysis further identified two motivations for participants' prosocial behaviors: empathy for the chatbot and perceiving the chatbot as human-like. We discuss the implications of these results for understanding and promoting human prosocial behaviors toward chatbots.
HC May 9, 2024
Exploring the Potential of Human-LLM Synergy in Advancing Qualitative Analysis: A Case Study on Mental-Illness Stigma
Han Meng, Yitian Yang, Yunan Li et al.
Qualitative analysis is a challenging, yet crucial aspect of advancing research in the field of Human-Computer Interaction (HCI). Recent studies show that large language models (LLMs) can perform qualitative coding within existing schemes, but their potential for collaborative human-LLM discovery and new insight generation in qualitative analysis is still underexplored. To bridge this gap and advance qualitative analysis by harnessing the power of LLMs, we propose CHALET, a novel methodology that leverages the human-LLM collaboration paradigm to facilitate conceptualization and empower qualitative research. The CHALET approach involves LLM-supported data collection, performing both human and LLM deductive coding to identify disagreements, and performing collaborative inductive coding on these disagreement cases to derive new conceptual insights. We validated the effectiveness of CHALET through its application to the attribution model of mental-illness stigma, uncovering implicit stigmatization themes on cognitive, emotional and behavioral dimensions. We discuss the implications for future research, methodology, and the transdisciplinary opportunities CHALET presents for the HCI community and beyond.