CLAug 25, 2023
Prompting a Large Language Model to Generate Diverse Motivational Messages: A Comparison with Human-Written MessagesSamuel Rhys Cox, Ashraf Abdul, Wei Tsang Ooi
Large language models (LLMs) are increasingly capable and prevalent, and can be used to produce creative content. The quality of content is influenced by the prompt used, with more specific prompts that incorporate examples generally producing better results. On from this, it could be seen that using instructions written for crowdsourcing tasks (that are specific and include examples to guide workers) could prove effective LLM prompts. To explore this, we used a previous crowdsourcing pipeline that gave examples to people to help them generate a collectively diverse corpus of motivational messages. We then used this same pipeline to generate messages using GPT-4, and compared the collective diversity of messages from: (1) crowd-writers, (2) GPT-4 using the pipeline, and (3 & 4) two baseline GPT-4 prompts. We found that the LLM prompts using the crowdsourcing pipeline caused GPT-4 to produce more diverse messages than the two baseline prompts. We also discuss implications from messages generated by both human writers and LLMs.
HCAug 9, 2023
Comparing How a Chatbot References User Utterances from Previous Chatting Sessions: An Investigation of Users' Privacy Concerns and PerceptionsSamuel Rhys Cox, Yi-Chieh Lee, Wei Tsang Ooi
Chatbots are capable of remembering and referencing previous conversations, but does this enhance user engagement or infringe on privacy? To explore this trade-off, we investigated the format of how a chatbot references previous conversations with a user and its effects on a user's perceptions and privacy concerns. In a three-week longitudinal between-subjects study, 169 participants talked about their dental flossing habits to a chatbot that either, (1-None): did not explicitly reference previous user utterances, (2-Verbatim): referenced previous utterances verbatim, or (3-Paraphrase): used paraphrases to reference previous utterances. Participants perceived Verbatim and Paraphrase chatbots as more intelligent and engaging. However, the Verbatim chatbot also raised privacy concerns with participants. To gain insights as to why people prefer certain conditions or had privacy concerns, we conducted semi-structured interviews with 15 participants. We discuss implications from our findings that can help designers choose an appropriate format to reference previous user utterances and inform in the design of longitudinal dialogue scripting.
HCJan 28
Polite But Boring? Trade-offs Between Engagement and Psychological Reactance to Chatbot Feedback StylesSamuel Rhys Cox, Joel Wester, Niels van Berkel
As conversational agents become increasingly common in behaviour change interventions, understanding optimal feedback delivery mechanisms becomes increasingly important. However, choosing a style that both lessens psychological reactance (perceived threats to freedom) while simultaneously eliciting feelings of surprise and engagement represents a complex design problem. We explored how three different feedback styles: 'Direct', 'Politeness', and 'Verbal Leakage' (slips or disfluencies to reveal a desired behaviour) affect user perceptions and behavioural intentions. Matching expectations from literature, the 'Direct' chatbot led to lower behavioural intentions and higher reactance, while the 'Politeness' chatbot evoked higher behavioural intentions and lower reactance. However, 'Politeness' was also seen as unsurprising and unengaging by participants. In contrast, 'Verbal Leakage' evoked reactance, yet also elicited higher feelings of surprise, engagement, and humour. These findings highlight that effective feedback requires navigating trade-offs between user reactance and engagement, with novel approaches such as 'Verbal Leakage' offering promising alternative design opportunities.
HCMay 23
"It Felt a Bit Eerie": Exploring Humanlike Interactions During Collaborative Writing with an Artificial AgentMichael Yin, Angela Chiang, Samuel Rhys Cox et al.
While human-AI collaboration systems have increasingly been built to increase efficiency or support creativity, little work has examined how the design of interactions shapes the social connection between human and artificial agent. We examine how the temporal and visual dimensions of collaboration shape the experience of a writing task. Specifically, we built three variants of an AI-assisted text editor along a spectrum of simulated humanlike interaction (synchronous and with a cursor) to machinelike interaction (asynchronous and without a cursor), and conducted a comparative user study (n=48). Our exploratory findings suggest that synchronous suggestions increased efficiency but led to contextual misalignment, while a visual cursor increased intent understanding but evoked feelings of surveillance. Taken together, humanlike design of artificial agents can create positive social expectations but also elicit social costs, especially without the alignment present in human-human collaboration. We extend our findings into design implications and ethical considerations when building human-AI collaboration systems.
HCApr 26
Who Gets to Interpret the Workout? User Tensions with AI-Generated Fitness FeedbackSujay Shalawadi, Joel Wester, Samuel Rhys Cox et al.
Fitness tracking platforms increasingly integrate generative AI to interpret activity data, such as Strava's Athlete Intelligence. These integrations raise questions about how athletes engage with AI-supported fitness self-tracking. We analyzed 297 Reddit threads and 5,692 comments from r/Strava following the company's launch of AI features to examine user reactions to AI-generated fitness feedback. Our findings revealed four recurring tensions: (1) numerical evaluation versus contextual understanding; (2) isolated session summaries versus ongoing training narratives; (3) a fixed AI tone versus diverse emotional states; and (4) a single AI voice versus different athletic types. Across these tensions, users resisted AI feedback that constrained interpretations of their own lived experiences. These findings shed light on the implicit challenges of integrating AI into self-tracking platforms. We conclude with implications for the design of AI-supported self-tracking systems that preserve interpretive openness and user agency.
HCFeb 3
Chaplains' Reflections on the Design and Usage of AI for Conversational CareJoel Wester, Samuel Rhys Cox, Henning Pohl et al.
Despite growing recognition that responsible AI requires domain knowledge, current work on conversational AI primarily draws on clinical expertise that prioritises diagnosis and intervention. However, much of everyday emotional support needs occur in non-clinical contexts, and therefore requires different conversational approaches. We examine how chaplains, who guide individuals through personal crises, grief, and reflection, perceive and engage with conversational AI. We recruited eighteen chaplains to build AI chatbots. While some chaplains viewed chatbots with cautious optimism, the majority expressed limitations of chatbots' ability to support everyday well-being. Our analysis reveals how chaplains perceive their pastoral care duties and areas where AI chatbots fall short, along the themes of Listening, Connecting, Carrying, and Wanting. These themes resonate with the idea of attunement, recently highlighted as a relational lens for understanding the delicate experiences care technologies provide. This perspective informs chatbot design aimed at supporting well-being in non-clinical contexts.
HCJul 14, 2025
Theory of Mind and Self-Disclosure to CUIsSamuel Rhys Cox
Self-disclosure is important to help us feel better, yet is often difficult. This difficulty can arise from how we think people are going to react to our self-disclosure. In this workshop paper, we briefly discuss self-disclosure to conversational user interfaces (CUIs) in relation to various social cues. We then, discuss how expressions of uncertainty or representation of a CUI's reasoning could help encourage self-disclosure, by making a CUI's intended "theory of mind" more transparent to users.
HCApr 21, 2024
Incorporating Different Verbal Cues to Improve Text-Based Computer-Delivered Health MessagingSamuel Rhys Cox
The ubiquity of smartphones has led to an increase in on demand healthcare being supplied. For example, people can share their illness-related experiences with others similar to themselves, and healthcare experts can offer advice for better treatment and care for remediable, terminal and mental illnesses. As well as this human-to-human communication, there has been an increased use of human-to-computer digital health messaging, such as chatbots. These can prove advantageous as they offer synchronous and anonymous feedback without the need for a human conversational partner. However, there are many subtleties involved in human conversation that a computer agent may not properly exhibit. For example, there are various conversational styles, etiquettes, politeness strategies or empathic responses that need to be chosen appropriately for the conversation. Encouragingly, computers are social actors (CASA) posits that people apply the same social norms to computers as they would do to people. On from this, previous studies have focused on applying conversational strategies to computer agents to make them embody more favourable human characteristics. However, if a computer agent fails in this regard it can lead to negative reactions from users. Therefore, in this dissertation we describe a series of studies we carried out to lead to more effective human-to-computer digital health messaging. In our first study, we use the crowd [...] Our second study investigates the effect of a health chatbot's conversational style [...] In our final study, we investigate the format used by a chatbot when [...] In summary, we have researched how to create more effective digital health interventions starting from generating health messages, to choosing an appropriate formality of messaging, and finally to formatting messages which reference a user's previous utterances.
HCJan 15, 2021
Directed Diversity: Leveraging Language Embedding Distances for Collective Creativity in Crowd IdeationSamuel Rhys Cox, Yunlong Wang, Ashraf Abdul et al.
Crowdsourcing can collect many diverse ideas by prompting ideators individually, but this can generate redundant ideas. Prior methods reduce redundancy by presenting peers' ideas or peer-proposed prompts, but these require much human coordination. We introduce Directed Diversity, an automatic prompt selection approach that leverages language model embedding distances to maximize diversity. Ideators can be directed towards diverse prompts and away from prior ideas, thus improving their collective creativity. Since there are diverse metrics of diversity, we present a Diversity Prompting Evaluation Framework consolidating metrics from several research disciplines to analyze along the ideation chain - prompt selection, prompt creativity, prompt-ideation mediation, and ideation creativity. Using this framework, we evaluated Directed Diversity in a series of a simulation study and four user studies for the use case of crowdsourcing motivational messages to encourage physical activity. We show that automated diverse prompting can variously improve collective creativity across many nuanced metrics of diversity.