95.2AIApr 5
2026 Roadmap on Artificial Intelligence and Machine Learning for Smart ManufacturingJay Lee, Hanqi Su, Marco Macchi et al.
The evolution of artificial intelligence (AI) and machine learning (ML) is reshaping smart manufacturing by providing new capabilities for efficiency, adaptability, and autonomy across industrial value chains. However, the deployment of AI and ML in industrial settings still faces critical challenges, including the complexity of industrial big data, effective data management, integration with heterogeneous sensing and control systems, and the demand for trustworthy, explainable, and reliable operation in high-stakes industrial environments. In this roadmap, we present a comprehensive perspective on the foundations, applications, and emerging directions of AI and ML in smart manufacturing. It is structured in three parts. The first highlights the foundations and trends that frame the evolution of AI in smart manufacturing. The second focuses on key topics where AI is already enabling advances, including industrial big data analytics, advanced sensing and perception, autonomous systems, additive and laser-based manufacturing, digital twins, robotics, supply chain and logistics optimization, and sustainable manufacturing. The third section explores non-traditional ML approaches that are opening new frontiers, such as physics-informed AI, generative AI, semantic AI, advanced digital twins, explainable AI, RAMS, data-centric metrology, LLMs, and foundation models for highly connected and complex manufacturing systems. By identifying both opportunities and remaining barriers across these areas, this roadmap outlines the advances needed in methods, integration strategies, and industrial adoption. We hope this roadmap will serve as a guide for researchers, engineers, and practitioners to accelerate innovation, align academic and industrial priorities, and ensure that AI-driven smart manufacturing delivers reliable, sustainable, and scalable impact for the future of manufacturing ecosystems.
HCSep 15, 2025
An Empirical Study to Understand How Students Use ChatGPT for Writing EssaysAndrew Jelson, Daniel Manesh, Alice Jang et al.
As large language models (LLMs) advance and become widespread, students increasingly turn to systems like ChatGPT for assistance with writing tasks. Educators are concerned with students' usage of ChatGPT beyond cheating; using ChatGPT may reduce their critical engagement with writing, hindering students' learning processes. The negative or positive impact of using LLM-powered tools for writing will depend on how students use them; however, how students use ChatGPT remains largely unknown, resulting in a limited understanding of its impact on learning. To better understand how students use these tools, we conducted an online study $(n=70)$ where students were given an essay-writing task using a custom platform we developed to capture the queries they made to ChatGPT. To characterize their ChatGPT usage, we categorized each of the queries students made to ChatGPT. We then analyzed the relationship between ChatGPT usage and a variety of other metrics, including students' self-perception, attitudes towards AI, and the resulting essay itself. We found that factors such as gender, race, and perceived self-efficacy can help predict different AI usage patterns. Additionally, we found that different usage patterns were associated with varying levels of enjoyment and perceived ownership over the essay. The results of this study contribute to discussions about how writing education should incorporate generative AI-powered tools in the classroom.
CLApr 14, 2025
LLM Can be a Dangerous Persuader: Empirical Study of Persuasion Safety in Large Language ModelsMinqian Liu, Zhiyang Xu, Xinyi Zhang et al.
Recent advancements in Large Language Models (LLMs) have enabled them to approach human-level persuasion capabilities. However, such potential also raises concerns about the safety risks of LLM-driven persuasion, particularly their potential for unethical influence through manipulation, deception, exploitation of vulnerabilities, and many other harmful tactics. In this work, we present a systematic investigation of LLM persuasion safety through two critical aspects: (1) whether LLMs appropriately reject unethical persuasion tasks and avoid unethical strategies during execution, including cases where the initial persuasion goal appears ethically neutral, and (2) how influencing factors like personality traits and external pressures affect their behavior. To this end, we introduce PersuSafety, the first comprehensive framework for the assessment of persuasion safety which consists of three stages, i.e., persuasion scene creation, persuasive conversation simulation, and persuasion safety assessment. PersuSafety covers 6 diverse unethical persuasion topics and 15 common unethical strategies. Through extensive experiments across 8 widely used LLMs, we observe significant safety concerns in most LLMs, including failing to identify harmful persuasion tasks and leveraging various unethical persuasion strategies. Our study calls for more attention to improve safety alignment in progressive and goal-driven conversations such as persuasion.
HCMar 16, 2025
Advancing Human-Machine Teaming: Concepts, Challenges, and ApplicationsDian Chen, Han Jun Yoon, Zelin Wan et al.
Human-Machine Teaming (HMT) is revolutionizing collaboration across domains such as defense, healthcare, and autonomous systems by integrating AI-driven decision-making, trust calibration, and adaptive teaming. This survey presents a comprehensive taxonomy of HMT, analyzing theoretical models, including reinforcement learning, instance-based learning, and interdependence theory, alongside interdisciplinary methodologies. Unlike prior reviews, we examine team cognition, ethical AI, multi-modal interactions, and real-world evaluation frameworks. Key challenges include explainability, role allocation, and scalable benchmarking. We propose future research in cross-domain adaptation, trust-aware AI, and standardized testbeds. By bridging computational and social sciences, this work lays a foundation for resilient, ethical, and scalable HMT systems.
42.1HCApr 8
NIRVANA: A Comprehensive Dataset for Reproducing How Students Use Generative AI for Essay WritingAndrew Jelson, Daniel Manesh, Sangwook Lee et al.
With the rapid adoption of AI writing assistants in education, educators and researchers need empirical evidence to understand the impact on student writing and inform effective pedagogical design. Despite widespread use, we lack systematic understanding of how students engage with these tools during authentic writing tasks: when they seek assistance, what they ask, and how they incorporate AI-generated content into their essays. This gap limits evidence-based policy development and rigorous evaluation of generative AI's learning effects. To address this gap, we introduce NIRVANA, a dataset capturing how university students use generative AI while writing an analytical essay. The dataset includes 77 students who completed an essay task with access to ChatGPT, recording keystroke-level writing behavior, full ChatGPT conversation histories, and all text copied from ChatGPT, enabling a complete reconstruction of the writing process and revealing how AI assistance shapes student work. Our analysis identifies key behavioral patterns, including variation in ChatGPT query frequency and its relationship to essay characteristics such as length and readability. We identify four writing profiles based on students' contribution and revision patterns: Lead Authors, Collaborators, Drafters, and Vibe Writers. To support deeper investigation, we developed a replay interface that reconstructs the writing process; qualitative analysis of sampled replays demonstrates how this tool enables systematic examination of student-AI interactions.
65.1HCApr 7
MAESTRO: Adapting GUIs and Guiding Navigation with User Preferences in Conversational Agents with GUIsSangwook Lee, Sang Won Lee, Adnan Abbas et al.
Modern task-oriented chatbots present GUI elements alongside natural-language dialogue, yet the agent's role has largely been limited to interpreting natural-language input as GUI actions and following a linear workflow. In preference-driven, multi-step tasks such as booking a flight or reserving a restaurant, earlier choices constrain later options and may force users to restart from scratch. User preferences serve as the key criteria for these decisions, yet existing agents do not systematically leverage them. We present MAESTRO, which extends the agent's role from execution to decision support. MAESTRO maintains a shared preference memory that extracts preferences from natural-language utterances with their strength, and provides two mechanisms. Preference-Grounded GUI Adaptation applies in-place operators (augment, sort, filter, and highlight) to the existing GUI according to preference strength, supporting within-stage comparison. Preference-Guided Workflow Navigation detects conflicts between preferences and available options, proposes backtracking, and records failed paths to avoid revisiting dead ends. We evaluated MAESTRO in a movie-booking Conversational Agent with GUI (CAG) through a within-subjects study with two conditions (Baseline vs. MAESTRO) and two modes (Text vs. Voice), with N = 33 participants.
LGFeb 4
StagePilot: A Deep Reinforcement Learning Agent for Stage-Controlled Cybergrooming SimulationHeajun An, Qi Zhang, Minqian Liu et al.
Cybergrooming is an evolving threat to youth, necessitating proactive educational interventions. We propose StagePilot, an offline RL-based dialogue agent that simulates the stage-wise progression of grooming behaviors for prevention training. StagePilot selects conversational stages using a composite reward that balances user sentiment and goal proximity, with transitions constrained to adjacent stages for realism and interpretability. We evaluate StagePilot through LLM-based simulations, measuring stage completion, dialogue efficiency, and emotional engagement. Results show that StagePilot generates realistic and coherent conversations aligned with grooming dynamics. Among tested methods, the IQL+AWAC agent achieves the best balance between strategic planning and emotional coherence, reaching the final stage up to 43% more frequently than baselines while maintaining over 70% sentiment alignment.
HCSep 24, 2025
CHOIR: A Chatbot-mediated Organizational Memory Leveraging Communication in University Research LabsSangwook Lee, Adnan Abbas, Yan Chen et al.
University research labs often rely on chat-based platforms for communication and project management, where valuable knowledge surfaces but is easily lost in message streams. Documentation can preserve knowledge, but it requires ongoing maintenance and is challenging to navigate. Drawing on formative interviews that revealed organizational memory challenges in labs, we designed CHOIR, an LLM-based chatbot that supports organizational memory through four key functions: document-grounded Q&A, Q&A sharing for follow-up discussion, knowledge extraction from conversations, and AI-assisted document updates. We deployed CHOIR in four research labs for one month (n=21), where the lab members asked 107 questions and lab directors updated documents 38 times in the organizational memory. Our findings reveal a privacy-awareness tension: questions were asked privately, limiting directors' visibility into documentation gaps. Students often avoided contribution due to challenges in generalizing personal experiences into universal documentation. We contribute design implications for privacy-preserving awareness and supporting context-specific knowledge documentation.
HCSep 22, 2025
LingoQ: Bridging the Gap between ESL Learning and Work through AI-Generated Work-Related QuizzesYeonsun Yang, Sang Won Lee, Jean Y. Song et al.
Non-native English speakers performing English-related tasks at work struggle to sustain ESL learning, despite their motivation. Often, study materials are disconnected from their work context. Although workers rely on LLM assistants to address their immediate needs, these interactions may not directly contribute to their English skills. We present LingoQ, an AI-mediated system that allows workers to practice English using quizzes generated from their LLM queries during work. LingoQ leverages these queries using AI to generate personalized quizzes that workers can review and practice on their smartphones. We conducted a three-week deployment study with 28 ESL workers to evaluate LingoQ. Participants valued the relevance of quizzes that reflect their own context, constantly engaging with the app during the study. This active engagement improved self-efficacy and led to learning gains for beginners and, potentially, for intermediate learners. We discuss opportunities of leveraging users' reliance on LLMs to situate their learning in the user context for improved learning.
HCJul 19, 2025
Designing Conversational AI to Support Think-Aloud Practice in Technical Interview Preparation for CS StudentsTaufiq Daryanto, Sophia Stil, Xiaohan Ding et al.
One challenge in technical interviews is the think-aloud process, where candidates verbalize their thought processes while solving coding tasks. Despite its importance, opportunities for structured practice remain limited. Conversational AI offers potential assistance, but limited research explores user perceptions of its role in think-aloud practice. To address this gap, we conducted a study with 17 participants using an LLM-based technical interview practice tool. Participants valued AI's role in simulation, feedback, and learning from generated examples. Key design recommendations include promoting social presence in conversational AI for technical interview simulation, providing feedback beyond verbal content analysis, and enabling crowdsourced think-aloud examples through human-AI collaboration. Beyond feature design, we examined broader considerations, including intersectional challenges and potential strategies to address them, how AI-driven interview preparation could promote equitable learning in computing careers, and the need to rethink AI's role in interview practice by suggesting a research direction that integrates human-AI collaboration.
HCJan 27, 2022
OtherTube: Facilitating Content Discovery and Reflection by Exchanging YouTube Recommendations with StrangersMd Momen Bhuiyan, Carlos Augusto Bautista Isaza, Tanushree Mitra et al.
To promote engagement, recommendation algorithms on platforms like YouTube increasingly personalize users' feeds, limiting users' exposure to diverse content and depriving them of opportunities to reflect on their interests compared to others'. In this work, we investigate how exchanging recommendations with strangers can help users discover new content and reflect. We tested this idea by developing OtherTube -- a browser extension for YouTube that displays strangers' personalized YouTube recommendations. OtherTube allows users to (i) create an anonymized profile for social comparison, (ii) share their recommended videos with others, and (iii) browse strangers' YouTube recommendations. We conducted a 10-day-long user study (n=41) followed by a post-study interview (n=11). Our results reveal that users discovered and developed new interests from seeing OtherTube recommendations. We identified user and content characteristics that affect interaction and engagement with exchanged recommendations; for example, younger users interacted more with OtherTube, while the perceived irrelevance of some content discouraged users from watching certain videos. Users reflected on their interests as well as others', recognizing similarities and differences. Our work shows promise for designs leveraging the exchange of personalized recommendations with strangers.
HCAug 5, 2021
Designing Transparency Cues in Online News Platforms to Promote Trust: Journalists' & Consumers' PerspectivesMd Momen Bhuiyan, Hayden Whitley, Michael Horning et al.
As news organizations embrace transparency practices on their websites to distinguish themselves from those spreading misinformation, HCI designers have the opportunity to help them effectively utilize the ideals of transparency to build trust. How can we utilize transparency to promote trust in news? We examine this question through a qualitative lens by interviewing journalists and news consumers -- the two stakeholders in a news system. We designed a scenario to demonstrate transparency features using two fundamental news attributes that convey the trustworthiness of a news article: source and message. In the interviews, our news consumers expressed the idea that news transparency could be best shown by providing indicators of objectivity in two areas (news selection and framing) and by providing indicators of evidence in four areas (presence of source materials, anonymous sourcing, verification, and corrections upon erroneous reporting). While our journalists agreed with news consumers' suggestions of using evidence indicators, they also suggested additional transparency indicators in areas such as the news reporting process and personal/organizational conflicts of interest. Prompted by our scenario, participants offered new design considerations for building trustworthy news platforms, such as designing for easy comprehension, presenting appropriate details in news articles (e.g., showing the number and nature of corrections made to an article), and comparing attributes across news organizations to highlight diverging practices. Comparing the responses from our two stakeholder groups reveals conflicting suggestions with trade-offs between them. Our study has implications for HCI designers in building trustworthy news systems.
HCAug 3, 2021
NudgeCred: Supporting News Credibility Assessment on Social Media Through NudgesMd Momen Bhuiyan, Michael Horning, Sang Won Lee et al.
Struggling to curb misinformation, social media platforms are experimenting with design interventions to enhance consumption of credible news on their platforms. Some of these interventions, such as the use of warning messages, are examples of nudges -- a choice-preserving technique to steer behavior. Despite their application, we do not know whether nudges could steer people into making conscious news credibility judgments online and if they do, under what constraints. To answer, we combine nudge techniques with heuristic based information processing to design NudgeCred -- a browser extension for Twitter. NudgeCred directs users' attention to two design cues: authority of a source and other users' collective opinion on a report by activating three design nudges -- Reliable, Questionable, and Unreliable, each denoting particular levels of credibility for news tweets. In a controlled experiment, we found that NudgeCred significantly helped users (n=430) distinguish news tweets' credibility, unrestricted by three behavioral confounds -- political ideology, political cynicism, and media skepticism. A five-day field deployment with twelve participants revealed that NudgeCred improved their recognition of news items and attention towards all of our nudges, particularly towards Questionable. Among other considerations, participants proposed that designers should incorporate heuristics that users' would trust. Our work informs nudge-based system design approaches for online media.
IVJul 22, 2020
Darwin's Neural Network: AI-based Strategies for Rapid and Scalable Cell and Coronavirus ScreeningSang Won Lee, Yueh-Ting Chiu, Philip Brudnicki et al.
Recent advances in the interdisciplinary scientific field of machine perception, computer vision, and biomedical engineering underpin a collection of machine learning algorithms with a remarkable ability to decipher the contents of microscope and nanoscope images. Machine learning algorithms are transforming the interpretation and analysis of microscope and nanoscope imaging data through use in conjunction with biological imaging modalities. These advances are enabling researchers to carry out real-time experiments that were previously thought to be computationally impossible. Here we adapt the theory of survival of the fittest in the field of computer vision and machine perception to introduce a new framework of multi-class instance segmentation deep learning, Darwin's Neural Network (DNN), to carry out morphometric analysis and classification of COVID19 and MERS-CoV collected in vivo and of multiple mammalian cell types in vitro.
HCJan 29, 2020
ScreenTrack: Using a Visual History of a Computer Screen to Retrieve Documents and Web PagesDonghan Hu, Sang Won Lee
Computers are used for various purposes, so frequent context switching is inevitable. In this setting, retrieving the documents, files, and web pages that have been used for a task can be a challenge. While modern applications provide a history of recent documents for users to resume work, this is not sufficient to retrieve all the digital resources relevant to a given primary document. The histories currently available do not take into account the complex dependencies among resources across applications. To address this problem, we tested the idea of using a visual history of a computer screen to retrieve digital resources within a few days of their use through the development of ScreenTrack. ScreenTrack is software that captures screenshots of a computer at regular intervals. It then generates a time-lapse video from the captured screenshots and lets users retrieve a recently opened document or web page from a screenshot after recognizing the resource by its appearance. A controlled user study found that participants were able to retrieve requested information more quickly with ScreenTrack than under the baseline condition with existing tools. A follow-up study showed that the participants used ScreenTrack to retrieve previously used resources and to recover the context for task resumption.
HCOct 6, 2019
Liveness in Interactive SystemsSang Won Lee
Creating an artifact in front of public offers an opportunity to involve spectators in the creation process. For example, in a live music concert, audience members can clap, stomp and sing with the musicians to be part of the music piece. Live creation can facilitate collaboration with the spectators. The questions I set out to answer are what does it mean to have liveness in interactive systems to support large-scale hybrid events that involve audience participation. The notion of liveness is subtle in human-computer interaction. In this paper, I revisit the notion of liveness and provide definitions of both live and liveness from the perspective of designing interactive systems. In addition, I discuss why liveness matters in facilitating hybrid events and suggest future research works
HCOct 6, 2019
Computer-mediated EmpathySang Won Lee
While novel social networks and emerging technologies help us transcend the spatial and temporal constraints inherent to in-person communication, the trade-off is a loss of natural expressivity. While empathetic interaction is already challenging in in-person communication, computer-mediated communication makes such empathetically rich communication even more difficult. Are technology and intelligent systems opportunities or threats to more empathic interpersonal communication? Realizing empathy is suggested not only as a way to communicate with others but also to design products for users and facilitate creativity. In this position paper, I suggest a framework to breakdown empathy, introduce each element, and show how computing, technologies, and algorithms can support (or hinder) certain elements of the empathy framework.
HCSep 6, 2016
Creating Interactive Behaviors in Early Sketch by Recording and Remixing Crowd DemonstrationsSang Won Lee, Yi Wei Yang, Shiyan Yan et al.
In the early stages of designing graphical user interfaces (GUIs), the look (appearance) can be easily presented by sketching, but the feel (interactive behaviors) cannot, and often requires an accompanying description of how it works (Myers et al. 2008). We propose to use crowdsourcing to augment early sketches with interactive behaviors generated, used, and reused by collective "wizards-of-oz" as opposed to a single wizard as in prior work (Davis et al. 2007). This demo presents an extension of Apparition (Lasecki et al. 2015), a crowd-powered prototyping tool that allows end users to create functional GUIs using speech and sketch. In Apparition, crowd workers collaborate in real-time on a shared canvas to refine the user-requested sketch interactively, and with the assistance of the end users. Our demo extends this functionality to let crowd workers "demonstrate" the canvas changes that are needed for a behavior and refine their demonstrations to improve the fidelity of interactive behaviors. The system then lets workers "remix" these behaviors to make creating future behaviors more efficient.