Kihoon Son

h-index: 8

Papers: 6

Citations: 58

Novelty: 43%

AI Score: 46

Ranked #37,456 of 194,257 authors (top 19%); #187 in HC (top 7%)

6 Papers

5.6 | HC | Mar 2

"When to Hand Off, When to Work Together": Expanding Human-Agent Co-Creative Collaboration through Concurrent Interaction

Kihoon Son, Hyewon Lee, DaEun Choi et al.

Human collaborators coordinate dynamically through process visibility and workspace awareness, yet AI agents typically either provide only final outputs or expose read-only execution processes (e.g., planning, reasoning) without interpreting concurrent user actions on shared artifacts. Building on mixed-initiative interaction principles, we explore whether agents can achieve collaborative context awareness -- interpreting concurrent user actions on shared artifacts and adapting in real-time. Study 1 (N=10 professional designers) revealed that process visibility enabled reasoning about agent actions but exposed conflicts when agents could not distinguish feedback from independent work. We developed CLEO, which interprets collaborative intent and adapts in real-time. Study 2 (N=10, two-day with stimulated recall interviews) analyzed 214 turns, identifying five action patterns, six triggers, and four enabling factors explaining when designers choose delegation (70.1%), direction (28.5%), or concurrent work (31.8%). We present a decision model with six interaction loops, design implications, and an annotated dataset.

9.7 | HC | Apr 13

Contexty: Capturing and Organizing In-situ Thoughts for Context-Aware AI Support

Yoonsu Kim, Chanbin Park, Kihoon Son et al.

During complex knowledge work, people engage in iterative sensemaking: interpreting information, connecting ideas, and refining their understanding. Yet in current human-AI collaboration, these cognitive processes are difficult to share and organize for AI. They arise in situ and are rarely captured without interrupting the task, and even when expressed, remain scattered or reduced to system-generated summaries that fail to reflect users' cognitive processes. We address this challenge by enabling AI context that is grounded in users' cognitive traces and can be directly inspected and revised by the user. We first explore this through a probe system that supports in-situ snippet memoing, allowing users to easily share their cognitive moves. Our study (N=10) highlights the value of capturing such context and the challenge of organizing it once accumulated. We then present Contexty, which supports users in inspecting and refining these contexts to better reflect their understanding of the task. Our evaluation (N=12) showed that Contexty improved task awareness, thought structuring, and users' sense of authorship and control, with participants preferring snippet-grounded AI responses over non-grounded ones (78.1%). We discuss how capturing and organizing users' cognitive context enables AI as a context-aware collaborator while preserving user agency.

11.5 | HC | Sep 18, 2025

ClearFairy: Capturing Creative Workflows through Decision Structuring, In-Situ Questioning, and Rationale Inference

Kihoon Son, DaEun Choi, Tae Soo Kim et al.

Capturing professionals' decision-making in creative workflows is essential for reflection, collaboration, and knowledge sharing, yet existing methods often leave rationales incomplete and implicit decisions hidden. To address this, we present the CLEAR framework, which structures reasoning into cognitive decision steps: linked units of actions, artifacts, and self-explanations that make decisions traceable. Building on this framework, we introduce ClearFairy, a think-aloud AI assistant for UI design that detects weak explanations, asks lightweight clarifying questions, and infers missing rationales to ease the knowledge-sharing burden. In a study with twelve creative professionals, 85% of ClearFairy's inferred rationales were accepted, increasing strong explanations from 14% to over 83% of decision steps without adding cognitive demand. The captured steps also enhanced generative AI agents in Figma, yielding next-action predictions better aligned with professionals and producing more coherent design outcomes. For future research on human knowledge-grounded creative AI agents, we release a dataset of 417 captured decision steps.

3.6 | CV | Nov 25, 2025

CANVAS: A Benchmark for Vision-Language Models on Tool-Based User Interface Design

Daeheon Jeong, Seoyeon Byun, Kihoon Son et al.

User interface (UI) design is an iterative process in which designers progressively refine their work with design software such as Figma or Sketch. Recent advances in vision language models (VLMs) with tool invocation suggest these models can operate design software to edit a UI design through iteration. Understanding and enhancing this capacity is important, as it highlights VLMs' potential to collaborate with designers within conventional software. However, as no existing benchmark evaluates tool-based design performance, the capacity remains unknown. To address this, we introduce CANVAS, a benchmark for VLMs on tool-based user interface design. Our benchmark contains 598 tool-based design tasks paired with ground-truth references sampled from 3.3K mobile UI designs across 30 function-based categories (e.g., onboarding, messaging). In each task, a VLM updates the design step-by-step through context-based tool invocations (e.g., create a rectangle as a button background), linked to design software. Specifically, CANVAS incorporates two task types: (i) design replication evaluates the ability to reproduce a whole UI screen; (ii) design modification evaluates the ability to modify a specific part of an existing screen. Results suggest that leading models exhibit more strategic tool invocations, improving design quality. Furthermore, we identify common error patterns models exhibit, guiding future work in enhancing tool-based design capabilities.

9.6 | HC | May 9, 2024

Beyond Prompts: Learning from Human Communication for Enhanced AI Intent Alignment

Yoonsu Kim, Kihoon Son, Seoyoung Kim et al.

AI intent alignment, ensuring that AI produces outcomes as intended by users, is a critical challenge in human-AI interaction. The emergence of generative AI, including LLMs, has intensified the significance of this problem, as interactions increasingly involve users specifying desired results for AI systems. In order to support better AI intent alignment, we aim to explore human strategies for intent specification in human-human communication. By studying and comparing human-human and human-LLM communication, we identify key strategies that can be applied to the design of AI systems that are more effective at understanding and aligning with user intent. This study aims to advance toward a human-centered AI system by bringing together human communication strategies for the design of AI systems.

17.6 | HC | May 9, 2024

One vs. Many: Comprehending Accurate Information from Multiple Erroneous and Inconsistent AI Generations

Yoonjoo Lee, Kihoon Son, Tae Soo Kim et al.

As Large Language Models (LLMs) are nondeterministic, the same input can generate different outputs, some of which may be incorrect or hallucinated. If run again, the LLM may correct itself and produce the correct answer. Unfortunately, most LLM-powered systems resort to single results which, correct or not, users accept. Having the LLM produce multiple outputs may help identify disagreements or alternatives. However, it is not obvious how the user will interpret conflicts or inconsistencies. To this end, we investigate how users perceive the AI model and comprehend the generated information when they receive multiple, potentially inconsistent, outputs. Through a preliminary study, we identified five types of output inconsistencies. Based on these categories, we conducted a study (N=252) in which participants were given one or more LLM-generated passages answering an information-seeking question. We found that inconsistency within multiple LLM-generated outputs lowered the participants' perceived AI capacity, while also increasing their comprehension of the given information. Specifically, we observed that this positive effect of inconsistencies was most significant for participants who read two passages, compared to those who read three. Based on these findings, we present design implications: instead of regarding LLM output inconsistencies as a drawback, systems can reveal potential inconsistencies to transparently indicate the limitations of these models and promote critical LLM usage.