Daniel Buschek

h-index 54

36 papers

2,632 citations

Novelty 34%

AI Score 53

Ranked #32,142 of 201,326 authors (top 16%) · #104 in HC (top 4%)

36 Papers

HC · Feb 1, 2023

Co-Writing with Opinionated Language Models Affects Users' Views

Maurice Jakesch, Advait Bhat, Daniel Buschek et al. · microsoft-research

If large language models like GPT-3 preferably produce a particular point of view, they may influence people's opinions on an unknown scale. This study investigates whether a language-model-powered writing assistant that generates some opinions more often than others impacts what users write - and what they think. In an online experiment, we asked participants (N=1,506) to write a post discussing whether social media is good for society. Treatment group participants used a language-model-powered writing assistant configured to argue that social media is good or bad for society. Participants then completed a social media attitude survey, and independent judges (N=500) evaluated the opinions expressed in their writing. Using the opinionated language model affected the opinions expressed in participants' writing and shifted their opinions in the subsequent attitude survey. We discuss the wider implications of our results and argue that the opinions built into AI language technologies need to be monitored and engineered more carefully.

HC · Sep 3, 2022

How to Prompt? Opportunities and Challenges of Zero- and Few-Shot Learning for Human-AI Interaction in Creative Applications of Generative Models

Hai Dang, Lukas Mecke, Florian Lehmann et al.

Deep generative models have the potential to fundamentally change the way we create high-fidelity digital content but are often hard to control. Prompting a generative model is a promising recent development that in principle enables end-users to creatively leverage zero-shot and few-shot learning to assign new tasks to an AI ad-hoc, simply by writing them down. However, for the majority of end-users writing effective prompts is currently largely a trial and error process. To address this, we discuss the key opportunities and challenges for interactive creative applications that use prompting as a new paradigm for Human-AI interaction. Based on our analysis, we propose four design goals for user interfaces that support prompting. We illustrate these with concrete UI design sketches, focusing on the use case of creative writing. The research community in HCI and AI can take these as starting points to develop adequate user interfaces for models capable of zero- and few-shot learning.

HC · Aug 19, 2022

Beyond Text Generation: Supporting Writers with Continuous Automatic Text Summaries

Hai Dang, Karim Benharrak, Florian Lehmann et al.

We propose a text editor to help users plan, structure and reflect on their writing process. It provides continuously updated paragraph-wise summaries as margin annotations, using automatic text summarization. Summary levels range from full text, to selected (central) sentences, down to a collection of keywords. To understand how users interact with this system during writing, we conducted two user studies (N=4 and N=8) in which people wrote analytic essays about a given topic and article. As a key finding, the summaries gave users an external perspective on their writing and helped them to revise the content and scope of their drafted paragraphs. People further used the tool to quickly gain an overview of the text and developed strategies to integrate insights from the automated summaries. More broadly, this work explores and highlights the value of designing AI tools for writers, with Natural Language Processing (NLP) capabilities that go beyond direct text generation and correction.

HC · Mar 6, 2023

Choice Over Control: How Users Write with Large Language Models using Diegetic and Non-Diegetic Prompting

Hai Dang, Sven Goller, Florian Lehmann et al.

We propose a conceptual perspective on prompts for Large Language Models (LLMs) that distinguishes between (1) diegetic prompts (part of the narrative, e.g. "Once upon a time, I saw a fox..."), and (2) non-diegetic prompts (external, e.g. "Write about the adventures of the fox."). With this lens, we study how 129 crowd workers on Prolific write short texts with different user interfaces (1 vs 3 suggestions, with/out non-diegetic prompts; implemented with GPT-3): When the interface offered multiple suggestions and provided an option for non-diegetic prompting, participants preferred choosing from multiple suggestions over controlling them via non-diegetic prompts. When participants provided non-diegetic prompts it was to ask for inspiration, topics or facts. Single suggestions in particular were guided both with diegetic and non-diegetic information. This work informs human-AI interaction with generative models by revealing that (1) writing non-diegetic prompts requires effort, (2) people combine diegetic and non-diegetic prompting, and (3) they use their draft (i.e. diegetic information) and suggestion timing to strategically guide LLMs.

HC · Sep 19, 2023

Writer-Defined AI Personas for On-Demand Feedback Generation

Karim Benharrak, Tim Zindulka, Florian Lehmann et al.

Compelling writing is tailored to its audience. This is challenging, as writers may struggle to empathize with readers, get feedback in time, or gain access to the target group. We propose a concept that generates on-demand feedback, based on writer-defined AI personas of any target audience. We explore this concept with a prototype (using GPT-3.5) in two user studies (N=5 and N=11): Writers appreciated the concept and strategically used personas for getting different perspectives. The feedback was seen as helpful and inspired revisions of text and personas, although it was often verbose and unspecific. We discuss the impact of on-demand feedback, the limited representativity of contemporary AI systems, and further ideas for defining AI personas. This work contributes to the vision of supporting writers with AI by expanding the socio-technical perspective in AI tool design: To empower creators, we also need to keep in mind their relationship to an audience.

HC · Mar 6, 2023

The AI Ghostwriter Effect: When Users Do Not Perceive Ownership of AI-Generated Text But Self-Declare as Authors

Fiona Draxler, Anna Werner, Florian Lehmann et al.

Human-AI interaction in text production increases complexity in authorship. In two empirical studies (n1 = 30 & n2 = 96), we investigate authorship and ownership in human-AI collaboration for personalized language generation. We show an AI Ghostwriter Effect: Users do not consider themselves the owners and authors of AI-generated text but refrain from publicly declaring AI authorship. Personalization of AI-generated texts did not impact the AI Ghostwriter Effect, and higher levels of participants' influence on texts increased their sense of ownership. Participants were more likely to attribute ownership to supposedly human ghostwriters than AI ghostwriters, resulting in a higher ownership-authorship discrepancy for human ghostwriters. Rationalizations for authorship in AI ghostwriters and human ghostwriters were similar. We discuss how our findings relate to psychological ownership and human-AI interaction to lay the foundations for adapting authorship frameworks and user interfaces in AI in text-generation tasks.

HC · Aug 1, 2022

Suggestion Lists vs. Continuous Generation: Interaction Design for Writing with Generative Models on Mobile Devices Affect Text Length, Wording and Perceived Authorship

Florian Lehmann, Niklas Markert, Hai Dang et al.

Neural language models have the potential to support human writing. However, questions remain on their integration and influence on writing and output. To address this, we designed and compared two user interfaces for writing with AI on mobile devices, which manipulate levels of initiative and control: 1) Writing with continuously generated text, the AI adds text word-by-word and user steers. 2) Writing with suggestions, the AI suggests phrases and user selects from a list. In a supervised online study (N=18), participants used these prototypes and a baseline without AI. We collected touch interactions, ratings on inspiration and authorship, and interview data. With AI suggestions, people wrote less actively, yet felt they were the author. Continuously generated text reduced this perceived authorship, yet increased editing behavior. In both designs, AI increased text length and was perceived to influence wording. Our findings add new empirical evidence on the impact of UI design decisions on user experience and output with co-creative systems.

HC · May 25

Explaining Too Much? Understanding How Large Language Model Reasoning Traces Influence Performance and Metacognition

Daniela Fernandes, Daniel Buschek, Lev Tankelevitch et al.

Large Language Model interfaces are increasingly verbose, exposing intermediate reasoning traces alongside final answers. Traces are framed as transparency mechanisms, yet it is unclear how people use them to solve problems. We report a preregistered between-subjects study (N = 559) in which participants solved ten LSAT-style reasoning problems under one of three conditions: an Answer-only baseline, a Full-trace revealed before the answer, and a Summary-trace presented alongside the answer. Summaries preserved task performance at the no-trace baseline while significantly elevating trust and hedonic appeal, establishing that trace exposure shifts subjective appraisal of the interaction without bringing performance benefits. Under an open-weight reasoning model exposing verbose intermediate output, full traces additionally impaired performance relative to the answer-only baseline. Across all conditions, participants substantially overestimated their performance, and no trace format supported calibrated self-evaluation. Further analysis indicates that hedonic appeal, not trust, carries the indirect path to overestimation, consistent with a processing-fluency account. Reasoning traces are best understood as user-facing interface artifacts rather than transparent windows into model cognition, and calibration is unlikely to emerge from the traces themselves and may best be scaffolded by interactions that elicit users' own reasoning first.

HC · Sep 19, 2024

Exploring the Lands Between: A Method for Finding Differences between AI-Decisions and Human Ratings through Generated Samples

Lukas Mecke, Daniel Buschek, Uwe Gruenefeld et al.

Many important decisions in our everyday lives, such as authentication via biometric models, are made by Artificial Intelligence (AI) systems. These can be in poor alignment with human expectations, and testing them on clear-cut existing data may not be enough to uncover those cases. We propose a method to find samples in the latent space of a generative model, designed to be challenging for a decision-making model with regard to matching human expectations. By presenting those samples to both the decision-making model and human raters, we can identify areas where its decisions align with human intuition and where they contradict it. We apply this method to a face recognition model and collect a dataset of 11,200 human ratings from 100 participants. We discuss findings from our dataset and how our approach can be used to explore the performance of AI models in different contexts and for different user groups.

HC · May 15

Conversations in Space: Structuring Non-Linear LLM Interactions on a Canvas

Rifat Mehreen Amin, Alperen Adatepe, Daniela Fernandes et al.

Conversational interfaces powered by large language models (LLMs) are widely used for ideation and analysis, yet their linear structure limits exploration of alternatives and management of long-running interactions. We present CanvasConvo, a conversational interface concept that transforms linear chat into a branching conversation tree embedded in a spatial canvas. CanvasConvo enables users to explore what-if scenarios by branching directly from conversational content, supporting parallel development of alternative directions. These branches are visualized on a canvas while remaining integrated with a familiar chat interface, allowing users to switch between linear and non-linear interaction. Features such as timeline-based navigation, automatic tagging and summarization, and context-aware controls (e.g., goals, reusable prompts) support structured interaction and continuity. We evaluated CanvasConvo in a 5-7 day field study with 24 participants. Our findings highlight how non-linear conversational structures support exploratory workflows and different interactions in LLM-based work.

HC · Mar 21, 2024

A Design Space for Intelligent and Interactive Writing Assistants

Mina Lee, Katy Ilonka Gero, John Joon Young Chung et al. · allen-ai, deepmind

In our era of rapid technological advancement, the research landscape for writing assistants has become increasingly fragmented across various research communities. We seek to address this challenge by proposing a design space as a structured way to examine and explore the multidimensional space of intelligent and interactive writing assistants. Through a large community collaboration, we explore five aspects of writing assistants: task, user, technology, interaction, and ecosystem. Within each aspect, we define dimensions (i.e., fundamental components of an aspect) and codes (i.e., potential options for each dimension) by systematically reviewing 115 papers. Our design space aims to offer researchers and designers a practical tool to navigate, comprehend, and compare the various possibilities of writing assistants, and aid in the envisioning and design of new writing assistants.

HC · Feb 4, 2022 · Code

SummaryLens -- A Smartphone App for Exploring Interactive Use of Automated Text Summarization in Everyday Life

Karim Benharrak, Florian Lehmann, Hai Dang et al.

We present SummaryLens, a concept and prototype for a mobile tool that leverages automated text summarization to enable users to quickly scan and summarize physical text documents. We further combine this with a text-to-speech system to read out the summary on demand. With this concept, we propose and explore a concrete application case of bringing ongoing progress in AI and Natural Language Processing to a broad audience with interactive use cases in everyday life. Based on our implemented features, we describe a set of potential usage scenarios and benefits, including support for low-vision, low-literate and dyslexic users. A first usability study shows that the interactive use of automated text summarization in everyday life has noteworthy potential. We make the prototype available as an open-source project to facilitate further research on such tools.

HC · Mar 16, 2025

CorpusStudio: Surfacing Emergent Patterns in a Corpus of Prior Work while Writing

Hai Dang, Chelse Swoopes, Daniel Buschek et al. · harvard

Many communities, including the scientific community, develop implicit writing norms. Understanding them is crucial for effective communication with that community. Writers gradually develop an implicit understanding of norms by reading papers and receiving feedback on their writing. However, it is difficult to both externalize this knowledge and apply it to one's own writing. We propose two new writing support concepts that reify document and sentence-level patterns in a given text corpus: (1) an ordered distribution over section titles and (2) given the user's draft and cursor location, many retrieved contextually relevant sentences. Recurring words in the latter are algorithmically highlighted to help users see any emergent norms. Study results (N=16) show that participants revised the structure and content using these concepts, gaining confidence in aligning with or breaking norms after reviewing many examples. These results demonstrate the value of reifying distributions over other authors' writing choices during the writing process.

HC · Feb 11, 2025

Exploring Mobile Touch Interaction with Large Language Models

Tim Zindulka, Jannek Sekowski, Florian Lehmann et al.

Interacting with Large Language Models (LLMs) for text editing on mobile devices currently requires users to break out of their writing environment and switch to a conversational AI interface. In this paper, we propose to control the LLM via touch gestures performed directly on the text. We first chart a design space that covers fundamental touch input and text transformations. In this space, we then concretely explore two control mappings: spread-to-generate and pinch-to-shorten, with visual feedback loops. We evaluate this concept in a user study (N=14) that compares three feedback designs: no visualisation, text length indicator, and length + word indicator. The results demonstrate that touch-based control of LLMs is both feasible and user-friendly, with the length + word indicator proving most effective for managing text generation. This work lays the foundation for further research into gesture-based interaction with LLMs on touch devices.

HC · Feb 10, 2025

Content-Driven Local Response: Supporting Sentence-Level and Message-Level Mobile Email Replies With and Without AI

Tim Zindulka, Sven Goller, Florian Lehmann et al.

Mobile emailing demands efficiency in diverse situations, which motivates the use of AI. However, generated text does not always reflect how people want to respond. This challenges users with AI involvement tradeoffs not yet considered in email UIs. We address this with a new UI concept called Content-Driven Local Response (CDLR), inspired by microtasking. This allows users to insert responses into the email by selecting sentences, which additionally serves to guide AI suggestions. The concept supports combining AI for local suggestions and message-level improvements. Our user study (N=126) compared CDLR with manual typing and full reply generation. We found that CDLR supports flexible workflows with varying degrees of AI involvement, while retaining the benefits of reduced typing and errors. This work contributes a new approach to integrating AI capabilities: By redesigning the UI for workflows with and without AI, we can empower users to dynamically adjust AI involvement.

HC · Mar 27, 2025

Composable Prompting Workspaces for Creative Writing: Exploration and Iteration Using Dynamic Widgets

Rifat Mehreen Amin, Oliver Hans Kühle, Daniel Buschek et al.

Generative AI models offer many possibilities for text creation and transformation. Current graphical user interfaces (GUIs) for prompting them lack support for iterative exploration, as they do not represent prompts as actionable interface objects. We propose the concept of a composable prompting canvas for text exploration and iteration using dynamic widgets. Users generate widgets through system suggestions, prompting, or manually to capture task-relevant facets that affect the generated text. In a comparative study with a baseline (conversational UI), 18 participants worked on two writing tasks, creating diverse prompting environments with custom widgets and spatial layouts. They reported having more control over the generated text and preferred our system over the baseline. Our design significantly outperformed the baseline on the Creativity Support Index, and participants felt the results were worth the effort. This work highlights the need for GUIs that support user-driven customization and (re-)structuring to increase both the flexibility and efficiency of prompting.

HC · Apr 14, 2024

Deceptive Patterns of Intelligent and Interactive Writing Assistants

Karim Benharrak, Tim Zindulka, Daniel Buschek

Large Language Models have become an integral part of new intelligent and interactive writing assistants. Many are offered commercially with a chatbot-like UI, such as ChatGPT, and provide little information about their inner workings. This makes this new type of widespread system a potential target for deceptive design patterns. For example, such assistants might exploit hidden costs by providing guidance up until a certain point before asking for a fee to see the rest. As another example, they might sneak unwanted content/edits into longer generated or revised text pieces (e.g. to influence the expressed opinion). With these and other examples, we conceptually transfer several deceptive patterns from the literature to the new context of AI writing assistants. Our goal is to raise awareness and encourage future research into how the UI and interaction design of such systems can impact people and their writing.

HC · Sep 15, 2025

The AI Memory Gap: Users Misremember What They Created With AI or Without

Tim Zindulka, Sven Goller, Daniela Fernandes et al.

As large language models (LLMs) become embedded in interactive text generation, disclosure of AI as a source depends on people remembering which ideas or texts came from themselves and which were created with AI. We investigate how accurately people remember the source of content when using AI. In a pre-registered experiment, 184 participants generated and elaborated on ideas both unaided and with an LLM-based chatbot. One week later, they were asked to identify the source (noAI vs withAI) of these ideas and texts. Our findings reveal a significant gap in memory: After AI use, the odds of correct attribution dropped, with the steepest decline in mixed human-AI workflows, where either the idea or elaboration was created with AI. We validated our results using a computational model of source memory. Discussing broader implications, we highlight the importance of considering source confusion in the design and use of interactive text generation technologies.

HC · Sep 15, 2025

Collaborative Document Editing with Multiple Users and AI Agents

Florian Lehmann, Krystsina Shauchenka, Daniel Buschek

Current AI writing support tools are largely designed for individuals, complicating collaboration when co-writers must leave the shared workspace to use AI and then communicate and reintegrate results. We propose integrating AI agents directly into collaborative writing environments. Our prototype makes AI use transparent and customisable through two new shared objects: agent profiles and tasks. Agent responses appear in the familiar comment feature. In a user study (N=30), 14 teams worked on writing projects during one week. Interaction logs and interviews show that teams incorporated agents into existing norms of authorship, control, and coordination, rather than treating them as team members. Agent profiles were viewed as personal territory, while created agents and outputs became shared resources. We discuss implications for team-based AI interaction, highlighting opportunities and boundaries for treating AI as a shared resource in collaborative work.

HC · Jun 4, 2025

PromptCanvas: Composable Prompting Workspaces Using Dynamic Widgets for Exploration and Iteration in Creative Writing

Rifat Mehreen Amin, Oliver Hans Kühle, Daniel Buschek et al.

We introduce PromptCanvas, a concept that transforms prompting into a composable, widget-based experience on an infinite canvas. Users can generate, customize, and arrange interactive widgets representing various facets of their text, offering greater control over AI-generated content. PromptCanvas allows widget creation through system suggestions, user prompts, or manual input, providing a flexible environment tailored to individual needs. This enables deeper engagement with the creative process. In a lab study with 18 participants, PromptCanvas outperformed a traditional conversational UI on the Creativity Support Index. Participants found that it reduced cognitive load, with lower mental demand and frustration. Qualitative feedback revealed that the visual organization of thoughts and easy iteration encouraged new perspectives and ideas. A follow-up field study (N=10) confirmed these results, showcasing the potential of dynamic, customizable interfaces in improving collaborative writing with AI.

HC · Feb 2, 2022

GANSlider: How Users Control Generative Models for Images using Multiple Sliders with and without Feedforward Information

Hai Dang, Lukas Mecke, Daniel Buschek

We investigate how multiple sliders with and without feedforward visualizations influence users' control of generative models. In an online study (N=138), we collected a dataset of people interacting with a generative adversarial network (StyleGAN2) in an image reconstruction task. We found that more control dimensions (sliders) significantly increase task difficulty and user actions. Visual feedforward partly mitigates this by enabling more goal-directed interaction. However, we found no evidence of faster or more accurate task performance. This indicates a tradeoff between feedforward detail and implied cognitive costs, such as attention. Moreover, we found that visualizations alone are not always sufficient for users to understand individual control dimensions. Our study quantifies fundamental UI design factors and resulting interaction behavior in this context, revealing opportunities for improvement in the UI design for interactive applications of generative models. We close by discussing design directions and further aspects.

HC · Jan 18, 2022

Examining Autocompletion as a Basic Concept for Interaction with Generative AI

Florian Lehmann, Daniel Buschek

Autocompletion is an approach that extends and continues partial user input. We propose to interpret autocompletion as a basic interaction concept in human-AI interaction. We first describe the concept of autocompletion and dissect its user interface and interaction elements, using the well-established textual autocompletion in search engines as an example. We then highlight how these elements reoccur in other application domains, such as code completion, GUI sketching, and layouting. This comparison and transfer highlights an inherent role of such intelligent systems to extend and complete user input, in particular useful for designing interactions with and for generative AI. We reflect on and discuss our conceptual analysis of autocompletion to provide inspiration and a conceptual lens on current challenges in designing for human-AI interaction.

HC · Sep 16, 2021

Comparing Concepts for Embedding Second-Language Vocabulary Acquisition into Everyday Smartphone Interactions

Christina Schneegass, Sophia Sigethy, Malin Eiband et al.

We present a three-week within-subject field study comparing three mobile language learning (MLL) applications with varying levels of integration into everyday smartphone interactions: We designed a novel (1) UnlockApp that presents a vocabulary task with each authentication event, nudging users towards short frequent learning sessions. We compare it with a (2) NotificationApp that displays vocabulary tasks in a push notification in the status bar, which is always visible but learning needs to be user-initiated, and a (3) StandardApp that requires users to start in-app learning actively. Our study is the first to directly compare these embedding concepts for MLL, showing that integrating vocabulary learning into everyday smartphone interactions via UnlockApp and NotificationApp increases the number of answers. However, users show individual subjective preferences. Based on our results, we discuss the trade-off between higher content exposure and disturbance, and the related challenges and opportunities of embedding learning seamlessly into everyday mobile interactions.

HC · Jun 23, 2021

CharacterChat: Supporting the Creation of Fictional Characters through Conversation and Progressive Manifestation with a Chatbot

Oliver Schmitt, Daniel Buschek

We present CharacterChat, a concept and chatbot to support writers in creating fictional characters. Concretely, writers progressively turn the bot into their imagined character through conversation. We iteratively developed CharacterChat in a user-centred approach, starting with a survey on character creation with writers (N=30), followed by two qualitative user studies (N=7 and N=8). Our prototype combines two modes: (1) Guided prompts help writers define character attributes (e.g. User: "Your name is Jane."), including suggestions for attributes (e.g. Bot: "What is my main motivation?") and values, realised as a rule-based system with a concept network. (2) Open conversation with the chatbot helps writers explore their character and get inspiration, realised with a language model that takes into account the defined character attributes. Our user studies reveal benefits particularly for early stages of character creation, and challenges due to limited conversational capabilities. We conclude with lessons learned and ideas for future work.

HC · Apr 1, 2021

Nine Potential Pitfalls when Designing Human-AI Co-Creative Systems

Daniel Buschek, Lukas Mecke, Florian Lehmann et al.

This position paper examines potential pitfalls on the way towards achieving human-AI co-creation with generative models in a way that is beneficial to the users' interests. In particular, we collected a set of nine potential pitfalls, based on the literature and our own experiences as researchers working at the intersection of HCI and AI. We illustrate each pitfall with examples and suggest ideas for addressing it. Reflecting on all pitfalls, we discuss and conclude with implications for future research directions. With this collection, we hope to contribute to a critical and constructive discussion on the roles of humans and AI in co-creative interactions, with an eye on related assumptions and potential side-effects for creative practices and beyond.

HC · Mar 8, 2021

Modeling Web Browsing Behavior across Tabs and Websites with Tracking and Prediction on the Client Side

Changkun Ou, Daniel Buschek, Malin Eiband et al.

Clickstreams on individual websites have been studied for decades to gain insights into user interests and to improve website experiences. This paper proposes and examines a novel sequence modeling approach for web clickstreams, that also considers multi-tab branching and backtracking actions across websites to capture the full action sequence of a user while browsing. All of this is done using machine learning on the client side to obtain a more comprehensive view and at the same time preserve privacy. We evaluate our formalism with a model trained on data collected in a user study with three different browsing tasks based on different human information seeking strategies from psychological literature. Our results show that the model can successfully distinguish between browsing behaviors and correctly predict future actions. A subsequent qualitative analysis identified five common web browsing patterns from our collected behavior data, which help to interpret the model. More generally, this illustrates the power of overparameterization in ML and offers a new way of modeling, reasoning with, and prediction of observable sequential human interaction behaviors.

HC · Mar 1, 2021

GestureMap: Supporting Visual Analytics and Quantitative Analysis of Motion Elicitation Data by Learning 2D Embeddings

Hai Dang, Daniel Buschek

This paper presents GestureMap, a visual analytics tool for gesture elicitation which directly visualises the space of gestures. Concretely, a Variational Autoencoder embeds gestures recorded as 3D skeletons on an interactive 2D map. GestureMap further integrates three computational capabilities to connect exploration to quantitative measures: Leveraging DTW Barycenter Averaging (DBA), we compute average gestures to 1) represent gesture groups at a glance; 2) compute a new consensus measure (variance around average gesture); and 3) cluster gestures with k-means. We evaluate GestureMap and its concepts with eight experts and an in-depth analysis of published data. Our findings show how GestureMap facilitates exploring large datasets and helps researchers to gain a visual understanding of elicited gesture spaces. It further opens new directions, such as comparing elicitations across studies. We discuss implications for elicitation studies and research, and opportunities to extend our approach to additional tasks in gesture elicitation.

HC · Feb 26, 2021

Eliciting and Analysing Users' Envisioned Dialogues with Perfect Voice Assistants

Sarah Theres Völkel, Daniel Buschek, Malin Eiband et al.

We present a dialogue elicitation study to assess how users envision conversations with a perfect voice assistant (VA). In an online survey, N=205 participants were prompted with everyday scenarios, and wrote the lines of both user and VA in dialogues that they imagined as perfect. We analysed the dialogues with text analytics and qualitative analysis, including number of words and turns, social aspects of conversation, implied VA capabilities, and the influence of user personality. The majority envisioned dialogues with a VA that is interactive and not purely functional; it is smart, proactive, and has knowledge about the user. Attitudes diverged regarding the assistant's role as well as it expressing humour and opinions. An exploratory analysis suggested a relationship with personality for these aspects, but correlations were low overall. We discuss implications for research and design of future VAs, underlining the vision of enabling conversational UIs, rather than single command "Q&As".

CL · Feb 26, 2021

Methods for the Design and Evaluation of HCI+NLP Systems

Hendrik Heuer, Daniel Buschek

HCI and NLP traditionally focus on different evaluation methods. While HCI involves a small number of people directly and deeply, NLP traditionally relies on standardized benchmark evaluations that involve a larger number of people indirectly. We present five methodological proposals at the intersection of HCI and NLP and situate them in the context of ML-based NLP models. Our goal is to foster interdisciplinary collaboration and progress in both fields by emphasizing what the fields can learn from each other.

HC · Jan 22, 2021

The Impact of Multiple Parallel Phrase Suggestions on Email Input and Composition Behaviour of Native and Non-Native English Writers

Daniel Buschek, Martin Zürn, Malin Eiband

We present an in-depth analysis of the impact of multi-word suggestion choices from a neural language model on user behaviour regarding input and text composition in email writing. Our study for the first time compares different numbers of parallel suggestions, and use by native and non-native English writers, to explore a trade-off of "efficiency vs ideation", emerging from recent literature. We built a text editor prototype with a neural language model (GPT-2), refined in a prestudy with 30 people. In an online study (N=156), people composed emails in four conditions (0/1/3/6 parallel suggestions). Our results reveal (1) benefits for ideation, and costs for efficiency, when suggesting multiple phrases; (2) that non-native speakers benefit more from more suggestions; and (3) further insights into behaviour patterns. We discuss implications for research, the design of interactive suggestion systems, and the vision of supporting writers with AI instead of replacing them.

HC · Mar 13, 2020

Developing a Personality Model for Speech-based Conversational Agents Using the Psycholexical Approach

Sarah Theres Völkel, Ramona Schödel, Daniel Buschek et al.

We present the first systematic analysis of personality dimensions developed specifically to describe the personality of speech-based conversational agents. Following the psycholexical approach from psychology, we first report on a new multi-method approach to collect potentially descriptive adjectives from 1) a free description task in an online survey (228 unique descriptors), 2) an interaction task in the lab (176 unique descriptors), and 3) a text analysis of 30,000 online reviews of conversational agents (Alexa, Google Assistant, Cortana) (383 unique descriptors). We aggregate the results into a set of 349 adjectives, which are then rated by 744 people in an online survey. A factor analysis reveals that the commonly used Big Five model for human personality does not adequately describe agent personality. As an initial step to developing a personality model, we propose alternative dimensions and discuss implications for the design of agent personalities, personality-aware personalisation, and future research.

HC · Mar 6, 2020

Heartbeats in the Wild: A Field Study Exploring ECG Biometrics in Everyday Life

Florian Lehmann, Daniel Buschek

This paper reports on an in-depth study of electrocardiogram (ECG) biometrics in everyday life. We collected ECG data from 20 people over a week, using a non-medical chest tracker. We evaluated user identification accuracy in several scenarios and observed equal error rates of 9.15% to 21.91%, heavily depending on 1) the number of days used for training, and 2) the number of heartbeats used per identification decision. We find that ECG biometrics can work in the wild but are less robust than expected based on the literature, highlighting that previous lab studies obtained highly optimistic results with regard to real-life deployments. We attribute this to noise due to changing body postures and states as well as interrupted measurements. We conclude with implications for future research and the design of ECG biometrics systems for real-world deployments, including critical reflections on privacy.
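For readers unfamiliar with the metric reported above, the equal error rate (EER) is the operating point at which the false acceptance rate equals the false rejection rate. The sketch below shows one common way to estimate it from match scores; it is an illustration only, not the evaluation code of the study. The function name `equal_error_rate` and the assumption that higher scores mean "more likely the claimed user" are hypothetical.

```python
import numpy as np

def equal_error_rate(genuine, impostor):
    """Estimate the EER from genuine and impostor match scores.

    Assumes higher scores indicate a better match. Every observed score is
    tried as a decision threshold, and the threshold where the false
    acceptance and false rejection rates are closest determines the estimate.
    """
    genuine = np.asarray(genuine, dtype=float)
    impostor = np.asarray(impostor, dtype=float)
    best_gap, best_eer = None, None
    for t in np.unique(np.concatenate([genuine, impostor])):
        far = np.mean(impostor >= t)  # impostor attempts wrongly accepted
        frr = np.mean(genuine < t)    # genuine attempts wrongly rejected
        gap = abs(far - frr)
        if best_gap is None or gap < best_gap:
            best_gap, best_eer = gap, (far + frr) / 2
    return float(best_eer)

# Perfectly separable scores versus fully overlapping scores.
eer_separable = equal_error_rate([0.9, 0.8], [0.1, 0.2])
eer_overlap = equal_error_rate([0.2, 0.8], [0.2, 0.8])
```

A lower EER indicates better separation between the enrolled user and other people; averaging scores over more heartbeats per decision is one way such separation could improve.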

HC · Mar 6, 2020

What is "Intelligent" in Intelligent User Interfaces? A Meta-Analysis of 25 Years of IUI

Sarah Theres Völkel, Christina Schneegass, Malin Eiband et al.

This reflection paper takes the 25th IUI conference milestone as an opportunity to analyse in detail the understanding of intelligence in the community: Despite the focus on intelligent UIs, it has remained elusive what exactly renders an interactive system or user interface "intelligent", also in the fields of HCI and AI at large. We follow a bottom-up approach to analyse the emergent meaning of intelligence in the IUI community: In particular, we apply text analysis to extract all occurrences of "intelligent" in all IUI proceedings. We manually review these with regard to three main questions: 1) What is deemed intelligent? 2) How (else) is it characterised? and 3) What capabilities are attributed to an intelligent entity? We discuss the community's emerging implicit perspective on characteristics of intelligence in intelligent user interfaces and conclude with ideas for stating one's own understanding of intelligence more explicitly.

HC · Feb 4, 2020

A Method and Analysis to Elicit User-reported Problems in Intelligent Everyday Applications

Malin Eiband, Sarah Theres Völkel, Daniel Buschek et al.

The complex nature of intelligent systems motivates work on supporting users during interaction, for example through explanations. However, as of yet, there is little empirical evidence in regard to specific problems users face when applying such systems in everyday situations. This paper contributes a novel method and analysis to investigate such problems as reported by users: We analysed 45,448 reviews of four apps on the Google Play Store (Facebook, Netflix, Google Maps and Google Assistant) with sentiment analysis and topic modelling to reveal problems during interaction that can be attributed to the apps' algorithmic decision-making. We enriched this data with users' coping and support strategies through a follow-up online survey (N=286). In particular, we found problems and strategies related to content, algorithm, user choice, and feedback. We discuss corresponding implications for designing user support, highlighting the importance of user control and explanations of output, rather than processes.

HC · Jan 22, 2020

How to Support Users in Understanding Intelligent Systems? Structuring the Discussion

Malin Eiband, Daniel Buschek, Heinrich Hussmann

The opaque nature of many intelligent systems violates established usability principles and thus presents a challenge for human-computer interaction. Research in the field therefore highlights the need for transparency, scrutability, intelligibility, interpretability and explainability, among others. While all of these terms carry a vision of supporting users in understanding intelligent systems, the underlying notions and assumptions about users and their interaction with the system often remain unclear. We review the literature in HCI through the lens of implied user questions to synthesise a conceptual framework integrating user mindsets, user involvement, and knowledge outcomes to reveal, differentiate and classify current notions in prior work. This framework aims to resolve conceptual ambiguity in the field and enables researchers to clarify their assumptions and become aware of those made in prior work. We thus hope to advance and structure the dialogue in the HCI research community on supporting users in understanding intelligent systems.

CV · Sep 21, 2017

Neural network identification of people hidden from view with a single-pixel, single-photon detector

Piergiorgio Caramazza, Alessandro Boccolini, Daniel Buschek et al.

Light scattered from multiple surfaces can be used to retrieve information about hidden environments. However, full three-dimensional retrieval of an object hidden from view by a wall has only been achieved with scanning systems and requires intensive computational processing of the retrieved data. Here we use a non-scanning, single-photon single-pixel detector in combination with an artificial neural network: this allows us to locate a hidden person and simultaneously determine their identity, chosen from a database of people (N=3). Artificial neural networks applied to specific computational imaging problems can therefore enable novel imaging capabilities with hugely simplified hardware and processing times.