Katy Ilonka Gero

HC
h-index: 54
8 papers
1,006 citations
Novelty: 31%
AI Score: 37

8 Papers

HC · Mar 21, 2024
A Design Space for Intelligent and Interactive Writing Assistants

Mina Lee, Katy Ilonka Gero, John Joon Young Chung et al. · allen-ai, deepmind

In our era of rapid technological advancement, the research landscape for writing assistants has become increasingly fragmented across various research communities. We seek to address this challenge by proposing a design space as a structured way to examine and explore the multidimensional space of intelligent and interactive writing assistants. Through a large community collaboration, we explore five aspects of writing assistants: task, user, technology, interaction, and ecosystem. Within each aspect, we define dimensions (i.e., fundamental components of an aspect) and codes (i.e., potential options for each dimension) by systematically reviewing 115 papers. Our design space aims to offer researchers and designers a practical tool to navigate, comprehend, and compare the various possibilities of writing assistants, and aid in the envisioning and design of new writing assistants.

HC · Apr 13
From Planning to Revision: How AI Writing Support at Different Stages Alters Ownership

Katy Ilonka Gero, Tao Long, Carly Schnitzler et al.

Although AI assistance can improve writing quality, it can also decrease feelings of ownership. Ownership in writing has important implications for attribution, rights, norms, and cognitive engagement, and designers of AI support systems may want to consider how system features may impact ownership. We investigate how the stage at which AI support for writing is provided (planning, drafting, or revising) changes ownership. In a study of short essay writing (between subjects, n = 253) we find that while any AI assistance decreased ownership, planning support only minimally decreased ownership, while drafting support saw the largest decrease. This variation maps onto the amount of text and ideas contributed by AI, where more text and ideas from AI decreased ownership. Notably, an AI-generated draft based on participants' own outline resulted in significantly more AI-contributed ideas than AI support for planning. At the same time, more AI contributions improved essay quality. We propose that writers, educators, and designers consider writing stage when introducing AI assistance.

HC · Feb 15, 2024
Not Just Novelty: A Longitudinal Study on Utility and Customization of an AI Workflow

Tao Long, Katy Ilonka Gero, Lydia B. Chilton

Generative AI brings novel and impressive abilities to help people in everyday tasks. There are many AI workflows that solve real and complex problems by chaining AI outputs together with human interaction. Although there is an undeniable lure of AI, it is uncertain how useful generative AI workflows are after the novelty wears off. Additionally, workflows built with generative AI have the potential to be easily customized to fit users' individual needs, but do users take advantage of this? We conducted a three-week longitudinal study with 12 users to understand the familiarization and customization of generative AI tools for science communication. Our study revealed that there exists a familiarization phase, during which users were exploring the novel capabilities of the workflow and discovering which aspects they found useful. After this phase, users understood the workflow and were able to anticipate the outputs. Surprisingly, after familiarization the perceived utility of the system was rated higher than before, indicating that the perceived utility of AI is not just a novelty effect. The increase in benefits mainly comes from end-users' ability to customize prompts, and thus potentially appropriate the system to their own needs. This points to a future where generative AI systems can allow us to design for appropriation.

HC · Jan 24, 2024
Supporting Sensemaking of Large Language Model Outputs at Scale

Katy Ilonka Gero, Chelse Swoopes, Ziwei Gu et al.

Large language models (LLMs) are capable of generating multiple responses to a single prompt, yet little effort has been expended to help end-users or system designers make use of this capability. In this paper, we explore how to present many LLM responses at once. We design five features, which include both pre-existing and novel methods for computing similarities and differences across textual documents, as well as how to render their outputs. We report on a controlled user study (n=24) and eight case studies evaluating these features and how they support users in different tasks. We find that the features support a wide variety of sensemaking tasks and even make tasks previously considered to be too difficult by our participants now tractable. Finally, we present design guidelines to inform future explorations of new LLM interfaces.

HC · May 20, 2023
Tweetorial Hooks: Generative AI Tools to Motivate Science on Social Media

Tao Long, Dorothy Zhang, Grace Li et al.

Communicating science and technology is essential for the public to understand and engage in a rapidly changing world. Tweetorials are an emerging phenomenon where experts explain STEM topics on social media in creative and engaging ways. However, STEM experts struggle to write an engaging "hook" in the first tweet that captures the reader's attention. We propose methods to use large language models (LLMs) to help users scaffold their process of writing a relatable hook for complex scientific topics. We demonstrate that LLMs can help writers find everyday experiences that are relatable and interesting to the public, avoid jargon, and spark curiosity. Our evaluation shows that the system reduces cognitive load and helps people write better hooks. Lastly, we discuss the importance of interactivity with LLMs to preserve the correctness, effectiveness, and authenticity of the writing.

HC · Dec 22, 2021
Eliciting Gestures for Novel Note-taking Interactions

Katy Ilonka Gero, Lydia B. Chilton, Chris Melancon et al.

Handwriting recognition is improving in leaps and bounds, and this opens up new opportunities for stylus-based interactions. In particular, note-taking applications can become a more intelligent user interface, incorporating new features like autocomplete and integrated search. In this work we ran a gesture elicitation study, asking 21 participants to imagine how they would interact with an imaginary, intelligent note-taking application. We report agreement on the elicited gestures, finding that while existing common interactions are prevalent (like double taps and long presses) a number of more novel interactions (like dragging selected items to hotspots or using annotations) were also well-represented. We discuss the mental models participants drew on when explaining their gestures and what kind of feedback users might need to move to more stylus-centric interactions.

CL · Oct 22, 2021
Lightweight Decoding Strategies for Increasing Specificity

Katy Ilonka Gero, Chris Kedzie, Savvas Petridis et al.

Language models are known to produce vague and generic outputs. We propose two unsupervised decoding strategies based on either word-frequency or point-wise mutual information to increase the specificity of any model that outputs a probability distribution over its vocabulary at generation time. We test the strategies in a prompt completion task; with human evaluations, we find that both strategies increase the specificity of outputs with only modest decreases in sensibility. We also briefly present a summarization use case, where these strategies can produce more specific summaries.
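
To make the idea of frequency-based decoding concrete, here is a minimal sketch of how a next-word distribution could be reweighted toward rarer words. It is an illustration under stated assumptions, not the paper's exact formulation: the function name `respecify`, the `alpha` strength parameter, and the toy probabilities and frequencies are all hypothetical. With `alpha = 1` the score `log p(w | context) - log p(w)` is the pointwise mutual information between the word and its context, assuming `unigram_freq` is an estimate of `p(w)`.

```python
import math


def respecify(probs, unigram_freq, alpha=1.0):
    """Down-weight frequent words in a next-word distribution.

    probs:        dict mapping word -> model probability given the context
    unigram_freq: dict mapping word -> relative corpus frequency (must be > 0)
    alpha:        hypothetical strength knob; 0 leaves the distribution unchanged,
                  1 scores each word by log p(w | context) - log p(w)
    Returns a new, renormalized distribution over the same words.
    """
    # Fallback frequency for words missing from the table: treat them as rare.
    floor = min(unigram_freq.values())

    scores = {}
    for word, p in probs.items():
        if p <= 0:
            continue  # zero-probability words stay excluded
        freq = unigram_freq.get(word, floor)
        scores[word] = math.log(p) - alpha * math.log(freq)

    # Softmax over the adjusted scores (subtract the max for numerical stability).
    top = max(scores.values())
    weights = {w: math.exp(s - top) for w, s in scores.items()}
    total = sum(weights.values())
    return {w: v / total for w, v in weights.items()}


# Toy example: a generic word is likely under the model but common in the corpus.
toy_probs = {"thing": 0.5, "stuff": 0.3, "heron": 0.2}
toy_freq = {"thing": 0.01, "stuff": 0.005, "heron": 0.0001}
adjusted = respecify(toy_probs, toy_freq, alpha=1.0)
print(max(adjusted, key=adjusted.get))
```

Because the adjustment only needs the model's output distribution and a frequency table, a sketch like this can sit on top of any model that exposes next-word probabilities, which matches the abstract's claim that the strategies need no additional training.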

HCOct 14, 2021
Sparks: Inspiration for Science Writing using Language Models

Katy Ilonka Gero, Vivian Liu, Lydia B. Chilton

Large-scale language models are rapidly improving, performing well on a wide variety of tasks with little to no customization. In this work we investigate how language models can support science writing, a challenging writing task that is both open-ended and highly constrained. We present a system for generating "sparks", sentences related to a scientific concept intended to inspire writers. We find that our sparks are more coherent and diverse than a competitive language model baseline, and approach a human-created gold standard. In a study with 13 PhD students writing on topics of their own selection, we find three main use cases of sparks: aiding with crafting detailed sentences, providing interesting angles to engage readers, and demonstrating common reader perspectives. We also report on the various reasons sparks were considered unhelpful, and discuss how we might improve language models as writing support tools.