Max Kreminski

HC
h-index 54
16 papers
601 citations
Novelty 33%
AI Score 51

16 Papers

78.4 · HC · Jun 3
Creative Reading: Scaffolding Reading for Transformation

Sophia Liu, Sarah Abowitz, Yijun Liu et al.

Reading augmentation systems increasingly help readers process text at scale. While these tools address real constraints of time and cognitive load, they often implicitly frame reading as information transmission, or "reading to discard," delegating interpretation and effort to the machine. Yet this delegation changes the outcome of reading. For example, in scholarly reading, deciding what a research text implies and why it matters is central to the work of scholarly production. We propose creative reading as an alternative goal: reading augmentation that supports readers in creating both readings and themselves as readers. By putting literary and narrative theories into conversation with scholarly sensemaking and creativity support, we present a provocation-oriented design space for valuing the process of reading as a way of preserving a plurality of readings and transforming readers over time.

72.3 · HC · May 20
Artographer: a Curatorial Interface for Art Space Exploration

Shm Garanganao Almeda, John Joon Young Chung, Sophia Liu et al.

Relating a piece to previously established works is crucial in creating and engaging with art, but AI interfaces tend to obscure such relationships, rather than helping users explore them. Embedding models present new opportunities to support spatially exploring and relating artwork. We built Artographer, an art-exploration system featuring a zoomable 2-D map, constructed from similarity-clustered embeddings of ~16,000 historical artworks. We used Artographer as a design probe to explore how alternative artwork distribution interface design can shape media engagement: we invited 20 participants, including 9 art history scholars, to traverse the map, collecting artworks for a goal-driven task and while freely exploring. We identify values enacted in spatial art discovery (Visibility, Agency, Serendipity, Friction) and consider how these values challenge dominant design paradigms -- in particular, the recommendation systems governing contemporary media distribution platforms. We reimagine a curatorial approach to media distribution, within digital ecosystems where history and culture can thrive.

HC · Aug 7, 2024
Patchview: LLM-Powered Worldbuilding with Generative Dust and Magnet Visualization

John Joon Young Chung, Max Kreminski

Large language models (LLMs) can help writers build story worlds by generating world elements, such as factions, characters, and locations. However, making sense of many generated elements can be overwhelming. Moreover, if the user wants to precisely control aspects of generated elements that are difficult to specify verbally, prompting alone may be insufficient. We introduce Patchview, a customizable LLM-powered system that visually aids worldbuilding by allowing users to interact with story concepts and elements through the physical metaphor of magnets and dust. Elements in Patchview are visually dragged closer to concepts with high relevance, facilitating sensemaking. The user can also steer the generation with verbally elusive concepts by indicating the desired position of the element between concepts. When the user disagrees with the LLM's visualization and generation, they can correct those by repositioning the element. These corrections can be used to align the LLM's future behaviors to the user's perception. With a user study, we show that Patchview supports the sensemaking of world elements and steering of element generation, facilitating exploration during the worldbuilding process. Patchview provides insights on how customizable visual representation can help sensemake, steer, and align generative AI model behaviors with the user's intentions.

CL · Nov 12, 2025
LiteraryTaste: A Preference Dataset for Creative Writing Personalization

John Joon Young Chung, Vishakh Padmakumar, Melissa Roemmele et al.

People have different creative writing preferences, and large language models (LLMs) for these tasks can benefit from adapting to each user's preferences. However, these models are often trained over a dataset that considers varying personal tastes as a monolith. To facilitate developing personalized creative writing LLMs, we introduce LiteraryTaste, a dataset of reading preferences from 60 people, where each person: 1) self-reported their reading habits and tastes (stated preference), and 2) annotated their preferences over 100 pairs of short creative writing texts (revealed preference). With our dataset, we found that: 1) people diverge on creative writing preferences, 2) finetuning a transformer encoder could achieve 75.8% and 67.7% accuracy when modeling personal and collective revealed preferences, and 3) stated preferences had limited utility in modeling revealed preferences. With an LLM-driven interpretability pipeline, we analyzed how people's preferences vary. We hope our work serves as a cornerstone for personalizing creative writing technologies.

HC · Jan 26
Design Techniques for LLM-Powered Interactive Storytelling: A Case Study of the Dramamancer System

Tiffany Wang, Yuqian Sun, Yi Wang et al.

The rise of Large Language Models (LLMs) has enabled a new paradigm for bridging authorial intent and player agency in interactive narrative. We consider this paradigm through the example of Dramamancer, a system that uses an LLM to transform author-created story schemas into player-driven playthroughs. This extended abstract outlines some design techniques and evaluation considerations associated with this system.

HC · Feb 2, 2024
Homogenization Effects of Large Language Models on Human Creative Ideation

Barrett R. Anderson, Jash Hemant Shah, Max Kreminski

Large language models (LLMs) are now being used in a wide variety of contexts, including as creativity support tools (CSTs) intended to help their users come up with new ideas. But do LLMs actually support user creativity? We hypothesized that the use of an LLM as a CST might make the LLM's users feel more creative, and even broaden the range of ideas suggested by each individual user, but also homogenize the ideas suggested by different users. We conducted a 36-participant comparative user study and found, in accordance with the homogenization hypothesis, that different users tended to produce less semantically distinct ideas with ChatGPT than with an alternative CST. Additionally, ChatGPT users generated a greater number of more detailed ideas, but felt less responsible for the ideas they generated. We discuss potential implications of these findings for users, designers, and developers of LLM-based CSTs.

HC · Mar 21, 2024
A Design Space for Intelligent and Interactive Writing Assistants

Mina Lee, Katy Ilonka Gero, John Joon Young Chung et al. · allen-ai, deepmind

In our era of rapid technological advancement, the research landscape for writing assistants has become increasingly fragmented across various research communities. We seek to address this challenge by proposing a design space as a structured way to examine and explore the multidimensional space of intelligent and interactive writing assistants. Through a large community collaboration, we explore five aspects of writing assistants: task, user, technology, interaction, and ecosystem. Within each aspect, we define dimensions (i.e., fundamental components of an aspect) and codes (i.e., potential options for each dimension) by systematically reviewing 115 papers. Our design space aims to offer researchers and designers a practical tool to navigate, comprehend, and compare the various possibilities of writing assistants, and aid in the envisioning and design of new writing assistants.

CL · Mar 21, 2025
Modifying Large Language Model Post-Training for Diverse Creative Writing

John Joon Young Chung, Vishakh Padmakumar, Melissa Roemmele et al.

As creative writing tasks do not have singular correct answers, large language models (LLMs) trained to perform these tasks should be able to generate diverse valid outputs. However, LLM post-training often focuses on improving generation quality but neglects to facilitate output diversity. Hence, in creative writing generation, we investigate post-training approaches to promote both output diversity and quality. Our core idea is to include deviation -- the degree of difference between a training sample and all other samples with the same prompt -- in the training objective to facilitate learning from rare high-quality instances. By applying our approach to direct preference optimization (DPO) and odds ratio preference optimization (ORPO), we demonstrate that we can promote the output diversity of trained models while minimally decreasing quality. Our best model with 8B parameters could achieve diversity on par with a human-created dataset while having output quality similar to the best instruction-tuned models we examined, GPT-4o and DeepSeek-R1. We further validate our approaches with a human evaluation, an ablation, and a comparison to an existing diversification approach, DivPO.
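
The notion of deviation described in this abstract can be illustrated with a minimal sketch. It assumes that deviation is the mean cosine distance from one sample's embedding to the other samples for the same prompt, and that it scales an otherwise standard DPO loss term. The function names, the toy embeddings, the distance measure, and the `beta` value are all assumptions made for illustration, not the paper's implementation.

```python
import math


def cosine_distance(u, v):
    """Return 1 - cosine similarity for two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return 1.0 - dot / (norm_u * norm_v)


def deviations(embeddings):
    """Mean distance from each sample to all other samples for the same prompt.

    Assumes at least two samples per prompt.
    """
    n = len(embeddings)
    return [
        sum(cosine_distance(embeddings[i], embeddings[j])
            for j in range(n) if j != i) / (n - 1)
        for i in range(n)
    ]


def weighted_dpo_loss(logratio_chosen, logratio_rejected, deviation, beta=0.1):
    """DPO loss for one preference pair, scaled by the chosen sample's deviation.

    logratio_* are log(pi_theta / pi_ref) for the chosen and rejected responses.
    Scaling by deviation is an assumed way to emphasize rare samples.
    """
    margin = beta * (logratio_chosen - logratio_rejected)
    log_sigmoid = math.log(1.0 / (1.0 + math.exp(-margin)))
    return -deviation * log_sigmoid


# Toy embeddings (hypothetical): two near-duplicate responses and one outlier.
samples = [[1.0, 0.0], [1.0, 0.0], [0.0, 1.0]]
devs = deviations(samples)
losses = [weighted_dpo_loss(0.5, -0.5, d) for d in devs]
```

In this toy setting the outlier response receives the largest deviation, so a preference pair in which it is the chosen response contributes more to the loss than a pair built on either near-duplicate.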

HC · Apr 16, 2024
The Dearth of the Author in AI-Supported Writing

Max Kreminski

We diagnose and briefly discuss the dearth of the author: a condition that arises when AI-based creativity support tools for writing allow users to produce large amounts of text without making a commensurate number of creative decisions, resulting in output that is sparse in expressive intent. We argue that the dearth of the author helps to explain a number of recurring difficulties and anxieties around AI-based writing support tools, but that it also suggests an ambitious new goal for AI-based CSTs.

HC · Jan 23, 2025
Toyteller: AI-powered Visual Storytelling Through Toy-Playing with Character Symbols

John Joon Young Chung, Melissa Roemmele, Max Kreminski

We introduce Toyteller, an AI-powered storytelling system where users generate a mix of story text and visuals by directly manipulating character symbols as if playing with toys. Anthropomorphized symbol motions can convey rich and nuanced social interactions; Toyteller leverages these motions (1) to let users steer story text generation and (2) as a visual output format that accompanies story text. We enabled motion-steered text generation and text-steered motion generation by mapping motions and text onto a shared semantic space so that large language models and motion generation models can use it as a translational layer. Technical evaluations showed that Toyteller outperforms a competitive baseline, GPT-4o. Our user study identified that toy-playing helps express intentions that are difficult to verbalize. However, motions alone could not express all user intentions, which suggests combining them with other modalities such as language. We discuss the design space of toy-playing interactions and implications for technical HCI research on human-AI interaction.

HC · Mar 8, 2025
Phraselette: A Poet's Procedural Palette

Alex Calderwood, John Joon Young Chung, Yuqian Sun et al.

According to the recently introduced theory of artistic support tools, creativity support tools exert normative influences over artistic production, instantiating a normative ground that shapes both the process and product of artistic expression. We argue that the normative ground of most existing automated writing tools is misaligned with writerly values and identify a potential alternative frame -- material writing support -- for experimental poetry tools that flexibly support the finding, processing, transforming, and shaping of text(s). Based on this frame, we introduce Phraselette, an artistic material writing support interface that helps experimental poets search for words and phrases. To provide material writing support, Phraselette is designed to counter the dominant mode of automated writing tools, while offering language model affordances in line with writerly values. We further report on an extended expert evaluation involving 10 published poets that indicates support for both our framing of material writing support and for Phraselette itself.

HC · Jul 4, 2025
Scaffolding Recursive Divergence and Convergence in Story Ideation

Taewook Kim, Matthew Kay, Yuqian Sun et al.

Human creative ideation involves both exploration of diverse ideas (divergence) and selective synthesis of explored ideas into coherent combinations (convergence). While processes of divergence and convergence are often interleaved and nested, existing AI-powered creativity support tools (CSTs) lack support for sophisticated orchestration of divergence and convergence. We present Reverger, an AI-powered CST that helps users ideate variations of conceptual directions for modifying a story by scaffolding flexible iteration between divergence and convergence. For divergence, our tool enables recursive exploration of alternative high-level directions for modifying a specific part of the original story. For convergence, it allows users to collect explored high-level directions and synthesize them into concrete variations. Users can then iterate between divergence and convergence until they find a satisfactory outcome. A within-subject study revealed that Reverger permitted participants to explore more unexpected and diverse high-level directions than a comparable baseline. Reverger users also felt that they had more fine-grained control and discovered more effort-worthy outcomes.

CL · Jun 11, 2025
Can LLMs Generate Good Stories? Insights and Challenges from a Narrative Planning Perspective

Yi Wang, Max Kreminski

Story generation has been a prominent application of Large Language Models (LLMs). However, understanding of LLMs' ability to produce high-quality stories remains limited due to challenges in automatic evaluation methods and the high cost and subjectivity of manual evaluation. Computational narratology offers valuable insights into what constitutes a good story, and these insights have been applied in the symbolic narrative planning approach to story generation. This work aims to deepen the understanding of LLMs' story generation capabilities by using them to solve narrative planning problems. We present a benchmark for evaluating LLMs on narrative planning based on literature examples, focusing on causal soundness, character intentionality, and dramatic conflict. Our experiments show that GPT-4 tier LLMs can generate causally sound stories at small scales, but planning with character intentionality and dramatic conflict remains challenging, requiring LLMs trained with reinforcement learning for complex reasoning. The results offer insights into the scale of stories that LLMs can generate while maintaining quality across different aspects. Our findings also highlight interesting problem-solving behaviors and shed light on challenges and considerations for applying LLM narrative planning in game environments.

CL · Sep 26, 2025
LLMs Behind the Scenes: Enabling Narrative Scene Illustration

Melissa Roemmele, John Joon Young Chung, Taewook Kim et al.

Generative AI has established the opportunity to readily transform content from one medium to another. This capability is especially powerful for storytelling, where visual illustrations can illuminate a story originally expressed in text. In this paper, we focus on the task of narrative scene illustration, which involves automatically generating an image depicting a scene in a story. Motivated by recent progress on text-to-image models, we consider a pipeline that uses LLMs as an interface for prompting text-to-image models to generate scene illustrations given raw story text. We apply variations of this pipeline to a prominent story corpus in order to synthesize illustrations for scenes in these stories. We conduct a human annotation task to obtain pairwise quality judgments for these illustrations. The outcome of this process is the SceneIllustrations dataset, which we release as a new resource for future work on cross-modal narrative transformation. Through our analysis of this dataset and experiments modeling illustration quality, we demonstrate that LLMs can effectively verbalize scene knowledge implicitly evoked by story text. Moreover, this capability is impactful for generating and evaluating illustrations.

CL · Jun 1, 2024
Guiding and Diversifying LLM-Based Story Generation via Answer Set Programming

Phoebe J. Wang, Max Kreminski

Instruction-tuned large language models (LLMs) are capable of generating stories in response to open-ended user requests, but the resulting stories tend to be limited in their diversity. Older, symbolic approaches to story generation (such as planning) can generate substantially more diverse plot outlines, but are limited to producing stories that recombine a fixed set of hand-engineered character action templates. Can we combine the strengths of these approaches while mitigating their weaknesses? We propose to do so by using a higher-level and more abstract symbolic specification of high-level story structure -- implemented via answer set programming (ASP) -- to guide and diversify LLM-based story generation. Via semantic similarity analysis, we demonstrate that our approach produces more diverse stories than an unguided LLM, and via code excerpts, we demonstrate the improved compactness and flexibility of ASP-based outline generation over full-fledged narrative planning.

AI · Jul 12, 2020
Tabletop Roleplaying Games as Procedural Content Generators

Matthew Guzdial, Devi Acharya, Max Kreminski et al.

Tabletop roleplaying games (TTRPGs) and procedural content generators can both be understood as systems of rules for producing content. In this paper, we argue that TTRPG design can usefully be viewed as procedural content generator design. We present several case studies linking key concepts from PCG research -- including possibility spaces, expressive range analysis, and generative pipelines -- to key concepts in TTRPG design. We then discuss the implications of these relationships and suggest directions for future work uniting research in TTRPGs and PCG.
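
As a rough illustration of how a TTRPG random table can be treated as a procedural content generator, the sketch below rolls on a small encounter table many times and tallies two metrics of each result, in the spirit of expressive range analysis. The table contents, the dice used, and the two metrics (group size and total danger) are hypothetical choices made for this example and do not come from the paper.

```python
import random
from collections import Counter

# Hypothetical d6 encounter table: roll -> (creature, danger rating).
ENCOUNTER_TABLE = {
    1: ("rat", 1),
    2: ("goblin", 2),
    3: ("goblin", 2),
    4: ("wolf", 3),
    5: ("ogre", 5),
    6: ("dragon", 10),
}


def roll_encounter(rng):
    """Roll 1d4 for group size, then 1d6 on the table for each creature."""
    size = rng.randint(1, 4)
    return [ENCOUNTER_TABLE[rng.randint(1, 6)] for _ in range(size)]


def expressive_range(n_samples, seed=0):
    """Sample the generator repeatedly and count (size, danger) outcomes."""
    rng = random.Random(seed)  # seeded so the histogram is reproducible
    histogram = Counter()
    for _ in range(n_samples):
        encounter = roll_encounter(rng)
        size = len(encounter)
        danger = sum(rating for _, rating in encounter)
        histogram[(size, danger)] += 1
    return histogram


hist = expressive_range(1000)
# Print the five most frequent regions of the possibility space.
for (size, danger), count in hist.most_common(5):
    print(f"size={size} danger={danger} count={count}")
```

The resulting histogram shows which regions of the possibility space the table's rules visit often and which they rarely reach, which is the kind of question expressive range analysis asks of a generator.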