Nikolas Martelaro

HC
h-index62
17papers
329citations
Novelty39%
AI Score47

17 Papers

HCMar 1, 2023
Exploring Challenges and Opportunities to Support Designers in Learning to Co-create with AI-based Manufacturing Design Tools

Frederic Gmeiner, Humphrey Yang, Lining Yao et al. · cmu

AI-based design tools are proliferating in professional software to assist engineering and industrial designers in complex manufacturing and design tasks. These tools take on more agentic roles than traditional computer-aided design tools and are often portrayed as "co-creators." Yet, working effectively with such systems requires different skills than working with complex CAD tools alone. To date, we know little about how engineering designers learn to work with AI-based design tools. In this study, we observed trained designers as they learned to work with two AI-based tools on a realistic design task. We find that designers face many challenges in learning to effectively co-create with current systems, including challenges in understanding and adjusting AI outputs and in communicating their design goals. Based on our findings, we highlight several design opportunities to better support designer-AI co-creation.

HCNov 22, 2022
VideoMap: Supporting Video Editing Exploration, Brainstorming, and Prototyping in the Latent Space

David Chuan-En Lin, Fabian Caba Heilbron, Joon-Young Lee et al. · cmu

Video editing is a creative and complex endeavor and we believe that there is potential for reimagining a new video editing interface to better support the creative and exploratory nature of video editing. We take inspiration from latent space exploration tools that help users find patterns and connections within complex datasets. We present VideoMap, a proof-of-concept video editing interface that operates on video frames projected onto a latent space. We support intuitive navigation through map-inspired navigational elements and facilitate transitioning between different latent spaces through swappable lenses. We built three VideoMap components to support editors in three common video tasks. In a user study with both professionals and non-professionals, editors found that VideoMap helps reduce grunt work, offers a user-friendly experience, provides an inspirational way of editing, and effectively supports the exploratory nature of video editing. We further demonstrate the versatility of VideoMap by implementing three extended applications. For interactive examples, we invite you to visit our project page: https://humanvideointeraction.github.io/videomap.

CYOct 10, 2023
The AI Incident Database as an Educational Tool to Raise Awareness of AI Harms: A Classroom Exploration of Efficacy, Limitations, & Future Improvements

Michael Feffer, Nikolas Martelaro, Hoda Heidari · cmu

Prior work has established the importance of integrating AI ethics topics into computer and data sciences curricula. We provide evidence suggesting that one of the critical objectives of AI Ethics education must be to raise awareness of AI harms. While there are various sources to learn about such harms, The AI Incident Database (AIID) is one of the few attempts at offering a relatively comprehensive database indexing prior instances of harms or near harms stemming from the deployment of AI technologies in the real world. This study assesses the effectiveness of AIID as an educational tool to raise awareness regarding the prevalence and severity of AI harms in socially high-stakes domains. We present findings obtained through a classroom study conducted at an R1 institution as part of a course focused on the societal and ethical considerations around AI and ML. Our qualitative findings characterize students' initial perceptions of core topics in AI ethics and their desire to close the educational gap between their technical skills and their ability to think systematically about ethical and societal aspects of their work. We find that interacting with the database helps students better understand the magnitude and severity of AI harms and instills in them a sense of urgency around (a) designing functional and safe AI and (b) strengthening governance and accountability mechanisms. Finally, we compile students' feedback about the tool and our class activity into actionable recommendations for the database development team and the broader community to improve awareness of AI harms in AI ethics education.

HCOct 12, 2023
Jigsaw: Supporting Designers to Prototype Multimodal Applications by Chaining AI Foundation Models

David Chuan-En Lin, Nikolas Martelaro · cmu

Recent advancements in AI foundation models have made it possible for them to be utilized off-the-shelf for creative tasks, including ideating design concepts or generating visual prototypes. However, integrating these models into the creative process can be challenging as they often exist as standalone applications tailored to specific tasks. To address this challenge, we introduce Jigsaw, a prototype system that employs puzzle pieces as metaphors to represent foundation models. Jigsaw allows designers to combine different foundation model capabilities across various modalities by assembling compatible puzzle pieces. To inform the design of Jigsaw, we interviewed ten designers and distilled design goals. In a user study, we showed that Jigsaw enhanced designers' understanding of available foundation model capabilities, provided guidance on combining capabilities across different modalities and tasks, and served as a canvas to support design exploration, prototyping, and documentation.

HCJul 6, 2022
Team Learning as a Lens for Designing Human-AI Co-Creative Systems

Frederic Gmeiner, Kenneth Holstein, Nikolas Martelaro · cmu

Generative, ML-driven interactive systems have the potential to change how people interact with computers in creative processes - turning tools into co-creators. However, it is still unclear how we might achieve effective human-AI collaboration in open-ended task domains. There are several known challenges around communication in the interaction with ML-driven systems. An overlooked aspect in the design of co-creative systems is how users can be better supported in learning to collaborate with such systems. Here we reframe human-AI collaboration as a learning problem: Inspired by research on team learning, we hypothesize that similar learning strategies that apply to human-human teams might also increase the collaboration effectiveness and quality of humans working with co-creative generative systems. In this position paper, we aim to promote team learning as a lens for designing more effective co-creative human-AI collaboration and emphasize collaboration process quality as a goal for co-creative systems. Furthermore, we outline a preliminary schematic framework for embedding team learning support in co-creative AI systems. We conclude by proposing a research agenda and posing open questions for further study on supporting people in learning to collaborate with generative AI systems.

CVNov 22, 2022
Videogenic: Identifying Highlight Moments in Videos with Professional Photographs as a Prior

David Chuan-En Lin, Fabian Caba Heilbron, Joon-Young Lee et al. · cmu

This paper investigates the challenge of extracting highlight moments from videos. To perform this task, we need to understand what constitutes a highlight for arbitrary video domains while at the same time being able to scale across different domains. Our key insight is that photographs taken by photographers tend to capture the most remarkable or photogenic moments of an activity. Drawing on this insight, we present Videogenic, a technique capable of creating domain-specific highlight videos for a diverse range of domains. In a human evaluation study (N=50), we show that a high-quality photograph collection combined with CLIP-based retrieval (which uses a neural network with semantic knowledge of images) can serve as an excellent prior for finding video highlights. In a within-subjects expert study (N=12), we demonstrate the usefulness of Videogenic in helping video editors create highlight videos with lighter workload, shorter task completion time, and better usability.

SDNov 9, 2023
What Do I Hear? Generating Sounds for Visuals with ChatGPT

David Chuan-En Lin, Nikolas Martelaro · cmu

This short paper introduces a workflow for generating realistic soundscapes for visual media. In contrast to prior work, which primarily focus on matching sounds for on-screen visuals, our approach extends to suggesting sounds that may not be immediately visible but are essential to crafting a convincing and immersive auditory environment. Our key insight is leveraging the reasoning capabilities of language models, such as ChatGPT. In this paper, we describe our workflow, which includes creating a scene context, brainstorming sounds, and generating the sounds.

47.4AIMar 25
Supervising Ralph Wiggum: Exploring a Metacognitive Co-Regulation Agentic AI Loop for Engineering Design

Zeda Xu, Nikolas Martelaro, Christopher McComb

The engineering design research community has studied agentic AI systems that use Large Language Model (LLM) agents to automate the engineering design process. However, these systems are prone to some of the same pathologies that plague humans. Just as human designers, LLM design agents can fixate on existing paradigms and fail to explore alternatives when solving design challenges, potentially leading to suboptimal solutions. In this work, we propose (1) a novel Self-Regulation Loop (SRL), in which the Design Agent self-regulates and explicitly monitors its own metacognition, and (2) a novel Co-Regulation Design Agentic Loop (CRDAL), in which a Metacognitive Co-Regulation Agent assists the Design Agent in metacognition to mitigate design fixation, thereby improving system performance for engineering design tasks. In the battery pack design problem examined here, we found that the novel CRDAL system generates designs with better performance, without significantly increasing the computational cost, compared to a plain Ralph Wiggum Loop (RWL) and the metacognitively self-assessing Self-Regulation Loop (SRL). Also, we found that the CRDAL system navigated through the latent design space more effectively than both SRL and RWL. However, the SRL did not generate designs with significantly better performance than RWL, even though it explored a different region of the design space. The proposed system architectures and findings of this work provide practical implications for future development of agentic AI systems for engineering design.

HCJan 30, 2025
Inkspire: Supporting Design Exploration with Generative AI through Analogical Sketching

David Chuan-En Lin, Hyeonsu B. Kang, Nikolas Martelaro et al. · cmu

With recent advancements in the capabilities of Text-to-Image (T2I) AI models, product designers have begun experimenting with them in their work. However, T2I models struggle to interpret abstract language and the current user experience of T2I tools can induce design fixation rather than a more iterative, exploratory process. To address these challenges, we developed Inkspire, a sketch-driven tool that supports designers in prototyping product design concepts with analogical inspirations and a complete sketch-to-design-to-sketch feedback loop. To inform the design of Inkspire, we conducted an exchange session with designers and distilled design goals for improving T2I interactions. In a within-subjects study comparing Inkspire to ControlNet, we found that Inkspire supported designers with more inspiration and exploration of design ideas, and improved aspects of the co-creative process by allowing designers to effectively grasp the current state of the AI to guide it towards novel design intentions.

HCFeb 26, 2025
Intent Tagging: Exploring Micro-Prompting Interactions for Supporting Granular Human-GenAI Co-Creation Workflows

Frederic Gmeiner, Nicolai Marquardt, Michael Bentley et al.

Despite Generative AI (GenAI) systems' potential for enhancing content creation, users often struggle to effectively integrate GenAI into their creative workflows. Core challenges include misalignment of AI-generated content with user intentions (intent elicitation and alignment), user uncertainty around how to best communicate their intents to the AI system (prompt formulation), and insufficient flexibility of AI systems to support diverse creative workflows (workflow flexibility). Motivated by these challenges, we created IntentTagger: a system for slide creation based on the notion of Intent Tags - small, atomic conceptual units that encapsulate user intent - for exploring granular and non-linear micro-prompting interactions for Human-GenAI co-creation workflows. Our user study with 12 participants provides insights into the value of flexibly expressing intent across varying levels of ambiguity, meta-intent elicitation, and the benefits and challenges of intent tag-driven workflows. We conclude by discussing the broader implications of our findings and design considerations for GenAI-supported content creation workflows.

CLDec 11, 2023
"What's important here?": Opportunities and Challenges of Using LLMs in Retrieving Information from Web Interfaces

Faria Huq, Jeffrey P. Bigham, Nikolas Martelaro

Large language models (LLMs) that have been trained on a corpus that includes large amount of code exhibit a remarkable ability to understand HTML code. As web interfaces are primarily constructed using HTML, we design an in-depth study to see how LLMs can be used to retrieve and locate important elements for a user given query (i.e. task description) in a web interface. In contrast with prior works, which primarily focused on autonomous web navigation, we decompose the problem as an even atomic operation - Can LLMs identify the important information in the web page for a user given query? This decomposition enables us to scrutinize the current capabilities of LLMs and uncover the opportunities and challenges they present. Our empirical experiments show that while LLMs exhibit a reasonable level of performance in retrieving important UI elements, there is still a substantial room for improvement. We hope our investigation will inspire follow-up works in overcoming the current challenges in this domain.

64.3HCApr 6
EcoAssist: Embedding Sustainability into AI-Assisted Frontend Development

André Barrocas, Nuno Jardim Nunes, Valentina Nisi et al.

Frontend code, replicated across millions of page views, consumes significant energy and contributes directly to digital emissions. Yet current AI coding assistants, such as GitHub Copilot and Amazon CodeWhisperer, emphasize developer speed and convenience, with energy impact not yet a primary focus. At the same time, existing energy-focused guidelines and metrics have seen limited adoption among practitioners, leaving a gap between research and everyday coding practice. To address this gap, we introduce EcoAssist, an energy-aware assistant integrated into an IDE that analyzes AI-generated frontend code, estimates its energy footprint, and proposes targeted optimizations. We evaluated EcoAssist through benchmarks of 500 websites and a controlled study with 20 developers. Results show that EcoAssist reduced per-website energy by 13-16% on average, increased developers' awareness of energy use, and maintained developer productivity. This work demonstrates how energy considerations can be embedded directly into AI-assisted coding workflows, supporting developers as they engage with energy implications through actionable feedback.

HCAug 14, 2025
Facilitating Longitudinal Interaction Studies of AI Systems

Tao Long, Sitong Wang, Émilie Fabre et al.

UIST researchers develop tools to address user challenges. However, user interactions with AI evolve over time through learning, adaptation, and repurposing, making one time evaluations insufficient. Capturing these dynamics requires longer-term studies, but challenges in deployment, evaluation design, and data collection have made such longitudinal research difficult to implement. Our workshop aims to tackle these challenges and prepare researchers with practical strategies for longitudinal studies. The workshop includes a keynote, panel discussions, and interactive breakout groups for discussion and hands-on protocol design and tool prototyping sessions. We seek to foster a community around longitudinal system research and promote it as a more embraced method for designing, building, and evaluating UIST tools.

HCJun 15, 2025
Exploring the Potential of Metacognitive Support Agents for Human-AI Co-Creation

Frederic Gmeiner, Kaitao Luo, Ye Wang et al.

Despite the potential of generative AI (GenAI) design tools to enhance design processes, professionals often struggle to integrate AI into their workflows. Fundamental cognitive challenges include the need to specify all design criteria as distinct parameters upfront (intent formulation) and designers' reduced cognitive involvement in the design process due to cognitive offloading, which can lead to insufficient problem exploration, underspecification, and limited ability to evaluate outcomes. Motivated by these challenges, we envision novel metacognitive support agents that assist designers in working more reflectively with GenAI. To explore this vision, we conducted exploratory prototyping through a Wizard of Oz elicitation study with 20 mechanical designers probing multiple metacognitive support strategies. We found that agent-supported users created more feasible designs than non-supported users, with differing impacts between support strategies. Based on these findings, we discuss opportunities and tradeoffs of metacognitive support agents and considerations for future AI-based design tools.

SDDec 17, 2021
Soundify: Matching Sound Effects to Video

David Chuan-En Lin, Anastasis Germanidis, Cristóbal Valenzuela et al.

In the art of video editing, sound helps add character to an object and immerse the viewer within a space. Through formative interviews with professional editors (N=10), we found that the task of adding sounds to video can be challenging. This paper presents Soundify, a system that assists editors in matching sounds to video. Given a video, Soundify identifies matching sounds, synchronizes the sounds to the video, and dynamically adjusts panning and volume to create spatial audio. In a human evaluation study (N=889), we show that Soundify is capable of matching sounds to video out-of-the-box for a diverse range of audio categories. In a within-subjects expert study (N=12), we demonstrate the usefulness of Soundify in helping video editors match sounds to video with lighter workload, reduced task completion time, and improved usability.

CVMay 30, 2021
Learning Personal Style from Few Examples

David Chuan-En Lin, Nikolas Martelaro

A key task in design work is grasping the client's implicit tastes. Designers often do this based on a set of examples from the client. However, recognizing a common pattern among many intertwining variables such as color, texture, and layout and synthesizing them into a composite preference can be challenging. In this paper, we leverage the pattern recognition capability of computational models to aid in this task. We offer a set of principles for computationally learning personal style. The principles are manifested in PseudoClient, a deep learning framework that learns a computational model for personal graphic design style from only a handful of examples. In several experiments, we found that PseudoClient achieves a 79.40% accuracy with only five positive and negative examples, outperforming several alternative methods. Finally, we discuss how PseudoClient can be utilized as a building block to support the development of future design applications.

CYMar 4, 2021
Remote Observation of Field Work on the Farm

Wendy Ju, Ilan Mandel, Kevin Weatherwax et al.

Travel restrictions and social distancing measures make it difficult to observe, monitor or manage physical fieldwork. We describe research in progress that applies technologies for real-time remote observation and conversation in on-road vehicles to observe field work on a farm. We collaborated on a pilot deployment of this project at Kreher Eggs in upstate New York. We instrumented a tractor with equipment to remotely observe and interview farm workers performing vehicle-related work. This work was initially undertaken to allow sustained observation of field work over longer periods of time from geographically distant locales; given our current situation, this work provides a case study in how to perform observational research when geographic and bodily distance have become the norm. We discuss our experiences and provide some preliminary insights for others looking to conduct remote observational research in the field.