CY · Apr 12, 2023
Positive AI: Key Challenges in Designing Artificial Intelligence for Wellbeing
Willem van der Maden, Derek Lomas, Malak Sadek et al.
Artificial Intelligence (AI) is a double-edged sword: on one hand, AI promises to provide great advances that could benefit humanity, but on the other hand, AI poses substantial (even existential) risks. With advancements happening daily, many people are increasingly worried about AI's impact on their lives. To ensure AI progresses beneficially, some researchers have proposed "wellbeing" as a key objective to govern AI. This article addresses key challenges in designing AI for wellbeing. We group these challenges into issues of modeling wellbeing in context, assessing wellbeing in context, designing interventions to improve wellbeing, and maintaining AI alignment with wellbeing over time. The identification of these challenges provides a scope for efforts to help ensure that AI developments are aligned with human wellbeing.
SE · Jan 25
Results-Actionability Gap: Understanding How Practitioners Evaluate LLM Products in the Wild
Willem van der Maden, Malak Sadek, Ziang Xiao et al.
How do product teams evaluate LLM-powered products? As organizations integrate large language models (LLMs) into digital products, their unpredictable nature makes traditional evaluation approaches inadequate, yet little is known about how practitioners navigate this challenge. Through interviews with nineteen practitioners across diverse sectors, we identify ten evaluation practices spanning informal 'vibe checks' to organizational meta-work. Beyond confirming four documented challenges, we introduce a novel fifth we call the results-actionability gap, in which practitioners gather evaluation data but cannot translate findings into concrete improvements. Drawing on patterns from successful teams, we contribute strategies to bridge this gap, supporting practitioners' formalization journey from ad-hoc interpretive practices (e.g., vibe checks) toward systematic evaluation. Our analysis suggests these interpretive practices are necessary adaptations to LLM characteristics rather than methodological failures. For HCI researchers, this presents a research opportunity to support practitioners in systematizing emerging practices rather than developing new evaluation frameworks.
CY · Mar 11, 2025
When Discourse Stalls: Moving Past Five Semantic Stopsigns about Generative AI in Design Research
Willem van der Maden, Vera van der Burg, Brett A. Halperin et al.
This essay examines how Generative AI (GenAI) is rapidly transforming design practices and how discourse often falls into over-simplified narratives that impede meaningful research and practical progress. We identify and deconstruct five prevalent "semantic stopsigns" -- reductive framings about GenAI in design that halt deeper inquiry and limit productive engagement. Reflecting upon two expert workshops at ACM conferences and semi-structured interviews with design practitioners, we analyze how these stopsigns manifest in research and practice. Our analysis develops mid-level knowledge that bridges theoretical discourse and practical implementation, helping designers and researchers interrogate common assumptions about GenAI in their own contexts. By recasting these stopsigns into more nuanced frameworks, we provide the design research community with practical approaches for thinking about and working with these emerging technologies.
AI · Feb 2, 2024
Developing and Evaluating a Design Method for Positive Artificial Intelligence
Willem van der Maden, Derek Lomas, Paul Hekkert
As artificial intelligence (AI) continues advancing, ensuring positive societal impacts becomes critical, especially as AI systems become increasingly ubiquitous in various aspects of life. However, developing "AI for good" poses substantial challenges around aligning systems with complex human values. Presently, we lack mature methods for addressing these challenges. This article presents and evaluates the Positive AI design method aimed at addressing this gap. The method provides a human-centered process to translate wellbeing aspirations into concrete practices. First, we explain the method's four key steps: contextualizing, operationalizing, optimizing, and implementing wellbeing supported by continuous measurement for feedback cycles. We then present a multiple case study where novice designers applied the method, revealing strengths and weaknesses related to efficacy and usability. Next, an expert evaluation study assessed the quality of the resulting concepts, rating them moderately high for feasibility, desirability, and plausibility of achieving intended wellbeing benefits. Together, these studies provide preliminary validation of the method's ability to improve AI design, while surfacing areas needing refinement like developing support for complex steps. Proposed adaptations such as examples and evaluation heuristics could address weaknesses. Further research should examine sustained application over multiple projects. This human-centered approach shows promise for realizing the vision of 'AI for Wellbeing' that does not just avoid harm, but actively benefits humanity.