Katharina Weitz

HC
h-index: 32
8 papers
209 citations
Novelty: 46%
AI Score: 41

8 Papers

60.2 AI Apr 13
From Attribution to Action: A Human-Centered Application of Activation Steering

Tobias Labarta, Maximilian Dreyer, Katharina Weitz et al.

Explainable AI (XAI) methods reveal which features influence model predictions, yet provide limited means for practitioners to act on these explanations. Activation steering of components identified via XAI offers a path toward actionable explanations, although its practical utility remains understudied. We introduce an interactive workflow combining SAE-based attribution with activation steering for instance-level analysis of concept usage in vision models, implemented as a web-based tool. Based on this workflow, we conduct semi-structured expert interviews (N=8) with debugging tasks on CLIP to investigate how practitioners reason about, trust, and apply activation steering. We find that steering enables a shift from inspection to intervention-based hypothesis testing (8/8 participants), with most grounding trust in observed model responses rather than explanation plausibility alone (6/8). Participants adopted systematic debugging strategies dominated by component suppression (7/8) and highlighted risks including ripple effects and limited generalization of instance-level corrections. Overall, activation steering renders interpretability more actionable while raising important considerations for safe and effective use.

AI Jul 19, 2022
Alterfactual Explanations -- The Relevance of Irrelevance for Explaining AI Systems

Silvan Mertes, Christina Karle, Tobias Huber et al.

Explanation mechanisms from the field of Counterfactual Thinking are a widely used paradigm for Explainable Artificial Intelligence (XAI), as they follow a natural way of reasoning that humans are familiar with. However, all common approaches from this field are based on communicating information about features or characteristics that are especially important for an AI's decision. We argue that fully understanding a decision requires not only knowledge about relevant features; awareness of irrelevant information also contributes substantially to a user's mental model of an AI system. Therefore, we introduce a new way of explaining AI systems. Our approach, which we call Alterfactual Explanations, is based on showing an alternative reality where irrelevant features of an AI's input are altered. By doing so, the user directly sees which characteristics of the input data can change arbitrarily without influencing the AI's decision. We evaluate our approach in an extensive user study, revealing that it is able to significantly contribute to the participants' understanding of an AI. We show that alterfactual explanations are suited to convey an understanding of different aspects of the AI's reasoning than established counterfactual explanation methods.

HC Oct 7, 2022
What Do End-Users Really Want? Investigation of Human-Centered XAI for Mobile Health Apps

Katharina Weitz, Alexander Zellner, Elisabeth André

In healthcare, AI systems support clinicians and patients in diagnosis, treatment, and monitoring, but many systems' poor explainability remains challenging for practical application. Overcoming this barrier is the goal of explainable AI (XAI). However, an explanation can be perceived differently and, thus, not solve the black-box problem for everyone. The domain of Human-Centered AI deals with this problem by adapting AI to users. We present a user-centered persona concept to evaluate XAI and use it to investigate end users' preferences for various explanation styles and contents in a mobile health stress monitoring application. The results of our online survey show that users' demographics and personality, as well as the type of explanation, impact explanation preferences, indicating that these are essential features for XAI design. We subsumed the results under three prototypical user personas: power-, casual-, and privacy-oriented users. Our insights bring an interactive, human-centered XAI closer to practical application.

HC Oct 7, 2022
Do We Need Explainable AI in Companies? Investigation of Challenges, Expectations, and Chances from Employees' Perspective

Katharina Weitz, Chi Tai Dang, Elisabeth André

Companies' adoption of artificial intelligence (AI) is increasingly becoming an essential element of business success. However, using AI poses new requirements for companies and their employees, including transparency and comprehensibility of AI systems. The field of Explainable AI (XAI) aims to address these issues. Yet, the current research primarily consists of laboratory studies, and there is a need to improve the applicability of the findings to real-world situations. Therefore, this project report paper provides insights into employees' needs and attitudes towards (X)AI. For this, we investigate employees' perspectives on (X)AI. Our findings suggest that AI and XAI are well-known terms perceived as important for employees. This recognition is a critical first step for XAI to potentially drive successful usage of AI by providing comprehensible insights into AI technologies. In a lessons-learned section, we discuss the open questions identified and suggest future research directions to develop human-centered XAI designs for companies. By providing insights into employees' needs and attitudes towards (X)AI, our project report contributes to the development of XAI solutions that meet the requirements of companies and their employees, ultimately driving the successful adoption of AI technologies in the business context.

CV May 8, 2024
Relevant Irrelevance: Generating Alterfactual Explanations for Image Classifiers

Silvan Mertes, Tobias Huber, Christina Karle et al.

In this paper, we demonstrate the feasibility of alterfactual explanations for black box image classifiers. Traditional explanation mechanisms from the field of Counterfactual Thinking are a widely used paradigm for Explainable Artificial Intelligence (XAI), as they follow a natural way of reasoning that humans are familiar with. However, most common approaches from this field are based on communicating information about features or characteristics that are especially important for an AI's decision. Yet fully understanding a decision requires not only knowledge about relevant features; awareness of irrelevant information also contributes substantially to a user's mental model of an AI system. To this end, a novel approach for explaining AI systems called alterfactual explanations was recently proposed on a conceptual level. It is based on showing an alternative reality where irrelevant features of an AI's input are altered. By doing so, the user directly sees which input data characteristics can change arbitrarily without influencing the AI's decision. In this paper, we show for the first time that it is possible to apply this idea to black box models based on neural networks. To this end, we present a GAN-based approach to generate these alterfactual explanations for binary image classifiers. Further, we present a user study that gives interesting insights into how alterfactual explanations can complement counterfactual explanations.

HC May 9, 2025
See What I Mean? CUE: A Cognitive Model of Understanding Explanations

Tobias Labarta, Nhi Hoang, Katharina Weitz et al.

As machine learning systems increasingly inform critical decisions, the need for human-understandable explanations grows. Current evaluations of Explainable AI (XAI) often prioritize technical fidelity over cognitive accessibility, which critically affects users, in particular those with visual impairments. We propose CUE, a model for Cognitive Understanding of Explanations, linking explanation properties to cognitive sub-processes: legibility (perception), readability (comprehension), and interpretability (interpretation). In a study (N=455) testing heatmaps with varying colormaps (BWR, Cividis, Coolwarm), we found comparable task performance but lower confidence/effort for visually impaired users. Contrary to expectations, these gaps were not mitigated and sometimes worsened by accessibility-focused colormaps like Cividis. These results challenge assumptions about perceptual optimization and support the need for adaptive XAI interfaces. They also validate CUE by demonstrating that altering explanation legibility affects understandability. We contribute: (1) a formalized cognitive model for explanation understanding, (2) an integrated definition of human-centered explanation properties, and (3) empirical evidence motivating accessible, user-tailored XAI.

LG Dec 22, 2020
GANterfactual - Counterfactual Explanations for Medical Non-Experts using Generative Adversarial Learning

Silvan Mertes, Tobias Huber, Katharina Weitz et al.

With the ongoing rise of machine learning, the need for methods for explaining decisions made by artificial intelligence systems is becoming increasingly important. Especially for image classification tasks, many state-of-the-art tools to explain such classifiers rely on visual highlighting of important areas of the input data. In contrast, counterfactual explanation systems try to enable counterfactual reasoning by modifying the input image in a way such that the classifier would have made a different prediction. By doing so, the users of counterfactual explanation systems are equipped with a completely different kind of explanatory information. However, methods for generating realistic counterfactual explanations for image classifiers are still rare. Especially in medical contexts, where relevant information often consists of textural and structural information, high-quality counterfactual images have the potential to give meaningful insights into decision processes. In this work, we present GANterfactual, an approach to generate such counterfactual image explanations based on adversarial image-to-image translation techniques. Additionally, we conduct a user study to evaluate our approach in an exemplary medical use case. Our results show that, in the chosen medical use case, counterfactual explanations lead to significantly better results regarding mental models, explanation satisfaction, trust, emotions, and self-efficacy than two state-of-the-art systems that work with saliency maps, namely LIME and LRP.

LG May 18, 2020
Local and Global Explanations of Agent Behavior: Integrating Strategy Summaries with Saliency Maps

Tobias Huber, Katharina Weitz, Elisabeth André et al.

With advances in reinforcement learning (RL), agents are now being developed in high-stakes application domains such as healthcare and transportation. Explaining the behavior of these agents is challenging, as the environments in which they act have large state spaces, and their decision-making can be affected by delayed rewards, making it difficult to analyze their behavior. To address this problem, several approaches have been developed. Some approaches attempt to convey the global behavior of the agent, describing the actions it takes in different states. Other approaches devise local explanations, which provide information regarding the agent's decision-making in a particular state. In this paper, we combine global and local explanation methods, and evaluate their joint and separate contributions, providing (to the best of our knowledge) the first user study of combined local and global explanations for RL agents. Specifically, we augment strategy summaries that extract important trajectories of states from simulations of the agent with saliency maps which show what information the agent attends to. Our results show that the choice of what states to include in the summary (global information) strongly affects people's understanding of agents: participants shown summaries that included important states significantly outperformed participants who were presented with agent behavior in a set of randomly chosen world-states. We find mixed results with respect to augmenting demonstrations with saliency maps (local information), as the addition of saliency maps did not significantly improve performance in most cases. However, we do find some evidence that saliency maps can help users better understand what information the agent relies on in its decision-making, suggesting avenues for future work that can further improve explanations of RL agents.