Katrien Verbert

HC
h-index48
15papers
660citations
Novelty37%
AI Score36

15 Papers

HCFeb 21, 2023
Directive Explanations for Monitoring the Risk of Diabetes Onset: Introducing Directive Data-Centric Explanations and Combinations to Support What-If Explorations

Aditya Bhattacharya, Jeroen Ooge, Gregor Stiglic et al.

Explainable artificial intelligence is increasingly used in machine learning (ML) based decision-making systems in healthcare. However, little research has compared the utility of different explanation methods in guiding healthcare experts for patient care. Moreover, it is unclear how useful, understandable, actionable and trustworthy these methods are for healthcare experts, as they often require technical ML knowledge. This paper presents an explanation dashboard that predicts the risk of diabetes onset and explains those predictions with data-centric, feature-importance, and example-based explanations. We designed an interactive dashboard to assist healthcare experts, such as nurses and physicians, in monitoring the risk of diabetes onset and recommending measures to minimize risk. We conducted a qualitative study with 11 healthcare experts and a mixed-methods study with 45 healthcare experts and 51 diabetic patients to compare the different explanation methods in our dashboard in terms of understandability, usefulness, actionability, and trust. Results indicate that our participants preferred our representation of data-centric explanations that provide local explanations with a global overview over other methods. Therefore, this paper highlights the importance of visually directive data-centric explanation method for assisting healthcare experts to gain actionable insights from patient health records. Furthermore, we share our design implications for tailoring the visual representation of different explanation methods for healthcare experts.

HCJul 31, 2023
Towards a Comprehensive Human-Centred Evaluation Framework for Explainable AI

Ivania Donoso-Guzmán, Jeroen Ooge, Denis Parra et al.

While research on explainable AI (XAI) is booming and explanation techniques have proven promising in many application domains, standardised human-centred evaluation procedures are still missing. In addition, current evaluation procedures do not assess XAI methods holistically in the sense that they do not treat explanations' effects on humans as a complex user experience. To tackle this challenge, we propose to adapt the User-Centric Evaluation Framework used in recommender systems: we integrate explanation aspects, summarise explanation properties, indicate relations between them, and categorise metrics that measure these properties. With this comprehensive evaluation framework, we hope to contribute to the human-centred standardisation of XAI evaluation.

LGOct 3, 2023
Lessons Learned from EXMOS User Studies: A Technical Report Summarizing Key Takeaways from User Studies Conducted to Evaluate The EXMOS Platform

Aditya Bhattacharya, Simone Stumpf, Lucija Gosak et al.

In the realm of interactive machine-learning systems, the provision of explanations serves as a vital aid in the processes of debugging and enhancing prediction models. However, the extent to which various global model-centric and data-centric explanations can effectively assist domain experts in detecting and resolving potential data-related issues for the purpose of model improvement has remained largely unexplored. In this technical report, we summarise the key findings of our two user studies. Our research involved a comprehensive examination of the impact of global explanations rooted in both data-centric and model-centric perspectives within systems designed to support healthcare experts in optimising machine learning models through both automated and manual data configurations. To empirically investigate these dynamics, we conducted two user studies, comprising quantitative analysis involving a sample size of 70 healthcare experts and qualitative assessments involving 30 healthcare experts. These studies were aimed at illuminating the influence of different explanation types on three key dimensions: trust, understandability, and model improvement. Results show that global model-centric explanations alone are insufficient for effectively guiding users during the intricate process of data configuration. In contrast, data-centric explanations exhibited their potential by enhancing the understanding of system changes that occur post-configuration. However, a combination of both showed the highest level of efficacy for fostering trust, improving understandability, and facilitating model enhancement among healthcare experts. We also present essential implications for developing interactive machine-learning systems driven by explanations. These insights can guide the creation of more effective systems that empower domain experts to harness the full potential of machine learning

AIFeb 1, 2024
EXMOS: Explanatory Model Steering Through Multifaceted Explanations and Data Configurations

Aditya Bhattacharya, Simone Stumpf, Lucija Gosak et al.

Explanations in interactive machine-learning systems facilitate debugging and improving prediction models. However, the effectiveness of various global model-centric and data-centric explanations in aiding domain experts to detect and resolve potential data issues for model improvement remains unexplored. This research investigates the influence of data-centric and model-centric global explanations in systems that support healthcare experts in optimising models through automated and manual data configurations. We conducted quantitative (n=70) and qualitative (n=30) studies with healthcare experts to explore the impact of different explanations on trust, understandability and model improvement. Our results reveal the insufficiency of global model-centric explanations for guiding users during data configuration. Although data-centric explanations enhanced understanding of post-configuration system changes, a hybrid fusion of both explanation types demonstrated the highest effectiveness. Based on our study results, we also present design implications for effective explanation-driven interactive machine-learning systems.

HCMay 17, 2024
An Explanatory Model Steering System for Collaboration between Domain Experts and AI

Aditya Bhattacharya, Simone Stumpf, Katrien Verbert

With the increasing adoption of Artificial Intelligence (AI) systems in high-stake domains, such as healthcare, effective collaboration between domain experts and AI is imperative. To facilitate effective collaboration between domain experts and AI systems, we introduce an Explanatory Model Steering system that allows domain experts to steer prediction models using their domain knowledge. The system includes an explanation dashboard that combines different types of data-centric and model-centric explanations and allows prediction models to be steered through manual and automated data configuration approaches. It allows domain experts to apply their prior knowledge for configuring the underlying training data and refining prediction models. Additionally, our model steering system has been evaluated for a healthcare-focused scenario with 174 healthcare experts through three extensive user studies. Our findings highlight the importance of involving domain experts during model steering, ultimately leading to improved human-AI collaboration.

HCDec 26, 2024
Explanatory Debiasing: Involving Domain Experts in the Data Generation Process to Mitigate Representation Bias in AI Systems

Aditya Bhattacharya, Simone Stumpf, Robin De Croon et al.

Representation bias is one of the most common types of biases in artificial intelligence (AI) systems, causing AI models to perform poorly on underrepresented data segments. Although AI practitioners use various methods to reduce representation bias, their effectiveness is often constrained by insufficient domain knowledge in the debiasing process. To address this gap, this paper introduces a set of generic design guidelines for effectively involving domain experts in representation debiasing. We instantiated our proposed guidelines in a healthcare-focused application and evaluated them through a comprehensive mixed-methods user study with 35 healthcare experts. Our findings show that involving domain experts can reduce representation bias without compromising model accuracy. Based on our findings, we also offer recommendations for developers to build robust debiasing systems guided by our generic design guidelines, ensuring more effective inclusion of domain experts in the debiasing process.

HCJun 25, 2025
Visual-Conversational Interface for Evidence-Based Explanation of Diabetes Risk Prediction

Reza Samimi, Aditya Bhattacharya, Lucija Gosak et al.

Healthcare professionals need effective ways to use, understand, and validate AI-driven clinical decision support systems. Existing systems face two key limitations: complex visualizations and a lack of grounding in scientific evidence. We present an integrated decision support system that combines interactive visualizations with a conversational agent to explain diabetes risk assessments. We propose a hybrid prompt handling approach combining fine-tuned language models for analytical queries with general Large Language Models (LLMs) for broader medical questions, a methodology for grounding AI explanations in scientific evidence, and a feature range analysis technique to support deeper understanding of feature contributions. We conducted a mixed-methods study with 30 healthcare professionals and found that the conversational interactions helped healthcare professionals build a clear understanding of model assessments, while the integration of scientific evidence calibrated trust in the system's decisions. Most participants reported that the system supported both patient risk evaluation and recommendation.

HCJun 16, 2025
A Systematic Review of User-Centred Evaluation of Explainable AI in Healthcare

Ivania Donoso-Guzmán, Kristýna Sirka Kacafírková, Maxwell Szymanski et al.

Despite promising developments in Explainable Artificial Intelligence, the practical value of XAI methods remains under-explored and insufficiently validated in real-world settings. Robust and context-aware evaluation is essential, not only to produce understandable explanations but also to ensure their trustworthiness and usability for intended users, but tends to be overlooked because of no clear guidelines on how to design an evaluation with users. This study addresses this gap with two main goals: (1) to develop a framework of well-defined, atomic properties that characterise the user experience of XAI in healthcare; and (2) to provide clear, context-sensitive guidelines for defining evaluation strategies based on system characteristics. We conducted a systematic review of 82 user studies, sourced from five databases, all situated within healthcare settings and focused on evaluating AI-generated explanations. The analysis was guided by a predefined coding scheme informed by an existing evaluation framework, complemented by inductive codes developed iteratively. The review yields three key contributions: (1) a synthesis of current evaluation practices, highlighting a growing focus on human-centred approaches in healthcare XAI; (2) insights into the interrelations among explanation properties; and (3) an updated framework and a set of actionable guidelines to support interdisciplinary teams in designing and implementing effective evaluation strategies for XAI systems tailored to specific application contexts.

CYMay 22, 2025
Let's Get You Hired: A Job Seeker's Perspective on Multi-Agent Recruitment Systems for Explaining Hiring Decisions

Aditya Bhattacharya, Katrien Verbert

During job recruitment, traditional applicant selection methods often lack transparency. Candidates are rarely given sufficient justifications for recruiting decisions, whether they are made manually by human recruiters or through the use of black-box Applicant Tracking Systems (ATS). To address this problem, our work introduces a multi-agent AI system that uses Large Language Models (LLMs) to guide job seekers during the recruitment process. Using an iterative user-centric design approach, we first conducted a two-phased exploratory study with four active job seekers to inform the design and development of the system. Subsequently, we conducted an in-depth, qualitative user study with 20 active job seekers through individual one-to-one interviews to evaluate the developed prototype. The results of our evaluation demonstrate that participants perceived our multi-agent recruitment system as significantly more actionable, trustworthy, and fair compared to traditional methods. Our study further helped us uncover in-depth insights into factors contributing to these perceived user experiences. Drawing from these insights, we offer broader design implications for building user-aligned, multi-agent explainable AI systems across diverse domains.

HCJun 26, 2024
Petal-X: Human-Centered Visual Explanations to Improve Cardiovascular Risk Communication

Diego Rojo, Houda Lamqaddam, Lucija Gosak et al.

Cardiovascular diseases (CVDs), the leading cause of death worldwide, can be prevented in most cases through behavioral interventions. Therefore, effective communication of CVD risk and projected risk reduction by risk factor modification plays a crucial role in reducing CVD risk at the individual level. However, despite interest in refining risk estimation with improved prediction models such as SCORE2, the guidelines for presenting these risk estimations in clinical practice remained essentially unchanged in the last few years, with graphical score charts (GSCs) continuing to be one of the prevalent systems. This work describes the design and implementation of Petal-X, a novel tool to support clinician-patient shared decision-making by explaining the CVD risk contributions of different factors and facilitating what-if analysis. Petal-X relies on a novel visualization, Petal Product Plots, and a tailor-made global surrogate model of SCORE2, whose fidelity is comparable to that of the GSCs used in clinical practice. We evaluated Petal-X compared to GSCs in a controlled experiment with 88 healthcare students, all but one with experience with chronic patients. The results show that Petal-X outperforms GSC in critical tasks, such as comparing the contribution to the patient's 10-year CVD risk of each modifiable risk factor, without a significant loss of perceived transparency, trust, or intent to use. Our study provides an innovative approach to the visualization and explanation of risk in clinical practice that, due to its model-agnostic nature, could continue to support next-generation artificial intelligence risk assessment models.

SIJan 23, 2022
Perceptual Effects of Hierarchy in Art Historical Social Networks

Houda Lamqaddam, Inez De Prekel, Koenraad Brosens et al.

Network representation is a crucial topic in historical social network analysis. The debate around their value and connotations, led by humanist scholars, is today more relevant than ever, seeing how common these representations are as support for historical analysis. Force-directed networks, in particular, are popular as they can be developed relatively quickly, and reveal patterns and structures in data. The underlying algorithms, although powerful in revealing hidden patterns, do not retain meaningful structure and existing hierarchies within historical social networks. In this article, we question the foreign aspect of this structure that force-directed layout create in historical datasets. We explore the importance of hierarchy in social networks, and investigate whether hierarchies -- strongly present within our models of social structure -- affect our perception of social network data. Results from our user evaluation indicate that hierarchical network representations reduce cognitive load and leads to more frequent and deeper insights into historical social networks. We also find that users report a preference for the hierarchical graph representation. We analyse these findings in light of the broader discussion on the value of force-directed network representations within humanistic research, and introduce open questions for future work in this line of research.

HCSep 16, 2021
Trust in Prediction Models: a Mixed-Methods Pilot Study on the Impact of Domain Expertise

Jeroen Ooge, Katrien Verbert

People's trust in prediction models can be affected by many factors, including domain expertise like knowledge about the application domain and experience with predictive modelling. However, to what extent and why domain expertise impacts people's trust is not entirely clear. In addition, accurately measuring people's trust remains challenging. We share our results and experiences of an exploratory pilot study in which four people experienced with predictive modelling systematically explore a visual analytics system with an unknown prediction model. Through a mixed-methods approach involving Likert-type questions and a semi-structured interview, we investigate how people's trust evolves during their exploration, and we distil six themes that affect their trust in the prediction model. Our results underline the multi-faceted nature of trust, and suggest that domain expertise alone cannot fully predict people's trust perceptions.

HCJan 28, 2021
AHMoSe: A Knowledge-Based Visual Support System for Selecting Regression Machine Learning Models

Diego Rojo, Nyi Nyi Htun, Denis Parra et al.

Decision support systems have become increasingly popular in the domain of agriculture. With the development of automated machine learning, agricultural experts are now able to train, evaluate and make predictions using cutting edge machine learning (ML) models without the need for much ML knowledge. Although this automated approach has led to successful results in many scenarios, in certain cases (e.g., when few labeled datasets are available) choosing among different models with similar performance metrics is a difficult task. Furthermore, these systems do not commonly allow users to incorporate their domain knowledge that could facilitate the task of model selection, and to gain insight into the prediction system for eventual decision making. To address these issues, in this paper we present AHMoSe, a visual support system that allows domain experts to better understand, diagnose and compare different regression models, primarily by enriching model-agnostic explanations with domain knowledge. To validate AHMoSe, we describe a use case scenario in the viticulture domain, grape quality prediction, where the system enables users to diagnose and select prediction models that perform better. We also discuss feedback concerning the design of the tool from both ML and viticulture experts.

HCAug 7, 2020
Middle-Aged Video Consumers' Beliefs About Algorithmic Recommendations on YouTube

Oscar Alvarado, Hendrik Heuer, Vero Vanden Abeele et al.

User beliefs about algorithmic systems are constantly co-produced through user interaction and the complex socio-technical systems that generate recommendations. Identifying these beliefs is crucial because they influence how users interact with recommendation algorithms. With no prior work on user beliefs of algorithmic video recommendations, practitioners lack relevant knowledge to improve the user experience of such systems. To address this problem, we conducted semi-structured interviews with middle-aged YouTube video consumers to analyze their user beliefs about the video recommendation system. Our analysis revealed different factors that users believe influence their recommendations. Based on these factors, we identified four groups of user beliefs: Previous Actions, Social Media, Recommender System, and Company Policy. Additionally, we propose a framework to distinguish the four main actors that users believe influence their video recommendations: the current user, other users, the algorithm, and the organization. This framework provides a new lens to explore design suggestions based on the agency of these four actors. It also exposes a novel aspect previously unexplored: the effect of corporate decisions on the interaction with algorithmic recommendations. While we found that users are aware of the existence of the recommendation system on YouTube, we show that their understanding of this system is limited.

LGFeb 20, 2020
Interpretability of machine learning based prediction models in healthcare

Gregor Stiglic, Primoz Kocbek, Nino Fijacko et al.

There is a need of ensuring machine learning models that are interpretable. Higher interpretability of the model means easier comprehension and explanation of future predictions for end-users. Further, interpretable machine learning models allow healthcare experts to make reasonable and data-driven decisions to provide personalized decisions that can ultimately lead to higher quality of service in healthcare. Generally, we can classify interpretability approaches in two groups where the first focuses on personalized interpretation (local interpretability) while the second summarizes prediction models on a population level (global interpretability). Alternatively, we can group interpretability methods into model-specific techniques, which are designed to interpret predictions generated by a specific model, such as a neural network, and model-agnostic approaches, which provide easy-to-understand explanations of predictions made by any machine learning model. Here, we give an overview of interpretability approaches and provide examples of practical interpretability of machine learning in different areas of healthcare, including prediction of health-related outcomes, optimizing treatments or improving the efficiency of screening for specific conditions. Further, we outline future directions for interpretable machine learning and highlight the importance of developing algorithmic solutions that can enable machine-learning driven decision making in high-stakes healthcare problems.