Gunnar König

ML
h-index48
16papers
668citations
Novelty50%
AI Score50

16 Papers

LGJun 3
Validity Threats for Foundation Model Research

Gunnar König, Martin Pawelczyk, Ulrike von Luxburg et al.

Controlled experiments are the backbone of machine learning research, but at the scale of modern foundation models, they have become prohibitively expensive. Instead, the community increasingly relies on research strategies that approximate the ideal experiment at a fraction of the cost: proxy experiments and scaling laws, observational studies with publicly available models, and single-run designs that leverage variation within individual training runs. In this work, we argue that there is no free lunch when approximating large-scale experiments on a compute budget. Specifically, savings in compute come at the cost of validity threats -- hidden and sometimes untestable assumptions that, when violated, can invalidate research claims. To help navigate such threats, we propose an evaluation framework that casts foundation model research as a causal inference problem. Within this framework, we evaluate different research strategies through four types of validity adapted from the empirical social sciences -- statistical, internal, external, and construct validity. We find that each strategy comes with a characteristic validity profile: proxy experiments trade external and construct validity for statistical and internal validity; observational studies face confounding and effect heterogeneity; and single-run designs are strained by interference between treated units. This analysis reveals several validity threats that have received insufficient attention in the literature. Overall, our evaluation framework provides researchers with a practical toolkit for scrutinizing validity threats in foundation model research~designs.

MLOct 27, 2022
Improvement-Focused Causal Recourse (ICR)

Gunnar König, Timo Freiesleben, Moritz Grosse-Wentrup

Algorithmic recourse recommendations, such as Karimi et al.'s (2021) causal recourse (CR), inform stakeholders of how to act to revert unfavourable decisions. However, some actions lead to acceptance (i.e., revert the model's decision) but do not lead to improvement (i.e., may not revert the underlying real-world state). To recommend such actions is to recommend fooling the predictor. We introduce a novel method, Improvement-Focused Causal Recourse (ICR), which involves a conceptual shift: Firstly, we require ICR recommendations to guide towards improvement. Secondly, we do not tailor the recommendations to be accepted by a specific predictor. Instead, we leverage causal knowledge to design decision systems that predict accurately pre- and post-recourse. As a result, improvement guarantees translate into acceptance guarantees. We demonstrate that given correct causal knowledge, ICR, in contrast to existing approaches, guides towards both acceptance and improvement.

MLJun 11, 2022
Scientific Inference With Interpretable Machine Learning: Analyzing Models to Learn About Real-World Phenomena

Timo Freiesleben, Gunnar König, Christoph Molnar et al.

To learn about real world phenomena, scientists have traditionally used models with clearly interpretable elements. However, modern machine learning (ML) models, while powerful predictors, lack this direct elementwise interpretability (e.g. neural network weights). Interpretable machine learning (IML) offers a solution by analyzing models holistically to derive interpretations. Yet, current IML research is focused on auditing ML models rather than leveraging them for scientific inference. Our work bridges this gap, presenting a framework for designing IML methods-termed 'property descriptors' -- that illuminate not just the model, but also the phenomenon it represents. We demonstrate that property descriptors, grounded in statistical learning theory, can effectively reveal relevant properties of the joint probability distribution of the observational data. We identify existing IML methods suited for scientific inference and provide a guide for developing new descriptors with quantified epistemic uncertainty. Our framework empowers scientists to harness ML models for inference, and provides directions for future IML research to support scientific understanding.

MLApr 6, 2023
Efficient SAGE Estimation via Causal Structure Learning

Christoph Luther, Gunnar König, Moritz Grosse-Wentrup

The Shapley Additive Global Importance (SAGE) value is a theoretically appealing interpretability method that fairly attributes global importance to a model's features. However, its exact calculation requires the computation of the feature's surplus performance contributions over an exponential number of feature sets. This is computationally expensive, particularly because estimating the surplus contributions requires sampling from conditional distributions. Thus, SAGE approximation algorithms only take a fraction of the feature sets into account. We propose $d$-SAGE, a method that accelerates SAGE approximation. $d$-SAGE is motivated by the observation that conditional independencies (CIs) between a feature and the model target imply zero surplus contributions, such that their computation can be skipped. To identify CIs, we leverage causal structure learning (CSL) to infer a graph that encodes (conditional) independencies in the data as $d$-separations. This is computationally more efficient because the expense of the one-time graph inference and the $d$-separation queries is negligible compared to the expense of surplus contribution evaluations. Empirically we demonstrate that $d$-SAGE enables the efficient and accurate estimation of SAGE values.

AIJun 7, 2023
Dear XAI Community, We Need to Talk! Fundamental Misconceptions in Current XAI Research

Timo Freiesleben, Gunnar König

Despite progress in the field, significant parts of current XAI research are still not on solid conceptual, ethical, or methodological grounds. Unfortunately, these unfounded parts are not on the decline but continue to grow. Many explanation techniques are still proposed without clarifying their purpose. Instead, they are advertised with ever more fancy-looking heatmaps or only seemingly relevant benchmarks. Moreover, explanation techniques are motivated with questionable goals, such as building trust, or rely on strong assumptions about the 'concepts' that deep learning algorithms learn. In this paper, we highlight and discuss these and other misconceptions in current XAI research. We also suggest steps to make XAI a more substantive area of research.

MLMay 15
Explainable AI Isn't Enough! Rethinking Algorithmic Contestability

Timo Freiesleben, Kristof Meding, Gunnar König

Machine learning systems increasingly make life-changing decisions about individuals, such as loan approvals, hiring, and cheating detection, raising a pressing question: how can individuals respond to negative decisions made by these opaque systems? While explainable artificial intelligence (XAI) has largely focused on algorithmic recourse -- helping individuals change their features to obtain a desired outcome -- the parallel problem of algorithmic contestability -- helping individuals review and correct erroneous algorithmic decisions -- has received far less attention, despite its central ethical and legal importance. We trace this neglect to the absence of clear formal definitions and a systematic operationalization of contestability as an algorithmic problem. To address it, we propose an operational definition of contestability as a natural complement to recourse: contestability starts from the presumption that a decision may be incorrect and focuses on identifying evidence to challenge and potentially overturn it, whereas recourse assumes the decision is valid and instead provides pathways for changing it. We show that standard XAI explanations, such as counterfactuals, LIME, or Anchors, even when combined with human intuitions about decision continuity or monotonicity, reveal only errors in the neighborhood of the individual, but provide insufficient grounds for overturning the decision at hand. Going thus beyond traditional XAI, we identify three types of evidence warranting reversal according to the decision maker's own ethical standards: predictive multiplicity, incorrect feature values, and neglected overruling evidence. We argue that these render decisions normatively indefensible and thus successfully contestable. Finally, we analyze how existing EU legislation connects to our framework and argue that individuals already hold some legal rights to these forms of evidence.

MLApr 19, 2024
A Guide to Feature Importance Methods for Scientific Inference

Fiona Katharina Ewald, Ludwig Bothmann, Marvin N. Wright et al.

While machine learning (ML) models are increasingly used due to their high predictive power, their use in understanding the data-generating process (DGP) is limited. Understanding the DGP requires insights into feature-target associations, which many ML models cannot directly provide due to their opaque internal mechanisms. Feature importance (FI) methods provide useful insights into the DGP under certain conditions. Since the results of different FI methods have different interpretations, selecting the correct FI method for a concrete use case is crucial and still requires expert knowledge. This paper serves as a comprehensive guide to help understand the different interpretations of global FI methods. Through an extensive review of FI methods and providing new proofs regarding their interpretation, we facilitate a thorough understanding of these methods and formulate concrete recommendations for scientific inference. We conclude by discussing options for FI uncertainty estimation and point to directions for future research aiming at full statistical inference from black-box ML models.

MLApr 4, 2024
CountARFactuals -- Generating plausible model-agnostic counterfactual explanations with adversarial random forests

Susanne Dandl, Kristin Blesch, Timo Freiesleben et al.

Counterfactual explanations elucidate algorithmic decisions by pointing to scenarios that would have led to an alternative, desired outcome. Giving insight into the model's behavior, they hint users towards possible actions and give grounds for contesting decisions. As a crucial factor in achieving these goals, counterfactuals must be plausible, i.e., describing realistic alternative scenarios within the data manifold. This paper leverages a recently developed generative modeling technique -- adversarial random forests (ARFs) -- to efficiently generate plausible counterfactuals in a model-agnostic way. ARFs can serve as a plausibility measure or directly generate counterfactual explanations. Our ARF-based approach surpasses the limitations of existing methods that aim to generate plausible counterfactual explanations: It is easy to train and computationally highly efficient, handles continuous and categorical data naturally, and allows integrating additional desiderata such as sparsity in a straightforward manner.

LGOct 31, 2024
Disentangling Interactions and Dependencies in Feature Attribution

Gunnar König, Eric Günther, Ulrike von Luxburg

In explainable machine learning, global feature importance methods try to determine how much each individual feature contributes to predicting the target variable, resulting in one importance score for each feature. But often, predicting the target variable requires interactions between several features (such as in the XOR function), and features might have complex statistical dependencies that allow to partially replace one feature with another one. In commonly used feature importance scores these cooperative effects are conflated with the features' individual contributions, making them prone to misinterpretations. In this work, we derive DIP, a new mathematical decomposition of individual feature importance scores that disentangles three components: the standalone contribution and the contributions stemming from interactions and dependencies. We prove that the DIP decomposition is unique and show how it can be estimated in practice. Based on these results, we propose a new visualization of feature importance scores that clearly illustrates the different contributions.

MLJun 18, 2025
Performative Validity of Recourse Explanations

Gunnar König, Hidde Fokkema, Timo Freiesleben et al.

When applicants get rejected by an algorithmic decision system, recourse explanations provide actionable suggestions for how to change their input features to get a positive evaluation. A crucial yet overlooked phenomenon is that recourse explanations are performative: When many applicants act according to their recommendations, their collective behavior may change statistical regularities in the data and, once the model is refitted, also the decision boundary. Consequently, the recourse algorithm may render its own recommendations invalid, such that applicants who make the effort of implementing their recommendations may be rejected again when they reapply. In this work, we formally characterize the conditions under which recourse explanations remain valid under performativity. A key finding is that recourse actions may become invalid if they are influenced by or if they intervene on non-causal variables. Based on our analysis, we caution against the use of standard counterfactual explanations and causal recourse methods, and instead advocate for recourse methods that recommend actions exclusively on causal variables.

MLSep 3, 2021
Relating the Partial Dependence Plot and Permutation Feature Importance to the Data Generating Process

Christoph Molnar, Timo Freiesleben, Gunnar König et al.

Scientists and practitioners increasingly rely on machine learning to model data and draw conclusions. Compared to statistical modeling approaches, machine learning makes fewer explicit assumptions about data structures, such as linearity. However, their model parameters usually cannot be easily related to the data generating process. To learn about the modeled relationships, partial dependence (PD) plots and permutation feature importance (PFI) are often used as interpretation methods. However, PD and PFI lack a theory that relates them to the data generating process. We formalize PD and PFI as statistical estimators of ground truth estimands rooted in the data generating process. We show that PD and PFI estimates deviate from this ground truth due to statistical biases, model variance and Monte Carlo approximation errors. To account for model variance in PD and PFI estimation, we propose the learner-PD and the learner-PFI based on model refits, and propose corrected variance and confidence interval estimators.

MLJul 16, 2021
A Causal Perspective on Meaningful and Robust Algorithmic Recourse

Gunnar König, Timo Freiesleben, Moritz Grosse-Wentrup

Algorithmic recourse explanations inform stakeholders on how to act to revert unfavorable predictions. However, in general ML models do not predict well in interventional distributions. Thus, an action that changes the prediction in the desired way may not lead to an improvement of the underlying target. Such recourse is neither meaningful nor robust to model refits. Extending the work of Karimi et al. (2021), we propose meaningful algorithmic recourse (MAR) that only recommends actions that improve both prediction and target. We justify this selection constraint by highlighting the differences between model audit and meaningful, actionable recourse explanations. Additionally, we introduce a relaxation of MAR called effective algorithmic recourse (EAR), which, under certain assumptions, yields meaningful recourse by only allowing interventions on causes of the target.

MLJun 15, 2021
Decomposition of Global Feature Importance into Direct and Associative Components (DEDACT)

Gunnar König, Timo Freiesleben, Bernd Bischl et al.

Global model-agnostic feature importance measures either quantify whether features are directly used for a model's predictions (direct importance) or whether they contain prediction-relevant information (associative importance). Direct importance provides causal insight into the model's mechanism, yet it fails to expose the leakage of information from associated but not directly used variables. In contrast, associative importance exposes information leakage but does not provide causal insight into the model's mechanism. We introduce DEDACT - a framework to decompose well-established direct and associative importance measures into their respective associative and direct components. DEDACT provides insight into both the sources of prediction-relevant information in the data and the direct and indirect feature pathways by which the information enters the model. We demonstrate the method's usefulness on simulated examples.

MLJul 16, 2020
Relative Feature Importance

Gunnar König, Christoph Molnar, Bernd Bischl et al.

Interpretable Machine Learning (IML) methods are used to gain insight into the relevance of a feature of interest for the performance of a model. Commonly used IML methods differ in whether they consider features of interest in isolation, e.g., Permutation Feature Importance (PFI), or in relation to all remaining feature variables, e.g., Conditional Feature Importance (CFI). As such, the perturbation mechanisms inherent to PFI and CFI represent extreme reference points. We introduce Relative Feature Importance (RFI), a generalization of PFI and CFI that allows for a more nuanced feature importance computation beyond the PFI versus CFI dichotomy. With RFI, the importance of a feature relative to any other subset of features can be assessed, including variables that were not available at training time. We derive general interpretation rules for RFI based on a detailed theoretical analysis of the implications of relative feature relevance, and demonstrate the method's usefulness on simulated examples.

MLJul 8, 2020
General Pitfalls of Model-Agnostic Interpretation Methods for Machine Learning Models

Christoph Molnar, Gunnar König, Julia Herbinger et al.

An increasing number of model-agnostic interpretation techniques for machine learning (ML) models such as partial dependence plots (PDP), permutation feature importance (PFI) and Shapley values provide insightful model interpretations, but can lead to wrong conclusions if applied incorrectly. We highlight many general pitfalls of ML model interpretation, such as using interpretation techniques in the wrong context, interpreting models that do not generalize well, ignoring feature dependencies, interactions, uncertainty estimates and issues in high-dimensional settings, or making unjustified causal interpretations, and illustrate them with examples. We focus on pitfalls for global methods that describe the average model behavior, but many pitfalls also apply to local methods that explain individual predictions. Our paper addresses ML practitioners by raising awareness of pitfalls and identifying solutions for correct model interpretation, but also addresses ML researchers by discussing open issues for further research.

MLJun 8, 2020
Model-agnostic Feature Importance and Effects with Dependent Features -- A Conditional Subgroup Approach

Christoph Molnar, Gunnar König, Bernd Bischl et al.

The interpretation of feature importance in machine learning models is challenging when features are dependent. Permutation feature importance (PFI) ignores such dependencies, which can cause misleading interpretations due to extrapolation. A possible remedy is more advanced conditional PFI approaches that enable the assessment of feature importance conditional on all other features. Due to this shift in perspective and in order to enable correct interpretations, it is therefore important that the conditioning is transparent and humanly comprehensible. In this paper, we propose a new sampling mechanism for the conditional distribution based on permutations in conditional subgroups. As these subgroups are constructed using decision trees (transformation trees), the conditioning becomes inherently interpretable. This not only provides a simple and effective estimator of conditional PFI, but also local PFI estimates within the subgroups. In addition, we apply the conditional subgroups approach to partial dependence plots (PDP), a popular method for describing feature effects that can also suffer from extrapolation when features are dependent and interactions are present in the model. We show that PFI and PDP based on conditional subgroups often outperform methods such as conditional PFI based on knockoffs, or accumulated local effect plots. Furthermore, our approach allows for a more fine-grained interpretation of feature effects and importance within the conditional subgroups.