LG Aug 23, 2022
Evaluating Machine Unlearning via Epistemic Uncertainty
Alexander Becker, Thomas Liebig
There has been a growing interest in Machine Unlearning recently, primarily due to legal requirements such as the General Data Protection Regulation (GDPR) and the California Consumer Privacy Act. Thus, multiple approaches were presented to remove the influence of specific target data points from a trained model. However, when evaluating the success of unlearning, current approaches either use adversarial attacks or compare their results to the optimal solution, which usually incorporates retraining from scratch. We argue that both ways are insufficient in practice. In this work, we present an evaluation metric for Machine Unlearning algorithms based on epistemic uncertainty. This is the first definition of a general evaluation metric for Machine Unlearning to our best knowledge.
LG Aug 31, 2023
Rank Collapse Causes Over-Smoothing and Over-Correlation in Graph Neural Networks
Andreas Roth, Thomas Liebig
Our study reveals new theoretical insights into over-smoothing and feature over-correlation in graph neural networks. Specifically, we demonstrate that with increased depth, node representations become dominated by a low-dimensional subspace that depends on the aggregation function but not on the feature transformations. For all aggregation functions, the rank of the node representations collapses, resulting in over-smoothing for particular aggregation functions. Our study emphasizes the importance for future research to focus on rank collapse rather than over-smoothing. Guided by our theory, we propose a sum of Kronecker products as a beneficial property that provably prevents over-smoothing, over-correlation, and rank collapse. We empirically demonstrate the shortcomings of existing models in fitting target functions of node classification tasks.
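The collapse described in this abstract can be illustrated with a toy experiment. The sketch below is not the authors' code: the three-node path graph, the features `X`, the weight matrix `W`, and all helper names are assumptions chosen for illustration. It repeatedly applies a mean aggregation followed by a fixed feature transformation and measures how far apart the node representations remain.

```python
# Toy illustration (assumed setup): repeated mean aggregation on a 3-node
# path graph with self-loops, followed by a fixed linear transformation W.

def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def row_spread(h):
    """Largest entry-wise gap between any two rows, relative to the largest entry."""
    scale = max(abs(v) for row in h for v in row)
    gap = max(abs(h[i][k] - h[j][k])
              for i in range(len(h)) for j in range(len(h))
              for k in range(len(h[0])))
    return gap / scale

def deep_mean_aggregation(a_hat, x, w, depth):
    """Apply H <- A_hat @ H @ W for `depth` layers (no nonlinearity)."""
    h = x
    for _ in range(depth):
        h = matmul(matmul(a_hat, h), w)
    return h

# Row-normalized adjacency of the path 0-1-2 with self-loops (mean aggregation).
A_HAT = [[0.5, 0.5, 0.0],
         [1.0 / 3, 1.0 / 3, 1.0 / 3],
         [0.0, 0.5, 0.5]]
X = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]   # hypothetical node features
W = [[1.0, 0.5], [0.5, -0.7]]              # hypothetical feature transformation

if __name__ == "__main__":
    for depth in (1, 5, 40):
        print(depth, row_spread(deep_mean_aggregation(A_HAT, X, W, depth)))
```

In this toy setting the relative gap between rows shrinks towards zero as depth grows, so all nodes end up with nearly the same representation (rank one) even though `W` is applied at every layer. This only covers mean aggregation on one small graph; the general statement is the paper's theoretical result, not something this sketch establishes.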
LG Jul 1, 2022
Transforming PageRank into an Infinite-Depth Graph Neural Network
Andreas Roth, Thomas Liebig
Popular graph neural networks are shallow models, despite the success of very deep architectures in other application domains of deep learning. This reduces the modeling capacity and leaves models unable to capture long-range relationships. The primary reason for the shallow design results from over-smoothing, which leads node states to become more similar with increased depth. We build on the close connection between GNNs and PageRank, for which personalized PageRank introduces the consideration of a personalization vector. Adopting this idea, we propose the Personalized PageRank Graph Neural Network (PPRGNN), which extends the graph convolutional network to an infinite-depth model that has a chance to reset the neighbor aggregation back to the initial state in each iteration. We introduce a nicely interpretable tweak to the chance of resetting and prove the convergence of our approach to a unique solution without placing any constraints, even when taking infinitely many neighbor aggregations. As in personalized PageRank, our result does not suffer from over-smoothing. While doing so, time complexity remains linear while we keep memory complexity constant, independently of the depth of the network, making it scale well to large graphs. We empirically show the effectiveness of our approach for various node and graph classification tasks. PPRGNN outperforms comparable methods in almost all cases.
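The reset idea borrowed from personalized PageRank can be sketched as follows. This is not PPRGNN itself: the sketch uses a fixed reset probability `alpha`, no learned weights, and no nonlinearity, whereas the abstract describes a modified reset chance. The graph, the one-hot features, and all function names are assumptions made for illustration.

```python
# Minimal sketch (assumed setup): neighbor aggregation with a fixed chance
# `alpha` of resetting to the initial node features in every iteration.

def row_normalize_with_self_loops(adj):
    """Add self-loops and scale each row to sum to 1 (mean aggregation)."""
    n = len(adj)
    a_hat = []
    for i in range(n):
        row = [adj[i][j] + (1.0 if i == j else 0.0) for j in range(n)]
        total = sum(row)
        a_hat.append([v / total for v in row])
    return a_hat

def propagate_with_reset(adj, features, alpha, iters=200):
    """Iterate H <- (1 - alpha) * A_hat @ H + alpha * X, starting from H = X."""
    a_hat = row_normalize_with_self_loops(adj)
    n, d = len(features), len(features[0])
    h = [list(row) for row in features]
    for _ in range(iters):
        h = [[(1.0 - alpha) * sum(a_hat[i][j] * h[j][k] for j in range(n))
              + alpha * features[i][k]
              for k in range(d)] for i in range(n)]
    return h

def max_row_gap(h):
    """Largest entry-wise difference between any two node representations."""
    return max(abs(h[i][k] - h[j][k])
               for i in range(len(h)) for j in range(len(h))
               for k in range(len(h[0])))

PATH = [[0, 1, 0], [1, 0, 1], [0, 1, 0]]                        # path graph 0-1-2
ONE_HOT = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]   # initial states

if __name__ == "__main__":
    print("no reset  :", max_row_gap(propagate_with_reset(PATH, ONE_HOT, 0.0)))
    print("with reset:", max_row_gap(propagate_with_reset(PATH, ONE_HOT, 0.15)))
```

Without a reset (`alpha = 0.0`) all node states converge to the same vector, which is the over-smoothing behavior. With `alpha = 0.15` the iteration still converges, but the node states stay distinguishable because each step mixes the initial features back in.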
LG Nov 21, 2022
Forecasting Unobserved Node States with spatio-temporal Graph Neural Networks
Andreas Roth, Thomas Liebig
Forecasting future states of sensors is key to solving tasks like weather prediction, route planning, and many others when dealing with networks of sensors. But complete spatial coverage of sensors is generally unavailable and would practically be infeasible due to limitations in budget and other resources during deployment and maintenance. Existing machine learning approaches are limited to the spatial locations where data was observed, which restricts downstream tasks. Inspired by the recent surge of Graph Neural Networks for spatio-temporal data processing, we investigate whether these can also forecast the state of locations with no sensors available. For this purpose, we develop a framework, named Forecasting Unobserved Node States (FUNS), that allows forecasting the state at entirely unobserved locations based on spatio-temporal correlations and the graph inductive bias. FUNS serves as a blueprint for optimizing models only on observed data and demonstrates good generalization capabilities for predicting the state at entirely unobserved locations during the testing stage. Our framework can be combined with any spatio-temporal Graph Neural Network that exploits spatio-temporal correlations with surrounding observed locations by using the network's graph structure. Our employed model builds on a previous model by also allowing us to exploit prior knowledge about locations of interest, e.g., the road type. Our empirical evaluation of both simulated and real-world datasets demonstrates that Graph Neural Networks are well-suited for this task.
LG Aug 31, 2023
Curvature-based Pooling within Graph Neural Networks
Cedric Sanders, Andreas Roth, Thomas Liebig
Over-squashing and over-smoothing are two critical issues that limit the capabilities of graph neural networks (GNNs). While over-smoothing eliminates the differences between nodes, making them indistinguishable, over-squashing refers to the inability of GNNs to propagate information over long distances, as exponentially many node states are squashed into fixed-size representations. Both phenomena share similar causes, as both are largely induced by the graph topology. To mitigate these problems in graph classification tasks, we propose CurvPool, a novel pooling method. CurvPool exploits the notion of curvature of a graph to adaptively identify structures responsible for both over-smoothing and over-squashing. By clustering nodes based on the Balanced Forman curvature, CurvPool constructs a graph with a more suitable structure, allowing deeper models and the combination of distant information. We compare it to other state-of-the-art pooling approaches and establish its competitiveness in terms of classification accuracy, computational complexity, and flexibility. CurvPool outperforms several comparable methods across all considered tasks. The most consistent results are achieved by pooling densely connected clusters using the sum aggregation, as this allows additional information about the size of each pool.
LG Oct 4, 2022
Certified Data Removal in Sum-Product Networks
Alexander Becker, Thomas Liebig
Data protection regulations like the GDPR or the California Consumer Privacy Act give users more control over the data that is collected about them. Deleting the collected data is often insufficient to guarantee data privacy since it is often used to train machine learning models, which can expose information about the training data. Thus, a guarantee that a trained model does not expose information about its training data is additionally needed. In this paper, we present UnlearnSPN -- an algorithm that removes the influence of single data points from a trained sum-product network and thereby allows fulfilling data privacy requirements on demand.
LG Jan 15, 2023
Distributed LSTM-Learning from Differentially Private Label Proportions
Timon Sachweh, Daniel Boiar, Thomas Liebig
Data privacy and decentralized data collection have become increasingly popular in recent years. To address issues with privacy, communication bandwidth, and learning from spatio-temporal data, we propose two efficient models that use Differential Privacy and decentralized LSTM-Learning. In the first, a Long Short Term Memory (LSTM) model is learned for extracting local temporal node constraints and feeding them into a Dense-Layer (LabelProportionToLocal). The second approach extends the first by fetching histogram data from the neighbors and joining the information with the LSTM output (LabelProportionToDense). For evaluation, two popular datasets are used: Pems-Bay and METR-LA. Additionally, we provide our own dataset, which is based on LuST. The evaluation shows the tradeoff between performance and data privacy.
LG Dec 27, 2022
LOSDD: Leave-Out Support Vector Data Description for Outlier Detection
Daniel Boiar, Thomas Liebig, Erich Schubert
Support Vector Machines have been successfully used for one-class classification (OCSVM, SVDD) when trained on clean data, but they work much worse on dirty data: outliers present in the training data tend to become support vectors, and are hence considered "normal". In this article, we improve the effectiveness to detect outliers in dirty training data with a leave-out strategy: by temporarily omitting one candidate at a time, this point can be judged using the remaining data only. We show that this is more effective at scoring the outlierness of points than using the slack term of existing SVM-based approaches. Identified outliers can then be removed from the data, such that outliers hidden by other outliers can be identified, to reduce the problem of masking. Naively, this approach would require training N individual SVMs (and training $O(N^2)$ SVMs when iteratively removing the worst outliers one at a time), which is prohibitively expensive. We will discuss that only support vectors need to be considered in each step and that by reusing SVM parameters and weights, this incremental retraining can be accelerated substantially. By removing candidates in batches, we can further improve the processing time, although it obviously remains more costly than training a single SVM.
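The leave-out strategy can be illustrated with a deliberately simplified stand-in. The sketch below does not train an SVM and is not LOSDD: it swaps in the smallest enclosing interval of one-dimensional data, which plays the role of a hard-margin data description whose boundary points act like support vectors. The data and all function names are assumptions made for illustration.

```python
# Simplified stand-in (assumed setup): smallest enclosing interval in 1D
# instead of an SVM-based data description.

def fit_interval(points):
    """Return center and radius of the smallest interval covering all points."""
    lo, hi = min(points), max(points)
    return (lo + hi) / 2.0, (hi - lo) / 2.0

def score(x, center, radius):
    """Signed distance to the boundary: positive means outside the description."""
    return abs(x - center) - radius

def in_sample_scores(points):
    """Score every point with a model fitted on all points, itself included."""
    center, radius = fit_interval(points)
    return [score(x, center, radius) for x in points]

def leave_out_scores(points):
    """Score every point with a model fitted on the remaining points only."""
    scores = []
    for i, x in enumerate(points):
        rest = points[:i] + points[i + 1:]
        center, radius = fit_interval(rest)
        scores.append(score(x, center, radius))
    return scores

DATA = [0.0, 0.1, -0.1, 0.05, 5.0]  # hypothetical data; 5.0 is the outlier

if __name__ == "__main__":
    print("in-sample:", in_sample_scores(DATA))
    print("leave-out:", leave_out_scores(DATA))
```

Fitted on all points, the outlier sits exactly on the boundary and therefore looks normal. Once it is left out, the interval shrinks to the clean points and the outlier receives a large positive score. In this toy model only boundary points change the fit when omitted, which loosely mirrors the abstract's remark that only support vectors need to be considered.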
LG Dec 9, 2025
Graph Deep Learning for Intracranial Aneurysm Blood Flow Simulation and Risk Assessment
Paul Garnier, Pablo Jeken-Rico, Vincent Lannelongue et al.
Intracranial aneurysms remain a major cause of neurological morbidity and mortality worldwide, where rupture risk is tightly coupled to local hemodynamics, particularly wall shear stress and oscillatory shear index. Conventional computational fluid dynamics simulations provide accurate insights but are prohibitively slow and require specialized expertise. Clinical imaging alternatives such as 4D Flow MRI offer direct in-vivo measurements, yet they are impractical and expensive, and their spatial resolution remains insufficient to capture the fine-scale shear patterns that drive endothelial remodeling and rupture risk. We present a graph neural network surrogate model that bridges this gap by reproducing full-field hemodynamics directly from vascular geometries in less than one minute per cardiac cycle. Trained on a comprehensive dataset of high-fidelity simulations of patient-specific aneurysms, our architecture combines graph transformers with autoregressive predictions to accurately simulate blood flow, wall shear stress, and oscillatory shear index. The model generalizes across unseen patient geometries and inflow conditions without mesh-specific calibration. Beyond accelerating simulation, our framework establishes the foundation for clinically interpretable hemodynamic prediction. By enabling near real-time inference integrated with existing imaging pipelines, it allows direct comparison with hospital phase-diagram assessments and extends them with physically grounded, high-resolution flow fields. This work transforms high-fidelity simulations from an expert-only research tool into a deployable, data-driven decision support system. Our full pipeline delivers high-resolution hemodynamic predictions within minutes of patient imaging, without requiring computational specialists, marking a step-change toward real-time, bedside aneurysm analysis.
LG Sep 17, 2024
Preventing Representational Rank Collapse in MPNNs by Splitting the Computational Graph
Andreas Roth, Franka Bause, Nils M. Kriege et al.
The ability of message-passing neural networks (MPNNs) to fit complex functions over graphs is limited as most graph convolutions amplify the same signal across all feature channels, a phenomenon known as rank collapse, with over-smoothing as a special case. Most approaches to mitigate over-smoothing extend common message-passing schemes, e.g., the graph convolutional network, by utilizing residual connections, gating mechanisms, normalization, or regularization techniques. In contrast, our work proposes to directly tackle the cause of this issue by modifying the message-passing scheme and exchanging different types of messages using multi-relational graphs. We identify a sufficient condition to ensure linearly independent node representations. As one instantiation, we show that operating on multiple directed acyclic graphs always satisfies our condition and propose to obtain these by defining a strict partial ordering of the nodes. We conduct comprehensive experiments that confirm the benefits of operating on multi-relational graphs to achieve more informative node representations.
LG Oct 2, 2023
Distilling Influences to Mitigate Prediction Churn in Graph Neural Networks
Andreas Roth, Thomas Liebig
Models with similar performances exhibit significant disagreement in the predictions of individual samples, referred to as prediction churn. Our work explores this phenomenon in graph neural networks by investigating differences between models differing only in their initializations in their utilized features for predictions. We propose a novel metric called Influence Difference (ID) to quantify the variation in reasons used by nodes across models by comparing their influence distribution. Additionally, we consider the differences between nodes with a stable and an unstable prediction, positing that both equally utilize different reasons and thus provide a meaningful gradient signal to closely match two models even when the predictions for nodes are similar. Based on our analysis, we propose to minimize this ID in Knowledge Distillation, a domain where a new model should closely match an established one. As an efficient approximation, we introduce DropDistillation (DD) that matches the output for a graph perturbed by edge deletions. Our empirical evaluation of six benchmark datasets for node classification validates the differences in utilized features. DD outperforms previous methods regarding prediction stability and overall performance in all considered Knowledge Distillation experiments.
LG Sep 16, 2024
AALF: Almost Always Linear Forecasting
Matthias Jakobs, Thomas Liebig
Recent works for time-series forecasting increasingly leverage the high predictive power of Deep Learning models. With this increase in model complexity, however, comes a lack of understanding of the underlying model decision process, which is problematic for high-stakes application scenarios. At the same time, simple, interpretable forecasting methods such as ARIMA still perform very well, sometimes on par with Deep Learning approaches. We argue that simple models are good enough most of the time, and that forecasting performance could be improved by choosing a Deep Learning method only for a few important predictions, increasing the overall interpretability of the forecasting process. In this context, we propose a novel online model selection framework which learns to identify these predictions. An extensive empirical study on various real-world datasets shows that our selection methodology performs comparably to state-of-the-art online model selection methods in most cases while being significantly more interpretable. We find that almost always choosing a simple autoregressive linear model for forecasting results in competitive performance, suggesting that the need for opaque black-box models in time-series forecasting might be smaller than recent works would suggest.
LG Jul 5, 2024
Using Petri Nets as an Integrated Constraint Mechanism for Reinforcement Learning Tasks
Timon Sachweh, Pierre Haritz, Thomas Liebig
The lack of trust in algorithms is usually an issue when using Reinforcement Learning (RL) agents for control in real-world domains such as production plants, autonomous vehicles, or traffic-related infrastructure, partly due to the lack of verifiability of the model itself. In such scenarios, Petri nets (PNs) are often available for flowcharts or process steps, as they are versatile and standardized. In order to facilitate integration of RL models and as a step towards increasing AI trustworthiness, we propose an approach that uses PNs with three main advantages over typical RL approaches: Firstly, the agent can now easily be modeled with a combined state including both external environmental observations and agent-specific state information from a given PN. Secondly, we can enforce constraints for state-dependent actions through the inherent PN model. And lastly, we can increase trustworthiness by verifying PN properties through techniques such as model checking. We test our approach on a typical four-way intersection traffic light control setting and present our results, beating cycle-based baselines.
RO Jul 5, 2024
Enhancing Safety for Autonomous Agents in Partly Concealed Urban Traffic Environments Through Representation-Based Shielding
Pierre Haritz, David Wanke, Thomas Liebig
Navigating unsignalized intersections in urban environments poses a complex challenge for self-driving vehicles, where issues such as view obstructions, unpredictable pedestrian crossings, and diverse traffic participants demand a great focus on crash prevention. In this paper, we propose a novel state representation for Reinforcement Learning (RL) agents centered around the information perceivable by an autonomous agent, enabling the safe navigation of previously uncharted road maps. Our approach surpasses several baseline models by a significant margin in terms of safety and energy consumption metrics. These improvements are achieved while maintaining a competitive average travel speed. Our findings pave the way for more robust and reliable autonomous navigation strategies, promising safer and more efficient urban traffic environments.
AI Jan 21, 2025
Bridging the Communication Gap: Evaluating AI Labeling Practices for Trustworthy AI Development
Raphael Fischer, Magdalena Wischnewski, Alexander van der Staay et al.
As artificial intelligence (AI) becomes integral to economy and society, communication gaps between developers, users, and stakeholders hinder trust and informed decision-making. High-level AI labels, inspired by frameworks like EU energy labels, have been proposed to make the properties of AI models more transparent. Without requiring deep technical expertise, they can inform on the trade-off between predictive performance and resource efficiency. However, the practical benefits and limitations of AI labeling remain underexplored. This study evaluates AI labeling through qualitative interviews along four key research questions. Based on thematic analysis and inductive coding, we found a broad range of practitioners to be interested in AI labeling (RQ1). They see benefits for alleviating communication gaps and aiding non-expert decision-makers; however, limitations, misunderstandings, and suggestions for improvement were also discussed (RQ2). Compared to other reporting formats, interviewees positively evaluated the reduced complexity of labels, increasing overall comprehensibility (RQ3). Trust was influenced most by usability and the credibility of the responsible labeling authority, with mixed preferences for self-certification versus third-party certification (RQ4). Our insights highlight that AI labels pose a trade-off between simplicity and complexity, which could be resolved by developing customizable and interactive labeling frameworks to address diverse user needs. Transparent labeling of resource efficiency also nudged interviewee priorities towards paying more attention to sustainability aspects during AI development. This study validates AI labels as a valuable tool for enhancing trust and communication in AI, offering actionable guidelines for their refinement and standardization.
CY Nov 8, 2024
ICE-T: A Multi-Faceted Concept for Teaching Machine Learning
Hendrik Krone, Pierre Haritz, Thomas Liebig
The topics of Artificial intelligence (AI) and especially Machine Learning (ML) are increasingly making their way into educational curricula. To facilitate the access for students, a variety of platforms, visual tools, and digital games are already being used to introduce ML concepts and strengthen the understanding of how AI works. We take a look at didactic principles that are employed for teaching computer science, define criteria, and, based on those, evaluate a selection of prominent existing platforms, tools, and games. Additionally, we criticize the approach of portraying ML mostly as a black-box and the resulting missing focus on creating an understanding of data, algorithms, and models that come with it. To tackle this issue, we present a concept that covers intermodal transfer, computational and explanatory thinking, ICE-T, as an extension of known didactic principles. With our multi-faceted concept, we believe that planners of learning units, creators of learning platforms and educators can improve on teaching ML.
LG May 21, 2019
Conditional Sum-Product Networks: Imposing Structure on Deep Probabilistic Architectures
Xiaoting Shao, Alejandro Molina, Antonio Vergari et al.
Probabilistic graphical models are a central tool in AI; however, they are generally not as expressive as deep neural models, and inference is notoriously hard and slow. In contrast, deep probabilistic models such as sum-product networks (SPNs) capture joint distributions in a tractable fashion, but still lack the expressive power of intractable models based on deep neural networks. Therefore, we introduce conditional SPNs (CSPNs), conditional density estimators for multivariate and potentially hybrid domains which allow harnessing the expressive power of neural networks while still maintaining tractability guarantees. One way to implement CSPNs is to use an existing SPN structure and condition its parameters on the input, e.g., via a deep neural network. This approach, however, might misrepresent the conditional independence structure present in data. Consequently, we also develop a structure-learning approach that derives both the structure and parameters of CSPNs from data. Our experimental evidence demonstrates that CSPNs are competitive with other probabilistic models and yield superior performance on multilabel image classification compared to mean field and mixture density networks. Furthermore, they can successfully be employed as building blocks for structured probabilistic models, such as autoregressive image models.