LG · May 18
TOAST: Transformer Optimization using Adaptive and Simple Transformations · Irene Cannistraci, Simone Antonelli, Emanuele Palumbo et al.
Foundation models achieve state-of-the-art performance across different tasks, but their size and computational demands raise concerns about accessibility and sustainability. Existing efficiency methods often require additional retraining or finetuning, limiting their practicality. Recent findings suggest that deep neural networks exhibit internal representation similarities. While such similarities across different models have been exploited for enabling techniques such as model stitching and merging, intra-network redundancy remains underexplored as a source for efficiency gains. In this paper, we introduce Transformer Optimization using Adaptive and Simple Transformations (TOAST), a framework that exploits these redundancies to approximate entire transformer blocks with lightweight closed-form mappings, such as linear transformations or even the identity function, without any additional training. Across state-of-the-art pretrained vision models (e.g., ViT, DINOv2, DeiT) and datasets ranging from MNIST to ImageNet-1k, TOAST reduces parameters and computation while preserving, and in some cases improving, downstream performance. These results show that large portions of transformer depth can be replaced by trivial functions, opening a new perspective on efficient foundation models.
LG · Jun 15, 2022
Taxonomy of Benchmarks in Graph Representation Learning · Renming Liu, Semih Cantürk, Frederik Wenkel et al. · mila
Graph Neural Networks (GNNs) extend the success of neural networks to graph-structured data by accounting for their intrinsic geometry. While extensive research has been done on developing GNN models with superior performance according to a collection of graph representation learning benchmarks, it is currently not well understood what aspects of a given model are probed by them. For example, to what extent do they test the ability of a model to leverage graph structure vs. node features? Here, we develop a principled approach to taxonomize benchmarking datasets according to a $\textit{sensitivity profile}$ that is based on how much GNN performance changes due to a collection of graph perturbations. Our data-driven analysis provides a deeper understanding of which benchmarking data characteristics are leveraged by GNNs. Consequently, our taxonomy can aid in the selection and development of adequate graph benchmarks, and in the better-informed evaluation of future GNN methods. Finally, our approach and its implementation in the $\texttt{GTaxoGym}$ package are extendable to multiple graph prediction task types and future datasets.
CV · Aug 30, 2022
A Diffusion Model Predicts 3D Shapes from 2D Microscopy Images · Dominik J. E. Waibel, Ernst Röell, Bastian Rieck et al.
Diffusion models are a special type of generative model, capable of synthesising new data from a learnt distribution. We introduce DISPR, a diffusion-based model for solving the inverse problem of three-dimensional (3D) cell shape prediction from two-dimensional (2D) single cell microscopy images. Using the 2D microscopy image as a prior, DISPR is conditioned to predict realistic 3D shape reconstructions. To showcase the applicability of DISPR as a data augmentation tool in a feature-based single cell classification task, we extract morphological features from the red blood cells grouped into six highly imbalanced classes. Adding features from the DISPR predictions to the three minority classes improved the macro F1 score from $F1_\text{macro} = 55.2 \pm 4.6\%$ to $F1_\text{macro} = 72.2 \pm 4.9\%$. We thus demonstrate that diffusion models can be successfully applied to inverse biomedical problems, and that they learn to reconstruct 3D shapes with realistic morphological features from 2D microscopy images.
LG · Mar 28, 2022
Time-inhomogeneous diffusion geometry and topology · Guillaume Huguet, Alexander Tong, Bastian Rieck et al. · mila
Diffusion condensation is a dynamic process that yields a sequence of multiscale data representations that aim to encode meaningful abstractions. It has proven effective for manifold learning, denoising, clustering, and visualization of high-dimensional data. Diffusion condensation is constructed as a time-inhomogeneous process where each step first computes and then applies a diffusion operator to the data. We theoretically analyze the convergence and evolution of this process from geometric, spectral, and topological perspectives. From a geometric perspective, we obtain convergence bounds based on the smallest transition probability and the radius of the data, whereas from a spectral perspective, our bounds are based on the eigenspectrum of the diffusion kernel. Our spectral results are of particular interest since most of the literature on data diffusion is focused on homogeneous processes. From a topological perspective, we show diffusion condensation generalizes centroid-based hierarchical clustering. We use this perspective to obtain a bound based on the number of data points, independent of their location. To understand the evolution of the data geometry beyond convergence, we use topological data analysis. We show that the condensation process itself defines an intrinsic condensation homology. We use this intrinsic topology as well as the ambient persistent homology of the condensation process to study how the data changes over diffusion time. We demonstrate both types of topological information in well-understood toy examples. Our work gives theoretical insights into the convergence of diffusion condensation, and shows that it provides a link between topological and geometric data analysis.
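The abstract above describes each condensation step as first computing a diffusion operator from the data and then applying it to the data. A minimal sketch of that loop follows. It assumes a row-normalised Gaussian kernel with a fixed bandwidth; the kernel choice, the bandwidth value, and the function names are illustrative assumptions and are not taken from the paper.

```python
import math


def diffusion_operator(points, bandwidth):
    """Row-normalised Gaussian kernel built from the current positions.

    The Gaussian kernel and the fixed bandwidth are assumptions of this sketch.
    """
    rows = []
    for p in points:
        weights = [math.exp(-math.dist(p, q) ** 2 / bandwidth) for q in points]
        total = sum(weights)
        rows.append([w / total for w in weights])
    return rows


def condensation_step(points, bandwidth):
    """Compute the operator from the data, then apply it to the data."""
    operator = diffusion_operator(points, bandwidth)
    dim = len(points[0])
    return [
        tuple(sum(row[j] * points[j][k] for j in range(len(points))) for k in range(dim))
        for row in operator
    ]


def diameter(points):
    """Largest pairwise distance, used here to watch the data contract."""
    return max(math.dist(p, q) for p in points for q in points)


# Two small clusters in the plane (toy data).
points = [(0.0, 0.0), (0.1, 0.0), (1.0, 1.0), (1.1, 0.9)]
trajectory = [points]
for _ in range(5):
    trajectory.append(condensation_step(trajectory[-1], bandwidth=0.5))

diameters = [diameter(step) for step in trajectory]
```

Because the operator is recomputed from the moved points at every step, the process is time-inhomogeneous. Each new point is a convex combination of the old ones with strictly positive weights, so the diameter of the point set shrinks from step to step.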
LG · May 29
Graph Neural Networks Are Not Continuous Across Graph Resolutions · Christian Koke, Yuesong Shen, Abhishek Saroha et al.
We show that contrary to conventional wisdom in the community, graph neural networks (GNNs) are not continuous with respect to all natural modes of graph convergence. As a result, GNNs may generate substantially different latent representations for graphs that are very similar. In particular they assign vastly different latent embeddings to graphs that represent the same underlying object at different resolution scales. We trace this failure of continuity back to a structural obstruction arising from commonly used information-propagation schemes. Building on this insight we then derive a principled modification to standard GNN architectures which equips models with continuity across scales. The proposed modification enables consistent integration of distinct resolutions and reliable generalization between them. We systematically validate our theoretical findings in a wide range of numerical experiments.
LG · Jan 30, 2023
Curvature Filtrations for Graph Generative Model Evaluation · Joshua Southern, Jeremy Wayland, Michael Bronstein et al.
Graph generative model evaluation necessitates understanding differences between graphs on the distributional level. This entails being able to harness salient attributes of graphs in an efficient manner. Curvature constitutes one such property that has recently proved its utility in characterising graphs. Its expressive properties, stability, and practical utility in model evaluation remain largely unexplored, however. We combine graph curvature descriptors with emerging methods from topological data analysis to obtain robust, expressive descriptors for evaluating graph generative models.
CV · Mar 3, 2022
Capturing Shape Information with Multi-Scale Topological Loss Terms for 3D Reconstruction · Dominik J. E. Waibel, Scott Atwell, Matthias Meier et al.
Reconstructing 3D objects from 2D images is challenging for both our brains and machine learning algorithms. To support this spatial reasoning task, contextual information about the overall shape of an object is critical. However, such information is not captured by established loss terms (e.g. Dice loss). We propose to complement geometrical shape information by including multi-scale topological features, such as connected components, cycles, and voids, in the reconstruction loss. Our method uses cubical complexes to calculate topological features of 3D volume data and employs an optimal transport distance to guide the reconstruction process. This topology-aware loss is fully differentiable, computationally efficient, and can be added to any neural network. We demonstrate the utility of our loss by incorporating it into SHAPR, a model for predicting the 3D cell shape of individual cells based on 2D microscopy images. Using a hybrid loss that leverages both geometrical and topological information of single objects to assess their shape, we find that topological information substantially improves the quality of reconstructions, thus highlighting its ability to extract more relevant features from image datasets.
LG · Feb 20, 2023
On the Expressivity of Persistent Homology in Graph Learning · Rubén Ballester, Bastian Rieck
Persistent homology, a technique from computational topology, has recently shown strong empirical performance in the context of graph classification. Being able to capture long range graph properties via higher-order topological features, such as cycles of arbitrary length, in combination with multi-scale topological descriptors, has improved predictive performance for data sets with prominent topological structures, such as molecules. At the same time, the theoretical properties of persistent homology have not been formally assessed in this context. This paper intends to bridge the gap between computational topology and graph machine learning by providing a brief introduction to persistent homology in the context of graphs, as well as a theoretical discussion and empirical analysis of its expressivity for graph learning tasks.
LG · Oct 21, 2022
Ollivier-Ricci Curvature for Hypergraphs: A Unified Framework · Corinna Coupette, Sebastian Dalleiger, Bastian Rieck
Bridging geometry and topology, curvature is a powerful and expressive invariant. While the utility of curvature has been theoretically and empirically confirmed in the context of manifolds and graphs, its generalization to the emerging domain of hypergraphs has remained largely unexplored. On graphs, the Ollivier-Ricci curvature measures differences between random walks via Wasserstein distances, thus grounding a geometric concept in ideas from probability theory and optimal transport. We develop ORCHID, a flexible framework generalizing Ollivier-Ricci curvature to hypergraphs, and prove that the resulting curvatures have favorable theoretical properties. Through extensive experiments on synthetic and real-world hypergraphs from different domains, we demonstrate that ORCHID curvatures are both scalable and useful to perform a variety of hypergraph tasks in practice.
LG · Oct 11, 2023
Differentiable Euler Characteristic Transforms for Shape Classification · Ernst Roell, Bastian Rieck
The Euler Characteristic Transform (ECT) has proven to be a powerful representation, combining geometrical and topological characteristics of shapes and graphs. However, the ECT was hitherto unable to learn task-specific representations. We overcome this issue and develop a novel computational layer that enables learning the ECT in an end-to-end fashion. Our method, the Differentiable Euler Characteristic Transform (DECT), is fast and computationally efficient, while exhibiting performance on a par with more complex models in both graph and point cloud classification tasks. Moreover, we show that this seemingly simple statistic provides the same topological expressivity as more complex topological deep learning layers.
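For readers unfamiliar with the ECT mentioned above, the following sketch computes the classical (non-learned) transform of a graph embedded in the plane. For each direction it sweeps a height threshold and records the Euler characteristic, vertices minus edges, of the part of the graph below that threshold. The function names, the toy triangle, and the choice of directions and thresholds are illustrative assumptions; this is not the DECT layer itself, and the hard threshold used here is not differentiable.

```python
def euler_characteristic_curve(vertices, edges, direction, thresholds):
    """Euler characteristic of the sublevel graph for each threshold.

    A vertex enters once its height (inner product with `direction`) is at
    most the threshold; an edge enters once both of its endpoints have.
    """
    heights = {
        name: sum(c * d for c, d in zip(coords, direction))
        for name, coords in vertices.items()
    }
    curve = []
    for t in thresholds:
        n_vertices = sum(1 for h in heights.values() if h <= t)
        n_edges = sum(1 for u, v in edges if max(heights[u], heights[v]) <= t)
        curve.append(n_vertices - n_edges)
    return curve


def ect(vertices, edges, directions, thresholds):
    """Stack one Euler characteristic curve per direction."""
    return [
        euler_characteristic_curve(vertices, edges, direction, thresholds)
        for direction in directions
    ]


# Toy example: a triangle (cycle graph on three vertices) in the plane.
vertices = {"a": (0.0, 0.0), "b": (1.0, 0.0), "c": (0.0, 1.0)}
edges = [("a", "b"), ("b", "c"), ("a", "c")]
directions = [(1.0, 0.0), (0.0, 1.0)]
thresholds = [-0.5, 0.0, 0.5, 1.0]

transform = ect(vertices, edges, directions, thresholds)
# transform == [[0, 1, 1, 0], [0, 1, 1, 0]]
```

In both directions the curve starts at 0 (nothing below the threshold), rises to 1 while a single edge with its two endpoints is present, and returns to 0 once the full cycle has entered, matching the Euler characteristic of a circle.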
LG · Sep 30, 2022
Topological Singularity Detection at Multiple Scales · Julius von Rohrscheidt, Bastian Rieck
The manifold hypothesis, which assumes that data lies on or close to an unknown manifold of low intrinsic dimension, is a staple of modern machine learning research. However, recent work has shown that real-world data exhibits distinct non-manifold structures, i.e. singularities, that can lead to erroneous findings. Detecting such singularities is therefore crucial as a precursor to interpolation and inference tasks. We address this issue by developing a topological framework that (i) quantifies the local intrinsic dimension, and (ii) yields a Euclidicity score for assessing the 'manifoldness' of a point along multiple scales. Our approach identifies singularities of complex spaces, while also capturing singular structures and local geometric complexity in image data.
LG · Mar 8, 2023
Euler Characteristic Transform Based Topological Loss for Reconstructing 3D Images from Single 2D Slices · Kalyan Varma Nadimpalli, Amit Chattopadhyay, Bastian Rieck
The computer vision task of reconstructing 3D images, i.e., shapes, from their single 2D image slices is extremely challenging, more so in the regime of limited data. Deep learning models typically optimize geometric loss functions, which may lead to poor reconstructions as they ignore the structural properties of the shape. To tackle this, we propose a novel topological loss function based on the Euler Characteristic Transform. This loss can be used as an inductive bias to aid the optimization of any neural network toward better reconstructions in the regime of limited data. We show the effectiveness of the proposed loss function by incorporating it into SHAPR, a state-of-the-art shape reconstruction model, and test it on two benchmark datasets, viz., Red Blood Cells and Nuclei datasets. We also show a favourable property, namely injectivity, and discuss the stability of the topological loss function based on the Euler Characteristic Transform.
LG · Jun 8, 2022
Diffusion Curvature for Estimating Local Curvature in High Dimensional Data · Dhananjay Bhaskar, Kincaid MacDonald, Oluwadamilola Fasina et al.
We introduce a new intrinsic measure of local curvature on point-cloud data called diffusion curvature. Our measure uses the framework of diffusion maps, including the data diffusion operator, to structure point cloud data and define local curvature based on the laziness of a random walk starting at a point or region of the data. We show that this laziness directly relates to volume comparison results from Riemannian geometry. We then extend this scalar curvature notion to an entire quadratic form using neural network estimations based on the diffusion map of point-cloud data. We show applications of both estimations on toy data, single-cell data, and on estimating local Hessian matrices of neural network loss landscapes.
LG · Jun 16, 2022
All the World's a (Hyper)Graph: A Data Drama · Corinna Coupette, Jilles Vreeken, Bastian Rieck
We introduce Hyperbard, a dataset of diverse relational data representations derived from Shakespeare's plays. Our representations range from simple graphs capturing character co-occurrence in single scenes to hypergraphs encoding complex communication settings and character contributions as hyperedges with edge-specific node weights. By making multiple intuitive representations readily available for experimentation, we facilitate rigorous representation robustness checks in graph learning, graph mining, and network analysis, highlighting the advantages and drawbacks of specific representations. Leveraging the data released in Hyperbard, we demonstrate that many solutions to popular graph mining problems are highly dependent on the representation choice, thus calling current graph curation practices into question. As an homage to our data source, and asserting that science can also be art, we present all our points in the form of a play.
LG · Jun 16, 2022
On the Surprising Behaviour of node2vec · Celia Hacker, Bastian Rieck
Graph embedding techniques are a staple of modern graph learning research. When using embeddings for downstream tasks such as classification, information about their stability and robustness, i.e., their susceptibility to sources of noise, stochastic effects, or specific parameter choices, becomes increasingly important. As one of the most prominent graph embedding schemes, we focus on node2vec and analyse its embedding quality from multiple perspectives. Our findings indicate that embedding quality is unstable with respect to parameter choices, and we propose strategies to remedy this in practice.
SI · Aug 27, 2024
Characterizing Physician Referral Networks with Ricci Curvature · Jeremy Wayland, Russel J. Funk, Bastian Rieck
Identifying (a) systemic barriers to quality healthcare access and (b) key indicators of care efficacy in the United States remains a significant challenge. To improve our understanding of regional disparities in care delivery, we introduce a novel application of curvature, a geometrical-topological property of networks, to Physician Referral Networks. Our initial findings reveal that Forman-Ricci and Ollivier-Ricci curvature measures, which are known for their expressive power in characterizing network structure, offer promising indicators for detecting variations in healthcare efficacy while capturing a range of significant regional demographic features. We also present APPARENT, an open-source tool that leverages Ricci curvature and other network features to examine correlations between regional Physician Referral Networks structure, local census data, healthcare effectiveness, and patient outcomes.
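To give a concrete sense of the curvature measures named above, here is a sketch of the simplest combinatorial form of Forman-Ricci curvature on an unweighted, undirected simple graph, where the curvature of an edge (u, v) is 4 - deg(u) - deg(v). This is only the basic textbook variant; whether APPARENT uses this form, a weighted form, or one augmented with triangles is not stated in the abstract. The toy network and its node names are hypothetical.

```python
def forman_curvature(edges):
    """Forman-Ricci curvature 4 - deg(u) - deg(v) for every edge.

    Assumes an unweighted, undirected simple graph given as a list of
    distinct (u, v) pairs, and ignores triangles and other higher-order cells.
    """
    degree = {}
    for u, v in edges:
        degree[u] = degree.get(u, 0) + 1
        degree[v] = degree.get(v, 0) + 1
    return {(u, v): 4 - degree[u] - degree[v] for u, v in edges}


# Hypothetical toy referral network: one hub with a short chain attached.
edges = [("hub", "a"), ("hub", "b"), ("hub", "c"), ("c", "d")]
curvature = forman_curvature(edges)
# curvature == {("hub", "a"): 0, ("hub", "b"): 0, ("hub", "c"): -1, ("c", "d"): 1}
```

Edges between well-connected nodes receive more negative values, while edges at the periphery receive positive ones, which is the kind of structural signal the abstract refers to.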
LG · Sep 7, 2023
Filtration Surfaces for Dynamic Graph Classification · Franz Srambical, Bastian Rieck
Existing approaches for classifying dynamic graphs either lift graph kernels to the temporal domain, or use graph neural networks (GNNs). However, current baselines have scalability issues, cannot handle a changing node set, or do not take edge weight information into account. We propose filtration surfaces, a novel method that is scalable and flexible, to alleviate said restrictions. We experimentally validate the efficacy of our model and show that filtration surfaces outperform previous state-of-the-art baselines on datasets that rely on edge weight information. Our method does so while being either completely parameter-free or having at most one parameter, and yielding the lowest overall standard deviation among similarly scalable methods.
LG · Nov 27, 2023
Metric Space Magnitude for Evaluating the Diversity of Latent Representations · Katharina Limbeck, Rayna Andreeva, Rik Sarkar et al.
The magnitude of a metric space is a novel invariant that provides a measure of the 'effective size' of a space across multiple scales, while also capturing numerous geometrical properties, such as curvature, density, or entropy. We develop a family of magnitude-based measures of the intrinsic diversity of latent representations, formalising a novel notion of dissimilarity between magnitude functions of finite metric spaces. Our measures are provably stable under perturbations of the data, can be efficiently calculated, and enable a rigorous multi-scale characterisation and comparison of latent representations. We show their utility and superior performance across different domains and tasks, including (i) the automated estimation of diversity, (ii) the detection of mode collapse, and (iii) the evaluation of generative models for text, image, and graph data.
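The 'effective size' idea above can be made concrete with the standard definition of magnitude for a finite metric space: form the similarity matrix $Z_{ij} = e^{-t\,d(x_i, x_j)}$ at scale $t$, solve $Zw = \mathbf{1}$, and sum the weights $w$. The sketch below implements this for points in Euclidean space with a small hand-written linear solver. The function names and the example points are assumptions of this sketch, and it illustrates only the magnitude function, not the diversity measures developed in the paper.

```python
import math


def solve(matrix, rhs):
    """Solve a dense linear system by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    aug = [row[:] + [b] for row, b in zip(matrix, rhs)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    x = [0.0] * n
    for r in reversed(range(n)):
        tail = sum(aug[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (aug[r][n] - tail) / aug[r][r]
    return x


def magnitude(points, scale):
    """Magnitude of a finite Euclidean point set at the given scale."""
    similarity = [
        [math.exp(-scale * math.dist(p, q)) for q in points] for p in points
    ]
    weights = solve(similarity, [1.0] * len(points))
    return sum(weights)


# Three points in the plane, evaluated at a small, medium, and large scale.
points = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0)]
magnitude_function = [magnitude(points, t) for t in (0.01, 1.0, 100.0)]
```

At a very small scale the three points are nearly indistinguishable and the magnitude is close to 1; at a very large scale they are fully separated and the magnitude approaches the number of points, 3. For two points at distance $d$, the same computation reduces to the closed form $2 / (1 + e^{-td})$.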
LG · Jun 1, 2023
Evaluating the "Learning on Graphs" Conference Experience · Bastian Rieck, Corinna Coupette
With machine learning conferences growing ever larger, and reviewing processes becoming increasingly elaborate, more data-driven insights into their workings are required. In this report, we present the results of a survey accompanying the first "Learning on Graphs" (LoG) Conference. The survey was directed to evaluate the submission and review process from different perspectives, including authors, reviewers, and area chairs alike.
LG · Sep 12, 2024
CliquePH: Higher-Order Information for Graph Neural Networks through Persistent Homology on Clique Graphs · Davide Buffelli, Farzin Soleymani, Bastian Rieck
Graph neural networks have become the default choice by practitioners for graph learning tasks such as graph classification and node classification. Nevertheless, popular graph neural network models still struggle to capture higher-order information, i.e., information that goes \emph{beyond} pairwise interactions. Recent work has shown that persistent homology, a tool from topological data analysis, can enrich graph neural networks with topological information that they otherwise could not capture. Calculating such features is efficient for dimension 0 (connected components) and dimension 1 (cycles). However, when it comes to higher-order structures, it does not scale well, with a complexity of $O(n^d)$, where $n$ is the number of nodes and $d$ is the order of the structures. In this work, we introduce a novel method that extracts information about higher-order structures in the graph while still using the efficient low-dimensional persistent homology algorithm. On standard benchmark datasets, we show that our method can lead to up to $31\%$ improvements in test accuracy.
AT · Aug 21, 2024
Persistent Homology via Ellipsoids · Niklas Canova, Sara Kališnik, Aaron Moser et al.
Persistent homology is one of the most popular methods in topological data analysis. An initial step in its use involves constructing a nested sequence of simplicial complexes. There is an abundance of different complexes to choose from, with Čech, Rips, alpha, and witness complexes being popular choices. In this manuscript, we build a novel type of geometrically informed simplicial complex, called a Rips-type ellipsoid complex. This complex is based on the idea that ellipsoids aligned with tangent directions better approximate the data compared to conventional (Euclidean) balls centered at sample points, as used in the construction of Rips and alpha complexes. We use Principal Component Analysis to estimate tangent spaces directly from samples and present an algorithm for computing Rips-type ellipsoid barcodes, i.e., topological descriptors based on Rips-type ellipsoid complexes. Additionally, we show that the ellipsoid barcodes depend continuously on the input data so that small perturbations of a k-generic point cloud lead to proportionally small changes in the resulting ellipsoid barcodes. This provides a theoretical guarantee analogous to, if somewhat weaker than, the classical stability results for Rips and Čech filtrations. We also conduct extensive experiments and compare Rips-type ellipsoid barcodes with standard Rips barcodes. Our findings indicate that Rips-type ellipsoid complexes are particularly effective for estimating the homology of manifolds and spaces with bottlenecks from samples. In particular, the persistence intervals corresponding to ground-truth topological features are longer compared to those obtained using the Rips complex of the data. Furthermore, Rips-type ellipsoid barcodes lead to better classification results in sparsely sampled point clouds. Finally, we demonstrate that Rips-type ellipsoid barcodes outperform Rips barcodes in classification tasks.
LG · Nov 4, 2025
Homomorphism distortion: A metric to distinguish them all and in the latent space bind them · Martin Carrasco, Olga Zaghen, Erik Bekkers et al.
For far too long, the expressivity of graph neural networks has been measured \emph{only} in terms of combinatorial properties. In this work, we stray from this tradition and provide a principled way to measure similarity between vertex-attributed graphs. We denote this measure as the \emph{graph homomorphism distortion}. We show it can \emph{completely characterize} graphs and thus is also a \emph{complete graph embedding}. However, somewhere along the road, we run into the graph canonization problem. To circumvent this obstacle, we devise a way to compute this measure efficiently via sampling, which ensures \emph{completeness} in expectation. Additionally, we show that a metric can be obtained from this measure. We validate our claims empirically and find that the \emph{graph homomorphism distortion}: (1) fully distinguishes the \texttt{BREC} dataset with up to $4$-WL non-distinguishable graphs, and (2) \emph{outperforms} previous methods inspired by homomorphisms on the \texttt{ZINC-12k} dataset. These theoretical results, and their empirical validation, pave the way for future characterization of graphs, extending the graph theoretic tradition to new frontiers.
LG · Jul 26, 2023
Topological Inductive Bias fosters Multiple Instance Learning in Data-Scarce Scenarios · Salome Kazeminia, Carsten Marr, Bastian Rieck
Multiple instance learning (MIL) is a framework for weakly supervised classification, where labels are assigned to sets of instances, i.e., bags, rather than to individual data points. This paradigm has proven effective in tasks where fine-grained annotations are unavailable or costly to obtain. However, the effectiveness of MIL drops sharply when training data are scarce, such as for rare disease classification. To address this challenge, we propose incorporating topological inductive biases into the data representation space within the MIL framework. This bias introduces a topology-preserving constraint that encourages the instance encoder to maintain the topological structure of the instance distribution within each bag when mapping them to MIL latent space. As a result, our Topology Guided MIL (TG-MIL) method enhances the performance and generalizability of MIL classifiers across different aggregation functions, especially under scarce-data regimes. Our evaluations show average performance improvements of 15.3% for synthetic MIL datasets, 2.8% for MIL benchmarks, and 5.5% for rare anemia classification compared to current state-of-the-art MIL models, where only 17-120 samples per class are available. We make our code publicly available.
LG · Feb 16, 2022
On Measuring Excess Capacity in Neural Networks · Florian Graf, Sebastian Zeng, Bastian Rieck et al.
We study the excess capacity of deep networks in the context of supervised classification. That is, given a capacity measure of the underlying hypothesis class - in our case, empirical Rademacher complexity - to what extent can we (a priori) constrain this class while retaining an empirical error on a par with the unconstrained regime? To assess excess capacity in modern architectures (such as residual networks), we extend and unify prior Rademacher complexity bounds to accommodate function composition and addition, as well as the structure of convolutions. The capacity-driving terms in our bounds are the Lipschitz constants of the layers and a (2, 1) group norm distance to the initializations of the convolution weights. Experiments on benchmark datasets of varying task difficulty indicate that (1) there is a substantial amount of excess capacity per task, and (2) capacity can be kept at a surprisingly similar level across tasks. Overall, this suggests a notion of compressibility with respect to weight norms, complementary to classic compression via weight pruning. Source code is available at https://github.com/rkwitt/excess_capacity.
LG · May 8
Have Graph -- Will Lift? The Case for Higher-Order Benchmarks · Bastian Rieck
After a somewhat rocky start, geometry and topology have established a foothold in machine learning. Message passing, either on graphs or higher-order complexes, is one of the main drivers of geometric deep learning, and paradigms that were once considered to be firmly in the realm of the abstract, like sheaves, have been "tamed" to serve as novel inductive biases for model architectures in topological deep learning. The veritable diversity of models, however, is in stark contrast to the scarcity of suitable benchmark datasets. As a result, researchers often resort to lifting existing graph datasets to include higher-order information. In this opinion paper, I want to encourage the community to also source new datasets, which may be used to prop up the foundations of our research field.
LG · Feb 14, 2024
Position: Topological Deep Learning is the New Frontier for Relational Learning · Theodore Papamarkou, Tolga Birdal, Michael Bronstein et al.
Topological deep learning (TDL) is a rapidly evolving field that uses topological features to understand and design deep learning models. This paper posits that TDL is the new frontier for relational learning. TDL may complement graph representation learning and geometric deep learning by incorporating topological concepts, and can thus provide a natural choice for various machine learning settings. To this end, this paper discusses open problems in TDL, ranging from practical benefits to theoretical foundations. For each problem, it outlines potential solutions and future research opportunities. At the same time, this paper serves as an invitation to the scientific community to actively participate in TDL research to unlock the potential of this emerging field.
LG · May 7
No Triangulation Without Representation: Generalization in Topological Deep Learning · Johannes S. Schmidt, Martin Carrasco, Ernst Röell et al.
Despite an ever-increasing interest in topological deep learning models that target higher-order datasets, there is no consensus on how to evaluate such models. This is exacerbated by the fact that topological objects permit operations, such as structural refinements, that are not appropriate for graph data. In this work, we extend MANTRA, a benchmark dataset containing manifold triangulations, to a larger class of manifolds with more diverse homeomorphism types. We show that, unlike prior claims, both graph neural networks (GNNs) and higher-order message passing (HOMP) methods can saturate the benchmark. However, we find that this is contingent on the right representation and feature assignment, emphasizing their importance in baseline models. We thus provide a novel evaluation protocol based on representational diversity and triangulation refinement. Surprisingly, we find no indication that existing models are capable of generalizing beyond the combinatorial structure of the data. This points towards a research gap in developing models that understand topological structure independent of scale. Our work thus provides the necessary scaffolding to evaluate future models and enable the development of topology-aware inductive biases.
LG · May 7
Diversity Curves for Graph Representation Learning · Katharina Limbeck, Nadja Häusermann, Martin Carrasco et al.
Graph-level representations are crucial tools for characterising structural differences between graphs. However, comparing graphs with different cardinalities, even when sampled from the same underlying distribution, remains challenging. Unsupervised tasks in particular require interpretable, scalable, and reliable size-aware graph representations. Our work addresses these issues by tracking the structural diversity of a graph across coarsening levels. The resulting graph embeddings, which we denote diversity curves, are interpretable by construction, efficient, and directly comparable across coarsening hierarchies. Specifically, we track the spread of graphs, a novel isometry invariant that is inherently well-suited for encoding the metric diversity and geometry of graphs. We utilise edge contraction coarsening and prove that this improves expressivity, thus leading to more powerful graph-level representations than structural descriptors alone. Demonstrating their utility over a range of baseline methods in practice, we use diversity curves to (i) cluster and visualise simulated graphs across varying sizes, (ii) distinguish the geometry of single-cell graphs, (iii) compare the structure of molecular graph datasets, and (iv) characterise geometric shapes.
LG · May 7
Geometry-Aware Simplicial Message Passing · Elena Xinyi Wang, Bastian Rieck
The Weisfeiler--Lehman (WL) test and its simplicial extension (SWL) characterize the combinatorial expressivity of message passing networks, but they are blind to geometry, i.e., meshes with identical connectivity but different embeddings are indistinguishable. We introduce the Geometric Simplicial Weisfeiler--Lehman (GSWL) test, which incorporates vertex coordinates into color refinement for geometric simplicial complexes. In addition, we show that (i) the expressivity of geometry-aware simplicial message passing schemes is bounded above by GSWL, and (ii) that there exist parameters such that the discriminating power of GSWL is matched by these schemes on any fixed finite family of geometric simplicial complexes. Combined with the Euler Characteristic Transform (ECT), a complete invariant for geometric simplicial complexes, this yields a geometric expressivity characterization together with an approximation framework. Experiments on synthetic and mesh datasets serve to validate our theory, showing a clear hierarchy from combinatorial to geometry-aware models.
LG · May 7
Invariant-Based Diagnostics for Graph Benchmarks
Richard von Moos, Mathieu Alain, Bastian Rieck
Progress on graph foundation models is hindered by benchmark practices that conflate the contributions of node features and graph structure, making it hard to tell whether a model actually learns from connectivity, or whether it even needs to. We propose addressing this using graph invariants, i.e., permutation-invariant, task-agnostic structural descriptors that serve as a diagnostic framework for graph benchmarks. We show that (i) invariants are more expressive than standard GNNs, (ii) invariants characterize structural heterogeneity within and across benchmark datasets, (iii) invariants predict multi-task performance, and (iv) simple invariant-based models are competitive with, and sometimes exceed, transformer and message-passing baselines across 26 datasets. Our results suggest that expressivity is not the main driver of predictive performance, and that on tasks where structure matters, a non-trainable structural proxy often matches trained message-passing models. We thus posit that invariant baselines should become a standard for evaluating whether structure is required for a task and whether a model picks up on it, serving as a stepping stone towards graph foundation models.
LG · Dec 13, 2023
Simplicial Representation Learning with Neural $k$-Forms
Kelly Maggs, Celia Hacker, Bastian Rieck
Geometric deep learning extends deep learning to incorporate information about the geometry and topology of data, especially in complex domains like graphs. Despite the popularity of message passing in this field, it has limitations such as the need for graph rewiring, ambiguity in interpreting data, and over-smoothing. In this paper, we take a different approach, focusing on leveraging geometric information from simplicial complexes embedded in $\mathbb{R}^n$ using node coordinates. We use differential $k$-forms in $\mathbb{R}^n$ to create representations of simplices, offering interpretability and geometric consistency without message passing. This approach also enables us to apply differential geometry tools and achieve universal approximation. Our method is efficient, versatile, and applicable to various input complexes, including graphs, simplicial complexes, and cell complexes. It outperforms existing message passing neural networks in harnessing information from geometrical graphs with node features serving as coordinates.
LG · Feb 2, 2024
Mapping the Multiverse of Latent Representations
Jeremy Wayland, Corinna Coupette, Bastian Rieck
Echoing recent calls to counter reliability and robustness concerns in machine learning via multiverse analysis, we present PRESTO, a principled framework for mapping the multiverse of machine-learning models that rely on latent representations. Although such models enjoy widespread adoption, the variability in their embeddings remains poorly understood, resulting in unnecessary complexity and untrustworthy representations. Our framework uses persistent homology to characterize the latent spaces arising from different combinations of diverse machine-learning methods, (hyper)parameter configurations, and datasets, allowing us to measure their pairwise (dis)similarity and statistically reason about their distributions. As we demonstrate both theoretically and empirically, our pipeline preserves desirable properties of collections of latent representations, and it can be leveraged to perform sensitivity analysis, detect anomalous embeddings, or efficiently and effectively navigate hyperparameter search spaces.
LG · Feb 4, 2025
No Metric to Rule Them All: Toward Principled Evaluations of Graph-Learning Datasets
Corinna Coupette, Jeremy Wayland, Emily Simons et al.
Benchmark datasets have proved pivotal to the success of graph learning, and good benchmark datasets are crucial to guide the development of the field. Recent research has highlighted problems with graph-learning datasets and benchmarking practices -- revealing, for example, that methods which ignore the graph structure can outperform graph-based approaches. Such findings raise two questions: (1) What makes a good graph-learning dataset, and (2) how can we evaluate dataset quality in graph learning? Our work addresses these questions. As the classic evaluation setup uses datasets to evaluate models, it does not apply to dataset evaluation. Hence, we start from first principles. Observing that graph-learning datasets uniquely combine two modes -- graph structure and node features --, we introduce Rings, a flexible and extensible mode-perturbation framework to assess the quality of graph-learning datasets based on dataset ablations -- i.e., quantifying differences between the original dataset and its perturbed representations. Within this framework, we propose two measures -- performance separability and mode complementarity -- as evaluation tools, each assessing the capacity of a graph dataset to benchmark the power and efficacy of graph-learning methods from a distinct angle. We demonstrate the utility of our framework for dataset evaluation via extensive experiments on graph-level tasks and derive actionable recommendations for improving the evaluation of graph-learning methods. Our work opens new research directions in data-centric graph learning, and it constitutes a step toward the systematic evaluation of evaluations.
LG · Oct 14, 2024
Graph Classification Gaussian Processes via Hodgelet Spectral Features
Mathieu Alain, So Takao, Xiaowen Dong et al.
The problem of classifying graphs is ubiquitous in machine learning. While it is standard to apply graph neural networks or graph kernel methods, Gaussian processes can be employed by transforming spatial features from the graph domain into spectral features in the Euclidean domain, and using them as the input points of classical kernels. However, this approach currently only takes into account features on vertices, whereas some graph datasets also support features on edges. In this work, we present a Gaussian process-based classification algorithm that can leverage vertex features, edge features, or both. Furthermore, we take advantage of the Hodge decomposition to better capture the intricate richness of vertex and edge features, which can be beneficial on diverse tasks.
LG · Aug 21, 2025
Low-dimensional embeddings of high-dimensional data
Cyril de Bodt, Alex Diaz-Papkovich, Michael Bleher et al.
Large collections of high-dimensional data have become nearly ubiquitous across many academic fields and application domains, ranging from biology to the humanities. Since working directly with high-dimensional data poses challenges, the demand for algorithms that create low-dimensional representations, or embeddings, for data visualization, exploration, and analysis is now greater than ever. In recent years, numerous embedding algorithms have been developed, and their usage has become widespread in research and industry. This surge of interest has resulted in a large and fragmented research field that faces technical challenges alongside fundamental debates, and it has left practitioners without clear guidance on how to effectively employ existing methods. Aiming to increase coherence and facilitate future work, in this review we provide a detailed and critical overview of recent developments, derive a list of best practices for creating and using low-dimensional embeddings, evaluate popular approaches on a variety of datasets, and discuss the remaining challenges and open problems in the field.
LG · Jul 4, 2025
Molecular Machine Learning Using Euler Characteristic Transforms
Victor Toscano-Duran, Florian Rottach, Bastian Rieck
The shape of a molecule determines its physicochemical and biological properties. However, it is often underrepresented in standard molecular representation learning approaches. Here, we propose using the Euler Characteristic Transform (ECT) as a geometrical-topological descriptor. Computed directly on a molecular graph derived from handcrafted atomic features, the ECT enables the extraction of multiscale structural features, offering a novel way to represent and encode molecular shape in the feature space. We assess the predictive performance of this representation across nine benchmark regression datasets, all centered around predicting the inhibition constant $K_i$. In addition, we compare our proposed ECT-based representation against traditional molecular representations and methods, such as molecular fingerprints/descriptors and graph neural networks (GNNs). Our results show that our ECT-based representation achieves competitive performance, ranking among the best-performing methods on several datasets. More importantly, its combination with traditional representations, particularly with the AVALON fingerprint, significantly enhances predictive performance, outperforming other methods on most datasets. These findings highlight the complementary value of multiscale topological information and its potential for being combined with established techniques. Our study suggests that hybrid approaches incorporating explicit shape information can lead to more informative and robust molecular representations, enhancing and opening new avenues in molecular machine learning tasks. To support reproducibility and foster open biomedical research, we provide open access to all experiments and code used in this work.
LG · Feb 6, 2025
Principal Curvatures Estimation with Applications to Single Cell Data
Yanlei Zhang, Lydia Mezrag, Xingzhi Sun et al.
The rapidly growing field of single-cell transcriptomic sequencing (scRNAseq) presents challenges for data analysis due to its massive datasets. A common approach in manifold learning is to hypothesize that datasets lie on a lower-dimensional manifold. This allows one to study the geometry of point clouds by extracting meaningful descriptors like curvature. In this work, we present Adaptive Local PCA (AdaL-PCA), a data-driven method for accurately estimating various notions of intrinsic curvature on data manifolds, in particular principal curvatures for surfaces. The model relies on local PCA to estimate the tangent spaces. The evaluation of AdaL-PCA on sampled surfaces shows state-of-the-art results. Combined with a PHATE embedding, the model applied to single-cell RNA sequencing data allows us to identify key variations in the cellular differentiation.
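The abstract above states that tangent spaces are estimated with local PCA. The following is a minimal sketch of plain local PCA only, not of AdaL-PCA itself: the function name `local_tangent_basis`, the fixed neighborhood size `k`, and the Euclidean nearest-neighbor search are illustrative assumptions, and the adaptive neighborhood selection and curvature estimation of the paper are not reproduced.

```python
import numpy as np

def local_tangent_basis(points, index, k=10, dim=2):
    """Estimate tangent and normal directions at points[index] via local PCA.

    Hypothetical helper: uses a fixed number k of Euclidean nearest
    neighbors, which is an assumption and not the adaptive scheme of AdaL-PCA.
    """
    center = points[index]
    dists = np.linalg.norm(points - center, axis=1)
    # The k + 1 closest points include the query point itself.
    neighbors = points[np.argsort(dists)[: k + 1]]
    centered = neighbors - neighbors.mean(axis=0)
    # Rows of vt are principal directions, ordered by decreasing variance.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    return vt[:dim], vt[dim:]

# Points sampled from the plane z = 0 in R^3: the estimated normal
# should be parallel to the z-axis.
rng = np.random.default_rng(0)
xy = rng.uniform(-1.0, 1.0, size=(200, 2))
plane = np.column_stack([xy, np.zeros(200)])
tangent, normal = local_tangent_basis(plane, index=0)
```

On a curved surface, the same construction yields an approximate tangent plane whose quality depends on the neighborhood size, which is presumably why an adaptive choice matters.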
LG · Oct 23, 2024
Topology meets Machine Learning: An Introduction using the Euler Characteristic Transform
Bastian Rieck
This overview article makes the case for how topological concepts can enrich research in machine learning. Using the Euler Characteristic Transform (ECT), a geometrical-topological invariant, as a running example, I present different use cases that result in more efficient models for analyzing point clouds, graphs, and meshes. Moreover, I outline a vision for how topological concepts could be used in the future, comprising (1) the learning of functions on topological spaces, (2) the building of hybrid models that imbue neural networks with knowledge about the topological information in data, and (3) the analysis of qualitative properties of neural networks. With current research already addressing some of these aspects, this article thus serves as an introduction and invitation to this nascent area of research.
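To make the running example concrete, here is a minimal sketch of a discretized ECT for a graph with vertex coordinates. For each direction and threshold, it computes the Euler characteristic (vertices minus edges) of the subgraph lying below the threshold along that direction. The function name `ect_graph` and the choice of discretization are assumptions for illustration; this is not the article's reference implementation.

```python
import numpy as np

def ect_graph(coords, edges, directions, thresholds):
    """Discretized Euler Characteristic Transform of an embedded graph.

    coords:     (n_vertices, d) vertex coordinates
    edges:      (n_edges, 2) integer vertex indices
    directions: (n_directions, d) unit vectors
    thresholds: (n_thresholds,) filtration values
    Returns an (n_directions, n_thresholds) integer array.
    """
    # Height of every vertex along every direction.
    heights = coords @ directions.T
    # An edge appears once both of its endpoints have appeared.
    edge_heights = np.maximum(heights[edges[:, 0]], heights[edges[:, 1]])
    n_vertices = (heights[:, :, None] <= thresholds[None, None, :]).sum(axis=0)
    n_edges = (edge_heights[:, :, None] <= thresholds[None, None, :]).sum(axis=0)
    return n_vertices - n_edges

# A triangle (3 vertices, 3 edges) scanned along the x-axis.
coords = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0]])
edges = np.array([[0, 1], [1, 2], [0, 2]])
directions = np.array([[1.0, 0.0]])
thresholds = np.array([-1.0, 0.0, 0.5, 1.0])
ect = ect_graph(coords, edges, directions, thresholds)
# ect[0] is [0, 1, 1, 0]: empty, then one edge with two vertices, then the full cycle.
```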
LG · Oct 1, 2025
LEAP: Local ECT-Based Learnable Positional Encodings for Graphs
Juan Amboage, Ernst Röell, Patrick Schnider et al.
Graph neural networks (GNNs) largely rely on the message-passing paradigm, where nodes iteratively aggregate information from their neighbors. Yet, standard message passing neural networks (MPNNs) face well-documented theoretical and practical limitations. Graph positional encoding (PE) has emerged as a promising direction to address these limitations. The Euler Characteristic Transform (ECT) is an efficiently computable geometric-topological invariant that characterizes shapes and graphs. In this work, we combine the differentiable approximation of the ECT (DECT) and its local variant ($\ell$-ECT) to propose LEAP, a new end-to-end trainable local structural PE for graphs. We evaluate our approach on multiple real-world datasets as well as on a synthetic task designed to test its ability to extract topological features. Our results underline the potential of LEAP-based encodings as a powerful component for graph representation learning pipelines.
LG · Sep 3, 2025
EmbedOR: Provable Cluster-Preserving Visualizations with Curvature-Based Stochastic Neighbor Embeddings
Tristan Luca Saidi, Abigail Hickok, Bastian Rieck et al.
Stochastic Neighbor Embedding (SNE) algorithms like UMAP and tSNE often produce visualizations that do not preserve the geometry of noisy and high dimensional data. In particular, they can spuriously separate connected components of the underlying data submanifold and can fail to find clusters in well-clusterable data. To address these limitations, we propose EmbedOR, a SNE algorithm that incorporates discrete graph curvature. Our algorithm stochastically embeds the data using a curvature-enhanced distance metric that emphasizes underlying cluster structure. Critically, we prove that the EmbedOR distance metric extends consistency results for tSNE to a much broader class of datasets. We also describe extensive experiments on synthetic and real data that demonstrate the visualization and geometry-preservation capabilities of EmbedOR. We find that, unlike other SNE algorithms and UMAP, EmbedOR is much less likely to fragment continuous, high-density regions of the data. Finally, we demonstrate that the EmbedOR distance metric can be used as a tool to annotate existing visualizations to identify fragmentation and provide deeper insight into the underlying geometry of the data.
LG · Jun 13, 2025
Geometry-Aware Edge Pooling for Graph Neural Networks
Katharina Limbeck, Lydia Mezrag, Guy Wolf et al.
Graph Neural Networks (GNNs) have shown significant success for graph-based tasks. Motivated by the prevalence of large datasets in real-world applications, pooling layers are crucial components of GNNs. By reducing the size of input graphs, pooling enables faster training and potentially better generalisation. However, existing pooling operations often optimise for the learning task at the expense of discarding fundamental graph structures, thus reducing interpretability. This leads to unreliable performance across dataset types, downstream tasks and pooling ratios. Addressing these concerns, we propose novel graph pooling layers for structure-aware pooling via edge collapses. Our methods leverage diffusion geometry and iteratively reduce a graph's size while preserving both its metric structure and its structural diversity. We guide pooling using magnitude, an isometry-invariant diversity measure, which permits us to control the fidelity of the pooling process. Further, we use the spread of a metric space as a faster and more stable alternative ensuring computational efficiency. Empirical results demonstrate that our methods (i) achieve top performance compared to alternative pooling layers across a range of diverse graph classification tasks, (ii) preserve key spectral properties of the input graphs, and (iii) retain high accuracy across varying pooling ratios.
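The spread of a metric space mentioned above has a simple closed form, which may explain why it is cheaper than magnitude: it needs only row sums of a similarity matrix, not a linear solve. The sketch below uses the standard definition of spread at scale `t`; the shortest-path metric on a path graph and the helper names are illustrative assumptions, whereas the abstract refers to diffusion geometry.

```python
import numpy as np

def spread(dist, t=1.0):
    """Spread of a finite metric space with distance matrix `dist` at scale t.

    Defined as sum_i 1 / sum_j exp(-t * d(x_i, x_j)).
    """
    similarity = np.exp(-t * dist)
    return float(np.sum(1.0 / similarity.sum(axis=1)))

def path_graph_distances(n):
    """Shortest-path distances on a path graph with n nodes (illustrative input)."""
    idx = np.arange(n)
    return np.abs(idx[:, None] - idx[None, :]).astype(float)

dist = path_graph_distances(4)
coarse = spread(dist, t=0.0)   # all points look identical: one effective point
fine = spread(dist, t=50.0)    # all points look distinct: close to four points
```

Read as an "effective number of points", the value moves from 1 to the cardinality of the space as the scale grows, which is the sense in which it measures structural diversity.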
LG · Nov 27, 2025
From Topology to Retrieval: Decoding Embedding Spaces with Unified Signatures
Florian Rottach, William Rudman, Bastian Rieck et al.
Studying how embeddings are organized in space not only enhances model interpretability but also uncovers factors that drive downstream task performance. In this paper, we present a comprehensive analysis of topological and geometric measures across a wide set of text embedding models and datasets. We find a high degree of redundancy among these measures and observe that individual metrics often fail to sufficiently differentiate embedding spaces. Building on these insights, we introduce Unified Topological Signatures (UTS), a holistic framework for characterizing embedding spaces. We show that UTS can predict model-specific properties and reveal similarities driven by model architecture. Further, we demonstrate the utility of our method by linking topological structure to ranking effectiveness and accurately predicting document retrievability. We find that a holistic, multi-attribute perspective is essential to understanding and leveraging the geometry of text embeddings.
CL · Jun 1, 2025
Less is More: Local Intrinsic Dimensions of Contextual Language Models
Benjamin Matthias Ruppik, Julius von Rohrscheidt, Carel van Niekerk et al.
Understanding the internal mechanisms of large language models (LLMs) remains a challenging and complex endeavor. Even fundamental questions, such as how fine-tuning affects model behavior, often require extensive empirical evaluation. In this paper, we introduce a novel perspective based on the geometric properties of contextual latent embeddings to study the effects of training and fine-tuning. To that end, we measure the local dimensions of a contextual language model's latent space and analyze their shifts during training and fine-tuning. We show that the local dimensions provide insights into the model's training dynamics and generalization ability. Specifically, the mean of the local dimensions predicts when the model's training capabilities are exhausted, as exemplified in a dialogue state tracking task, overfitting, as demonstrated in an emotion recognition task, and grokking, as illustrated with an arithmetic task. Furthermore, our experiments suggest a practical heuristic: reductions in the mean local dimension tend to accompany and predict subsequent performance gains. Through this exploration, we aim to provide practitioners with a deeper understanding of the implications of fine-tuning on embedding spaces, facilitating informed decisions when configuring models for specific applications. The results of this work contribute to the ongoing discourse on the interpretability, adaptability, and generalizability of LLMs by bridging the gap between intrinsic model mechanisms and geometric properties in the respective embeddings.
LG · Feb 14, 2024
The Manifold Density Function: An Intrinsic Method for the Validation of Manifold Learning
Benjamin Holmgren, Eli Quist, Jordan Schupbach et al.
We introduce the manifold density function, which is an intrinsic method to validate manifold learning techniques. Our approach adapts and extends Ripley's $K$-function, and categorizes in an unsupervised setting the extent to which an output of a manifold learning algorithm captures the structure of a latent manifold. Our manifold density function generalizes to broad classes of Riemannian manifolds. In particular, we extend the manifold density function to general two-manifolds using the Gauss-Bonnet theorem, and demonstrate that the manifold density function for hypersurfaces is well approximated using the first Laplacian eigenvalue. We prove desirable convergence and robustness properties.
CHEM-PH · May 30, 2023
MAGNet: Motif-Agnostic Generation of Molecules from Shapes
Leon Hetzel, Johanna Sommer, Bastian Rieck et al.
Recent advances in machine learning for molecules exhibit great potential for facilitating drug discovery from in silico predictions. Most models for molecule generation rely on the decomposition of molecules into frequently occurring substructures (motifs), from which they generate novel compounds. While motif representations greatly aid in learning molecular distributions, such methods struggle to represent substructures beyond their known motif set. To alleviate this issue and increase flexibility across datasets, we propose MAGNet, a graph-based model that generates abstract shapes before allocating atom and bond types. To this end, we introduce a novel factorisation of the molecules' data distribution that accounts for the molecules' global context and facilitates learning adequate assignments of atoms and bonds onto shapes. Despite the added complexity of shape abstractions, MAGNet outperforms most other graph-based approaches on standard benchmarks. Importantly, we demonstrate that MAGNet's improved expressivity leads to molecules with more topologically distinct structures and, at the same time, diverse atom and bond assignments.
CG · May 10, 2023
NervePool: A Simplicial Pooling Layer
Sarah McGuire Scullen, Ernst Röell, Elizabeth Munch et al.
For deep learning problems on graph-structured data, pooling layers are important for downsampling, reducing computational cost, and minimizing overfitting. We define a pooling layer, NervePool, for data structured as simplicial complexes, which are generalizations of graphs that include higher-dimensional simplices beyond vertices and edges; this structure allows for greater flexibility in modeling higher-order relationships. The proposed simplicial coarsening scheme is built upon partitions of vertices, which allow us to generate hierarchical representations of simplicial complexes, collapsing information in a learned fashion. NervePool builds on the learned vertex cluster assignments and extends to coarsening of higher-dimensional simplices in a deterministic fashion. While in practice the pooling operations are computed via a series of matrix operations, the topological motivation is a set-theoretic construction based on unions of stars of simplices and the nerve complex.
LG · May 9, 2023
Metric Space Magnitude and Generalisation in Neural Networks
Rayna Andreeva, Katharina Limbeck, Bastian Rieck et al.
Deep learning models have seen significant successes in numerous applications, but their inner workings remain elusive. The purpose of this work is to quantify the learning process of deep neural networks through the lens of a novel topological invariant called magnitude. Magnitude is an isometry invariant; its properties are an active area of research as it encodes many known invariants of a metric space. We use magnitude to study the internal representations of neural networks and propose a new method for determining their generalisation capabilities. Moreover, we theoretically connect magnitude dimension and the generalisation error, and demonstrate experimentally that the proposed framework can be a good indicator of the latter.
LG · Dec 18, 2021
Weisfeiler and Leman go Machine Learning: The Story so far
Christopher Morris, Yaron Lipman, Haggai Maron et al.
In recent years, algorithms and neural architectures based on the Weisfeiler--Leman algorithm, a well-known heuristic for the graph isomorphism problem, have emerged as a powerful tool for machine learning with graphs and relational data. Here, we give a comprehensive overview of the algorithm's use in a machine-learning setting, focusing on the supervised regime. We discuss the theoretical background, show how to use it for supervised graph and node representation learning, discuss recent extensions, and outline the algorithm's connection to (permutation-)equivariant neural architectures. Moreover, we give an overview of current applications and future directions to stimulate further research.
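For readers unfamiliar with the algorithm surveyed above, the following is a minimal sketch of 1-dimensional Weisfeiler-Leman color refinement on unlabeled graphs. The function name `wl_color_histogram` and the adjacency-dictionary input are illustrative assumptions. Colors are kept as nested tuples rather than compressed to integers, which keeps them comparable across graphs but is only practical for small examples.

```python
from collections import Counter

def wl_color_histogram(adjacency, rounds=3):
    """Run 1-WL color refinement and return the histogram of final colors.

    adjacency: dict mapping each node to a list of its neighbors.
    """
    colors = {node: 0 for node in adjacency}  # uniform initial coloring
    for _ in range(rounds):
        # New color = own color plus the sorted multiset of neighbor colors.
        colors = {
            node: (colors[node], tuple(sorted(colors[n] for n in neighbors)))
            for node, neighbors in adjacency.items()
        }
    return Counter(colors.values())

# A 6-cycle and two disjoint triangles are both 2-regular, so 1-WL
# assigns them identical color histograms although they are not isomorphic.
six_cycle = {i: [(i - 1) % 6, (i + 1) % 6] for i in range(6)}
two_triangles = {0: [1, 2], 1: [0, 2], 2: [0, 1], 3: [4, 5], 4: [3, 5], 5: [3, 4]}

# A path and a star on four nodes differ in their degree sequences,
# which the first refinement round already detects.
path = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2]}
star = {0: [1, 2, 3], 1: [0], 2: [0], 3: [0]}
```

Differing histograms certify non-isomorphism, while equal histograms are inconclusive, as the cycle and triangle pair shows.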
IV · Nov 15, 2021
Interpretability Aware Model Training to Improve Robustness against Out-of-Distribution Magnetic Resonance Images in Alzheimer's Disease Classification
Merel Kuijs, Catherine R. Jutzeler, Bastian Rieck et al.
Owing to its pristine soft-tissue contrast and high resolution, structural magnetic resonance imaging (MRI) is widely applied in neurology, making it a valuable data source for image-based machine learning (ML) and deep learning applications. The physical nature of MRI acquisition and reconstruction, however, causes variations in image intensity, resolution, and signal-to-noise ratio. Since ML models are sensitive to such variations, performance on out-of-distribution data, which is inherent to the setting of a deployed healthcare ML application, typically drops below acceptable levels. We propose an interpretability aware adversarial training regime to improve robustness against out-of-distribution samples originating from different MRI hardware. The approach is applied to 1.5T and 3T MRIs obtained from the Alzheimer's Disease Neuroimaging Initiative database. We present preliminary results showing promising performance on out-of-distribution samples.
LG · Oct 28, 2021
The magnitude vector of images
Michael F. Adamer, Edward De Brouwer, Leslie O'Bray et al.
The magnitude of a finite metric space has recently emerged as a novel invariant quantity that measures the effective size of a metric space. Despite encouraging first results demonstrating the descriptive abilities of the magnitude, such as being able to detect the boundary of a metric space, the potential use cases of magnitude remain under-explored. In this work, we investigate the properties of the magnitude on images, an important data modality in many machine learning applications. By endowing each individual image with its own metric space, we are able to define the concept of magnitude on images and analyse the individual contribution of each pixel with the magnitude vector. In particular, we theoretically show that the previously known properties of boundary detection translate to edge detection abilities in images. Furthermore, we demonstrate practical use cases of magnitude for machine learning applications and propose a novel magnitude model that consists of a computationally efficient magnitude computation and a learnable metric. By doing so, we address the computational hurdle that used to make magnitude impractical for many applications and open the way for the adoption of magnitude in machine learning research.
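As a small illustration of the magnitude vector and its boundary-detection behavior, the sketch below uses the standard definition for a finite metric space: with similarity matrix Z_ij = exp(-t d_ij), the magnitude vector w solves Z w = 1, and the magnitude is the sum of its entries. The direct linear solve and the function names are assumptions for illustration; this is not the efficient computation or learnable metric proposed in the paper.

```python
import numpy as np

def magnitude_vector(dist, t=1.0):
    """Weight of each point: the solution w of Z w = 1, with Z = exp(-t * dist)."""
    similarity = np.exp(-t * dist)
    return np.linalg.solve(similarity, np.ones(len(dist)))

def magnitude(dist, t=1.0):
    """Magnitude of the space: the sum of the magnitude vector."""
    return float(magnitude_vector(dist, t).sum())

# Five equally spaced points on a line.
idx = np.arange(5)
line = np.abs(idx[:, None] - idx[None, :]).astype(float)
weights = magnitude_vector(line)
# The two endpoints receive larger weights than the interior points,
# a one-dimensional instance of boundary detection.

# Two points at distance 1 have magnitude 2 / (1 + exp(-1)).
two_points = np.array([[0.0, 1.0], [1.0, 0.0]])
```

The dense solve costs cubic time in the number of points, which is the computational hurdle the abstract refers to.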