h-index57
35papers
3,380citations
Novelty49%
AI Score56

35 Papers

LOMar 8, 2023
The Descriptive Complexity of Graph Neural Networks

Martin Grohe

We analyse the power of graph neural networks (GNNs) in terms of Boolean circuit complexity and descriptive complexity. We prove that the graph queries that can be computed by a polynomial-size bounded-depth family of GNNs are exactly those definable in the guarded fragment GFO+C of first-order logic with counting and with built-in relations. This puts GNNs in the circuit complexity class (non-uniform) $\text{TC}^0$. Remarkably, the GNN families may use arbitrary real weights and a wide class of activation functions that includes the standard ReLU, logistic "sigmoid", and hyperbolic tangent functions. If the GNNs are allowed to use random initialisation and global readout (both standard features of GNNs widely used in practice), they can compute exactly the same queries as bounded depth Boolean circuits with threshold gates, that is, exactly the queries in $\text{TC}^0$. Moreover, we show that queries computable by a single GNN with piecewise linear activations and rational weights are definable in GFO+C without built-in relations. Therefore, they are contained in uniform $\text{TC}^0$.

AIAug 22, 2022
One Model, Any CSP: Graph Neural Networks as Fast Global Search Heuristics for Constraint Satisfaction

Jan Tönshoff, Berke Kisin, Jakob Lindner et al.

We propose a universal Graph Neural Network architecture which can be trained as an end-2-end search heuristic for any Constraint Satisfaction Problem (CSP). Our architecture can be trained unsupervised with policy gradient descent to generate problem specific heuristics for any CSP in a purely data driven manner. The approach is based on a novel graph representation for CSPs that is both generic and compact and enables us to process every possible CSP instance with one GNN, regardless of constraint arity, relations or domain size. Unlike previous RL-based methods, we operate on a global search action space and allow our GNN to modify any number of variables in every step of the stochastic search. This enables our method to properly leverage the inherent parallelism of GNNs. We perform a thorough empirical evaluation where we learn heuristics for well known and important CSPs from random data, including graph coloring, MaxCut, 3-SAT and MAX-k-SAT. Our approach outperforms prior approaches for neural combinatorial optimization by a substantial margin. It can compete with, and even improve upon, conventional search heuristics on test instances that are several orders of magnitude larger and structurally more complex than those seen during training.

LGJul 27, 2022
Physical Pooling Functions in Graph Neural Networks for Molecular Property Prediction

Artur M. Schweidtmann, Jan G. Rittig, Jana M. Weber et al.

Graph neural networks (GNNs) are emerging in chemical engineering for the end-to-end learning of physicochemical properties based on molecular graphs. A key element of GNNs is the pooling function which combines atom feature vectors into molecular fingerprints. Most previous works use a standard pooling function to predict a variety of properties. However, unsuitable pooling functions can lead to unphysical GNNs that poorly generalize. We compare and select meaningful GNN pooling methods based on physical knowledge about the learned properties. The impact of physical pooling functions is demonstrated with molecular properties calculated from quantum mechanical computations. We also compare our results to the recent set2set pooling approach. We recommend using sum pooling for the prediction of properties that depend on molecular size and compare pooling functions for properties that are molecular size-independent. Overall, we show that the use of physical pooling functions significantly enhances generalization.

LGJan 26, 2023
WL meet VC

Christopher Morris, Floris Geerts, Jan Tönshoff et al.

Recently, many works studied the expressive power of graph neural networks (GNNs) by linking it to the $1$-dimensional Weisfeiler--Leman algorithm ($1\text{-}\mathsf{WL}$). Here, the $1\text{-}\mathsf{WL}$ is a well-studied heuristic for the graph isomorphism problem, which iteratively colors or partitions a graph's vertex set. While this connection has led to significant advances in understanding and enhancing GNNs' expressive power, it does not provide insights into their generalization performance, i.e., their ability to make meaningful predictions beyond the training set. In this paper, we study GNNs' generalization ability through the lens of Vapnik--Chervonenkis (VC) dimension theory in two settings, focusing on graph-level predictions. First, when no upper bound on the graphs' order is known, we show that the bitlength of GNNs' weights tightly bounds their VC dimension. Further, we derive an upper bound for GNNs' VC dimension using the number of colors produced by the $1\text{-}\mathsf{WL}$. Secondly, when an upper bound on the graphs' order is known, we show a tight connection between the number of graphs distinguishable by the $1\text{-}\mathsf{WL}$ and GNNs' VC dimension. Our empirical study confirms the validity of our theoretical findings.

LGJun 1, 2022
Graph Machine Learning for Design of High-Octane Fuels

Jan G. Rittig, Martin Ritzert, Artur M. Schweidtmann et al.

Fuels with high-knock resistance enable modern spark-ignition engines to achieve high efficiency and thus low CO2 emissions. Identification of molecules with desired autoignition properties indicated by a high research octane number and a high octane sensitivity is therefore of great practical relevance and can be supported by computer-aided molecular design (CAMD). Recent developments in the field of graph machine learning (graph-ML) provide novel, promising tools for CAMD. We propose a modular graph-ML CAMD framework that integrates generative graph-ML models with graph neural networks and optimization, enabling the design of molecules with desired ignition properties in a continuous molecular space. In particular, we explore the potential of Bayesian optimization and genetic algorithms in combination with generative graph-ML models. The graph-ML CAMD framework successfully identifies well-established high-octane components. It also suggests new candidates, one of which we experimentally investigate and use to illustrate the need for further auto-ignition training data.

LGFeb 22, 2023
Some Might Say All You Need Is Sum

Eran Rosenbluth, Jan Toenshoff, Martin Grohe

The expressivity of Graph Neural Networks (GNNs) is dependent on the aggregation functions they employ. Theoretical works have pointed towards Sum aggregation GNNs subsuming every other GNNs, while certain practical works have observed a clear advantage to using Mean and Max. An examination of the theoretical guarantee identifies two caveats. First, it is size-restricted, that is, the power of every specific GNN is limited to graphs of a specific size. Successfully processing larger graphs may require an other GNN, and so on. Second, it concerns the power to distinguish non-isomorphic graphs, not the power to approximate general functions on graphs, and the former does not necessarily imply the latter. It is desired that a GNN's usability will not be limited to graphs of any specific size. Therefore, we explore the realm of unrestricted-size expressivity. We prove that basic functions, which can be computed exactly by Mean or Max GNNs, are inapproximable by any Sum GNN. We prove that under certain restrictions, every Mean or Max GNN can be approximated by a Sum GNN, but even there, a combination of (Sum, [Mean/Max]) is more expressive than Sum alone. Lastly, we prove further expressivity limitations for GNNs with a broad class of aggregations.

LGApr 14, 2022
Solving AC Power Flow with Graph Neural Networks under Realistic Constraints

Luis Böttcher, Hinrikus Wolf, Bastian Jung et al.

In this paper, we propose a graph neural network architecture to solve the AC power flow problem under realistic constraints. To ensure a safe and resilient operation of distribution grids, AC power flow calculations are the means of choice to determine grid operating limits or analyze grid asset utilization in planning procedures. In our approach, we demonstrate the development of a framework that uses graph neural networks to learn the physical constraints of the power flow. We present our model architecture on which we perform unsupervised training to learn a general solution of the AC power flow formulation independent of the specific topologies and supply tasks used for training. Finally, we demonstrate, validate and discuss our results on medium voltage benchmark grids. In our approach, we focus on the physical and topological properties of distribution grids to provide scalable solutions for real grid topologies. Therefore, we take a data-driven approach, using large and diverse data sets consisting of realistic grid topologies, for the unsupervised training of the AC power flow graph neural network architecture and compare the results to a prior neural architecture and the Newton-Raphson method. Our approach shows a high increase in computation time and good accuracy compared to state-of-the-art solvers. It also out-performs that neural solver for power flow in terms of accuracy.

LGAug 29, 2023
Structural Node Embeddings with Homomorphism Counts

Hinrikus Wolf, Luca Oeljeklaus, Pascal Kühner et al.

Graph homomorphism counts, first explored by Lovász in 1967, have recently garnered interest as a powerful tool in graph-based machine learning. Grohe (PODS 2020) proposed the theoretical foundations for using homomorphism counts in machine learning on graph level as well as node level tasks. By their very nature, these capture local structural information, which enables the creation of robust structural embeddings. While a first approach for graph level tasks has been made by Nguyen and Maehara (ICML 2020), we experimentally show the effectiveness of homomorphism count based node embeddings. Enriched with node labels, node weights, and edge weights, these offer an interpretable representation of graph data, allowing for enhanced explainability of machine learning models. We propose a theoretical framework for isomorphism-invariant homomorphism count based embeddings which lend themselves to a wide variety of downstream tasks. Our approach capitalises on the efficient computability of graph homomorphism counts for bounded treewidth graph classes, rendering it a practical solution for real-world applications. We demonstrate their expressivity through experiments on benchmark datasets. Although our results do not match the accuracy of state-of-the-art neural architectures, they are comparable to other advanced graph learning models. Remarkably, our approach demarcates itself by ensuring explainability for each individual feature. By integrating interpretable machine learning algorithms like SVMs or Random Forests, we establish a seamless, end-to-end explainable pipeline. Our study contributes to the advancement of graph-based techniques that offer both performance and interpretability.

AIAug 19, 2024
Query languages for neural networks

Martin Grohe, Christoph Standke, Juno Steegmans et al.

We lay the foundations for a database-inspired approach to interpreting and understanding neural network models by querying them using declarative languages. Towards this end we study different query languages, based on first-order logic, that mainly differ in their access to the neural network model. First-order logic over the reals naturally yields a language which views the network as a black box; only the input--output function defined by the network can be queried. This is essentially the approach of constraint query languages. On the other hand, a white-box language can be obtained by viewing the network as a weighted graph, and extending first-order logic with summation over weight terms. The latter approach is essentially an abstraction of SQL. In general, the two approaches are incomparable in expressive power, as we will show. Under natural circumstances, however, the white-box approach can subsume the black-box approach; this is our main result. We prove the result concretely for linear constraint queries over real functions definable by feedforward neural networks with a fixed number of hidden layers and piecewise linear activation functions.

LOJan 29
How Expressive Are Graph Neural Networks in the Presence of Node Identifiers?

Arie Soeteman, Michael Benedikt, Martin Grohe et al.

Graph neural networks (GNNs) are a widely used class of machine learning models for graph-structured data, based on local aggregation over neighbors. GNNs have close connections to logic. In particular, their expressive power is linked to that of modal logics and bounded-variable logics with counting. In many practical scenarios, graphs processed by GNNs have node features that act as unique identifiers. In this work, we study how such identifiers affect the expressive power of GNNs. We initiate a study of the key-invariant expressive power of GNNs, inspired by the notion of order-invariant definability in finite model theory: which node queries that depend only on the underlying graph structure can GNNs express on graphs with unique node identifiers? We provide answers for various classes of GNNs with local max- or sum-aggregation.

65.0LGApr 20
Tabular foundation models for in-context prediction of molecular properties

Karim K. Ben Hicham, Jan G. Rittig, Martin Grohe et al.

Accurate molecular property prediction is central to drug discovery, catalysis, and process design, yet real-world applications are often limited by small datasets. Molecular foundation models provide a promising direction by learning transferable molecular representations; however, they typically involve task-specific fine-tuning, require machine learning expertise, and often fail to outperform classical baselines. Tabular foundation models (TFMs) offer a fundamentally different paradigm: they perform predictions through in-context learning, enabling inference without task-specific training. Here, we evaluate TFMs in the low- to medium-data regime across both standardized pharmaceutical benchmarks and chemical engineering datasets. We evaluate both frozen molecular foundation model representations, as well as classical descriptors and fingerprints. Across the benchmarks, the approach shows excellent predictive performance while reducing computational cost, compared to fine-tuning, with these advantages also transferring to practical engineering data settings. In particular, combining TFMs with CheMeleon embeddings yields up to 100\% win rates on 30 MoleculeACE tasks, while compact RDKit2d and Mordred descriptors provide strong descriptor-based alternatives. Molecular representation emerges as a key determinant in TFM performance, with molecular foundation model embeddings and 2D descriptor sets both providing substantial gains over classic molecular fingerprints on many tasks. These results suggest that in-context learning with TFMs provides a highly accurate and cost-efficient alternative for property prediction in practical applications.

19.0DSApr 14
Robust Graph Isomorphism, Quadratic Assignment and VC Dimension

Anatole Dahan, Martin Grohe, Daniel Neuen et al.

We present an additive $\varepsilon n^{2}$-approximation algorithm for the Graph Edit Distance problem (GED) on graphs of VC dimension $d$ running in time $n^{O(d/\varepsilon^{2})}$. In particular, this recovers a previous result by Arora, Frieze, and Kaplan [Math. Program. 2002] who gave an $\varepsilon n^{2}$-approximation running in time $n^{O(\log n/\varepsilon^{2})}$. Similar to the work of Arora et al., we extend our results to arbitrary Quadratic Assignment problems (QAPs) by introducing a notion of VC dimension for QAP instances, and giving an $\varepsilon n^{2}$-approximation for QAPs with bounded weights running in time $n^{O(\varepsilon^{-2}(d + \log\varepsilon^{-1}))}$. As a particularly interesting special case, we further study the problem $\varepsilon$-$\mathsf{GI}$, which entails determining if two graphs $G,H$ over $n$ vertices are isomorphic, when promised that if they are not, their graph edit distance is at least $\varepsilon n^{2}$. We show that the standard Weisfeiler--Leman algorithm of dimension $O(\varepsilon^{-1}d\log(\varepsilon^{-1}))$ solves this problem on graphs of VC dimension $d$. We also show that dimension $O(\varepsilon^{-1}\log n)$ suffices on arbitrary $n$-vertex graphs, while $k$-WL fails on instances at distance $Ω(n^{2}/k)$.

LGFeb 12
Amortized Molecular Optimization via Group Relative Policy Optimization

Muhammad bin Javaid, Hasham Hussain, Ashima Khanna et al.

Molecular design encompasses tasks ranging from de-novo design to structural alteration of given molecules or fragments. For the latter, state-of-the-art methods predominantly function as "Instance Optimizers'', expending significant compute restarting the search for every input structure. While model-based approaches theoretically offer amortized efficiency by learning a policy transferable to unseen structures, existing methods struggle to generalize. We identify a key failure mode: the high variance arising from the heterogeneous difficulty of distinct starting structures. To address this, we introduce GRXForm, adapting a pre-trained Graph Transformer model that optimizes molecules via sequential atom-and-bond additions. We employ Group Relative Policy Optimization (GRPO) for goal-directed fine-tuning to mitigate variance by normalizing rewards relative to the starting structure. Empirically, GRXForm generalizes to out-of-distribution molecular scaffolds without inference-time oracle calls or refinement, achieving scores in multi-objective optimization competitive with leading instance optimizers.

LGMay 20, 2024
Distinguished In Uniform: Self Attention Vs. Virtual Nodes

Eran Rosenbluth, Jan Tönshoff, Martin Ritzert et al.

Graph Transformers (GTs) such as SAN and GPS are graph processing models that combine Message-Passing GNNs (MPGNNs) with global Self-Attention. They were shown to be universal function approximators, with two reservations: 1. The initial node features must be augmented with certain positional encodings. 2. The approximation is non-uniform: Graphs of different sizes may require a different approximating network. We first clarify that this form of universality is not unique to GTs: Using the same positional encodings, also pure MPGNNs and even 2-layer MLPs are non-uniform universal approximators. We then consider uniform expressivity: The target function is to be approximated by a single network for graphs of all sizes. There, we compare GTs to the more efficient MPGNN + Virtual Node architecture. The essential difference between the two model definitions is in their global computation method -- Self-Attention Vs Virtual Node. We prove that none of the models is a uniform-universal approximator, before proving our main result: Neither model's uniform expressivity subsumes the other's. We demonstrate the theory with experiments on synthetic data. We further augment our study with real-world datasets, observing mixed results which indicate no clear ranking in practice as well.

LGFeb 3, 2024
Future Directions in the Theory of Graph Machine Learning

Christopher Morris, Fabrizio Frasca, Nadav Dym et al. · nvidia

Machine learning on graphs, especially using graph neural networks (GNNs), has seen a surge in interest due to the wide availability of graph data across a broad spectrum of disciplines, from life to social and engineering sciences. Despite their practical success, our theoretical understanding of the properties of GNNs remains highly incomplete. Recent theoretical advancements primarily focus on elucidating the coarse-grained expressive power of GNNs, predominantly employing combinatorial techniques. However, these studies do not perfectly align with practice, particularly in understanding the generalization behavior of GNNs when trained with stochastic first-order optimization techniques. In this position paper, we argue that the graph machine learning community needs to shift its attention to developing a balanced theory of graph machine learning, focusing on a more thorough understanding of the interplay of expressive power, generalization, and optimization.

LOJan 14
Query Languages for Machine-Learning Models

Martin Grohe

In this paper, I discuss two logics for weighted finite structures: first-order logic with summation (FO(SUM)) and its recursive extension IFP(SUM). These logics originate from foundational work by Grädel, Gurevich, and Meer in the 1990s. In recent joint work with Standke, Steegmans, and Van den Bussche, we have investigated these logics as query languages for machine learning models, specifically neural networks, which are naturally represented as weighted graphs. I present illustrative examples of queries to neural networks that can be expressed in these logics and discuss fundamental results on their expressiveness and computational complexity.

LGNov 3, 2024
GraphXForm: Graph transformer for computer-aided molecular design

Jonathan Pirnay, Jan G. Rittig, Alexander B. Wolf et al.

Generative deep learning has become pivotal in molecular design for drug discovery, materials science, and chemical engineering. A widely used paradigm is to pretrain neural networks on string representations of molecules and fine-tune them using reinforcement learning on specific objectives. However, string-based models face challenges in ensuring chemical validity and enforcing structural constraints like the presence of specific substructures. We propose to instead combine graph-based molecular representations, which can naturally ensure chemical validity, with transformer architectures, which are highly expressive and capable of modeling long-range dependencies between atoms. Our approach iteratively modifies a molecular graph by adding atoms and bonds, which ensures chemical validity and facilitates the incorporation of structural constraints. We present GraphXForm, a decoder-only graph transformer architecture, which is pretrained on existing compounds and then fine-tuned using a new training algorithm that combines elements of the deep cross-entropy method and self-improvement learning. We evaluate GraphXForm on various drug design tasks, demonstrating superior objective scores compared to state-of-the-art molecular design approaches. Furthermore, we apply GraphXForm to two solvent design tasks for liquid-liquid extraction, again outperforming alternative methods while flexibly enforcing structural constraints or initiating design from existing molecular structures.

LOMar 11, 2024
Are Targeted Messages More Effective?

Martin Grohe, Eran Rosenbluth

Graph neural networks (GNN) are deep learning architectures for graphs. Essentially, a GNN is a distributed message passing algorithm, which is controlled by parameters learned from data. It operates on the vertices of a graph: in each iteration, vertices receive a message on each incoming edge, aggregate these messages, and then update their state based on their current state and the aggregated messages. The expressivity of GNNs can be characterised in terms of certain fragments of first-order logic with counting and the Weisfeiler-Lehman algorithm. The core GNN architecture comes in two different versions. In the first version, a message only depends on the state of the source vertex, whereas in the second version it depends on the states of the source and target vertices. In practice, both of these versions are used, but the theory of GNNs so far mostly focused on the first one. On the logical side, the two versions correspond to two fragments of first-order logic with counting that we call modal and guarded. The question whether the two versions differ in their expressivity has been mostly overlooked in the GNN literature and has only been asked recently (Grohe, LICS'23). We answer this question here. It turns out that the answer is not as straightforward as one might expect. By proving that the modal and guarded fragment of first-order logic with counting have the same expressivity over labelled undirected graphs, we show that in a non-uniform setting the two GNN versions have the same expressivity. However, we also prove that in a uniform setting the second version is strictly more expressive.

CHEM-PHAug 28, 2025
Molecular Machine Learning in Chemical Process Design

Jan G. Rittig, Manuel Dahmen, Martin Grohe et al.

We present a perspective on molecular machine learning (ML) in the field of chemical process engineering. Recently, molecular ML has demonstrated great potential in (i) providing highly accurate predictions for properties of pure components and their mixtures, and (ii) exploring the chemical space for new molecular structures. We review current state-of-the-art molecular ML models and discuss research directions that promise further advancements. This includes ML methods, such as graph neural networks and transformers, which can be further advanced through the incorporation of physicochemical knowledge in a hybrid or physics-informed fashion. Then, we consider leveraging molecular ML at the chemical process scale, which is highly desirable yet rather unexplored. We discuss how molecular ML can be integrated into process design and optimization formulations, promising to accelerate the identification of novel molecules and processes. To this end, it will be essential to create molecule and process design benchmarks and practically validate proposed candidates, possibly in collaboration with the chemical industry.

LGMay 21, 2025
Learning from Algorithm Feedback: One-Shot SAT Solver Guidance with GNNs

Jan Tönshoff, Martin Grohe

Boolean Satisfiability (SAT) solvers are foundational to computer science, yet their performance typically hinges on hand-crafted heuristics. This work introduces Reinforcement Learning from Algorithm Feedback (RLAF) as a paradigm for learning to guide SAT solver branching heuristics with Graph Neural Networks (GNNs). Central to our approach is a novel and generic mechanism for injecting inferred variable weights and polarities into the branching heuristics of existing SAT solvers. In a single forward pass, a GNN assigns these parameters to all variables. Casting this one-shot guidance as a reinforcement learning problem lets us train the GNN with off-the-shelf policy-gradient methods, such as GRPO, directly using the solver's computational cost as the sole reward signal. Extensive evaluations demonstrate that RLAF-trained policies significantly reduce the mean solve times of different base solvers across diverse SAT problem distributions, achieving more than a 2x speedup in some cases, while generalizing effectively to larger and harder problems after training. Notably, these policies consistently outperform expert-supervised approaches based on learning handcrafted weighting heuristics, offering a promising path towards data-driven heuristic design in combinatorial optimization.

LGMay 1, 2025
Repetition Makes Perfect: Recurrent Graph Neural Networks Match Message Passing Limit

Eran Rosenbluth, Martin Grohe

We precisely characterize the expressivity of computable Recurrent Graph Neural Networks (recurrent GNNs). We prove that recurrent GNNs with finite-precision parameters, sum aggregation, and ReLU activation, can compute any graph algorithm that respects the natural message-passing invariance induced by the Color Refinement (or Weisfeiler-Leman) algorithm. While it is well known that the expressive power of GNNs is limited by this invariance [Morris et al., AAAI 2019; Xu et al., ICLR 2019], we establish that recurrent GNNs can actually match this limit. This is in contrast to non-recurrent GNNs, which have the power of Weisfeiler-Leman only in a very weak, "non-uniform", sense where each graph size requires a different GNN to compute with. Our construction introduces only a polynomial overhead in both time and space. Furthermore, we show that by incorporating random initialization, for connected graphs recurrent GNNs can express all graph algorithms. In particular, any polynomial-time graph algorithm can be emulated on connected graphs in polynomial time by a recurrent GNN with random initialization.

LGMay 6, 2024
End-to-End Reinforcement Learning of Curative Curtailment with Partial Measurement Availability

Hinrikus Wolf, Luis Böttcher, Sarra Bouchkati et al.

In the course of the energy transition, the expansion of generation and consumption will change, and many of these technologies, such as PV systems, electric cars and heat pumps, will influence the power flow, especially in the distribution grids. Scalable methods that can make decisions for each grid connection are needed to enable congestion-free grid operation in the distribution grids. This paper presents a novel end-to-end approach to resolving congestion in distribution grids with deep reinforcement learning. Our architecture learns to curtail power and set appropriate reactive power to determine a non-congested and, thus, feasible grid state. State-of-the-art methods such as the optimal power flow (OPF) demand high computational costs and detailed measurements of every bus in a grid. In contrast, the presented method enables decisions under sparse information with just some buses observable in the grid. Distribution grids are generally not yet fully digitized and observable, so this method can be used for decision-making on the majority of low-voltage grids. On a real low-voltage grid the approach resolves 100\% of violations in the voltage band and 98.8\% of asset overloads. The results show that decisions can also be made on real grids that guarantee sufficient quality for congestion-free grid operation.

LGJan 20, 2024
Selecting Walk Schemes for Database Embedding

Yuval Lev Lubarsky, Jan Tönshoff, Martin Grohe et al.

Machinery for data analysis often requires a numeric representation of the input. Towards that, a common practice is to embed components of structured data into a high-dimensional vector space. We study the embedding of the tuples of a relational database, where existing techniques are often based on optimization tasks over a collection of random walks from the database. The focus of this paper is on the recent FoRWaRD algorithm that is designed for dynamic databases, where walks are sampled by following foreign keys between tuples. Importantly, different walks have different schemas, or "walk schemes", that are derived by listing the relations and attributes along the walk. Also importantly, different walk schemes describe relationships of different natures in the database. We show that by focusing on a few informative walk schemes, we can obtain tuple embedding significantly faster, while retaining the quality. We define the problem of scheme selection for tuple embedding, devise several approaches and strategies for scheme selection, and conduct a thorough empirical study of the performance over a collection of downstream tasks. Our results confirm that with effective strategies for scheme selection, we can obtain high-quality embeddings considerably (e.g., three times) faster, preserve the extensibility to newly inserted tuples, and even achieve an increase in the precision of some tasks.

LGSep 1, 2023
Where Did the Gap Go? Reassessing the Long-Range Graph Benchmark

Jan Tönshoff, Martin Ritzert, Eran Rosenbluth et al.

The recent Long-Range Graph Benchmark (LRGB, Dwivedi et al. 2022) introduced a set of graph learning tasks strongly dependent on long-range interaction between vertices. Empirical evidence suggests that on these tasks Graph Transformers significantly outperform Message Passing GNNs (MPGNNs). In this paper, we carefully reevaluate multiple MPGNN baselines as well as the Graph Transformer GPS (Rampášek et al. 2022) on LRGB. Through a rigorous empirical analysis, we demonstrate that the reported performance gap is overestimated due to suboptimal hyperparameter choices. It is noteworthy that across multiple datasets the performance gap completely vanishes after basic hyperparameter optimization. In addition, we discuss the impact of lacking feature normalization for LRGB's vision datasets and highlight a spurious implementation of LRGB's link prediction metric. The principal aim of our paper is to establish a higher standard of empirical rigor within the graph machine learning community.

LGDec 18, 2021
Weisfeiler and Leman go Machine Learning: The Story so far

Christopher Morris, Yaron Lipman, Haggai Maron et al.

In recent years, algorithms and neural architectures based on the Weisfeiler--Leman algorithm, a well-known heuristic for the graph isomorphism problem, have emerged as a powerful tool for machine learning with graphs and relational data. Here, we give a comprehensive overview of the algorithm's use in a machine-learning setting, focusing on the supervised regime. We discuss the theoretical background, show how to use it for supervised graph and node representation learning, discuss recent extensions, and outline the algorithm's connection to (permutation-)equivariant neural architectures. Moreover, we give an overview of current applications and future directions to stimulate further research.

LGApr 29, 2021
The Logic of Graph Neural Networks

Martin Grohe

Graph neural networks (GNNs) are deep learning architectures for machine learning problems on graphs. It has recently been shown that the expressiveness of GNNs can be characterised precisely by the combinatorial Weisfeiler-Leman algorithms and by finite variable counting logics. The correspondence has even led to new, higher-order GNNs corresponding to the WL algorithm in higher dimensions. The purpose of this paper is to explain these descriptive characterisations of GNNs.

LGFeb 17, 2021
Walking Out of the Weisfeiler Leman Hierarchy: Graph Learning Beyond Message Passing

Jan Tönshoff, Martin Ritzert, Hinrikus Wolf et al.

We propose CRaWl, a novel neural network architecture for graph learning. Like graph neural networks, CRaWl layers update node features on a graph and thus can freely be combined or interleaved with GNN layers. Yet CRaWl operates fundamentally different from message passing graph neural networks. CRaWl layers extract and aggregate information on subgraphs appearing along random walks through a graph using 1D Convolutions. Thereby it detects long range interactions and computes non-local features. As the theoretical basis for our approach, we prove a theorem stating that the expressiveness of CRaWl is incomparable with that of the Weisfeiler Leman algorithm and hence with graph neural networks. That is, there are functions expressible by CRaWl, but not by GNNs and vice versa. This result extends to higher levels of the Weisfeiler Leman hierarchy and thus to higher-order GNNs. Empirically, we show that CRaWl matches state-of-the-art GNN architectures across a multitude of benchmark datasets for classification and regression on graphs.

LGOct 2, 2020
The Surprising Power of Graph Neural Networks with Random Node Initialization

Ralph Abboud, İsmail İlkan Ceylan, Martin Grohe et al.

Graph neural networks (GNNs) are effective models for representation learning on relational data. However, standard GNNs are limited in their expressive power, as they cannot distinguish graphs beyond the capability of the Weisfeiler-Leman graph isomorphism heuristic. In order to break this expressiveness barrier, GNNs have been enhanced with random node initialization (RNI), where the idea is to train and run the models with randomized initial node features. In this work, we analyze the expressive power of GNNs with RNI, and prove that these models are universal, a first such result for GNNs not relying on computationally demanding higher-order properties. This universality result holds even with partially randomized initial node features, and preserves the invariance properties of GNNs in expectation. We then empirically analyze the effect of RNI on GNNs, based on carefully constructed datasets. Our empirical findings support the superior performance of GNNs with RNI over standard GNNs.

LGMay 20, 2020
The Effects of Randomness on the Stability of Node Embeddings

Tobias Schumacher, Hinrikus Wolf, Martin Ritzert et al.

We systematically evaluate the (in-)stability of state-of-the-art node embedding algorithms due to randomness, i.e., the random variation of their outcomes given identical algorithms and graphs. We apply five node embeddings algorithms---HOPE, LINE, node2vec, SDNE, and GraphSAGE---to synthetic and empirical graphs and assess their stability under randomness with respect to (i) the geometry of embedding spaces as well as (ii) their performance in downstream tasks. We find significant instabilities in the geometry of embedding spaces independent of the centrality of a node. In the evaluation of downstream tasks, we find that the accuracy of node classification seems to be unaffected by random seeding while the actual classification of nodes can vary significantly. This suggests that instability effects need to be taken into account when working with node embeddings. Our work is relevant for researchers and engineers interested in the effectiveness, reliability, and reproducibility of node embedding approaches.

LGMar 27, 2020
word2vec, node2vec, graph2vec, X2vec: Towards a Theory of Vector Embeddings of Structured Data

Martin Grohe

Vector representations of graphs and relational structures, whether hand-crafted feature vectors or learned representations, enable us to apply standard data analysis and machine learning techniques to the structures. A wide range of methods for generating such embeddings have been studied in the machine learning and knowledge representation literature. However, vector embeddings have received relatively little attention from a theoretical point of view. Starting with a survey of embedding techniques that have been used in practice, in this paper we propose two theoretical approaches that we see as central for understanding the foundations of vector embeddings. We draw connections between the various approaches and suggest directions for future research.

AISep 18, 2019
Graph Neural Networks for Maximum Constraint Satisfaction

Jan Toenshoff, Martin Ritzert, Hinrikus Wolf et al.

Many combinatorial optimization problems can be phrased in the language of constraint satisfaction problems. We introduce a graph neural network architecture for solving such optimization problems. The architecture is generic; it works for all binary constraint satisfaction problems. Training is unsupervised, and it is sufficient to train on relatively small instances; the resulting networks perform well on much larger instances (at least 10-times larger). We experimentally evaluate our approach for a variety of problems, including Maximum Cut and Maximum Independent Set. Despite being generic, we show that our approach matches or surpasses most greedy and semi-definite programming based algorithms and sometimes even outperforms state-of-the-art heuristics for the specific problems.

LGOct 4, 2018
Weisfeiler and Leman Go Neural: Higher-order Graph Neural Networks

Christopher Morris, Martin Ritzert, Matthias Fey et al.

In recent years, graph neural networks (GNNs) have emerged as a powerful neural architecture to learn vector representations of nodes and graphs in a supervised, end-to-end fashion. Up to now, GNNs have only been evaluated empirically -- showing promising results. The following work investigates GNNs from a theoretical point of view and relates them to the $1$-dimensional Weisfeiler-Leman graph isomorphism heuristic ($1$-WL). We show that GNNs have the same expressiveness as the $1$-WL in terms of distinguishing non-isomorphic (sub-)graphs. Hence, both algorithms also have the same shortcomings. Based on this, we propose a generalization of GNNs, so-called $k$-dimensional GNNs ($k$-GNNs), which can take higher-order graph structures at multiple scales into account. These higher-order structures play an essential role in the characterization of social networks and molecule graphs. Our experimental evaluation confirms our theoretical findings as well as confirms that higher-order information is useful in the task of graph classification and regression.

LGAug 27, 2017
Learning MSO-definable hypotheses on string

Martin Grohe, Christof Löding, Martin Ritzert

We study the classification problems over string data for hypotheses specified by formulas of monadic second-order logic MSO. The goal is to design learning algorithms that run in time polynomial in the size of the training set, independently of or at least sublinear in the size of the whole data set. We prove negative as well as positive results. If the data set is an unprocessed string to which our algorithms have local access, then learning in sublinear time is impossible even for hypotheses definable in a small fragment of first-order logic. If we allow for a linear time pre-processing of the string data to build an index data structure, then learning of MSO-definable hypotheses is possible in time polynomial in the size of the training set, independently of the size of the whole data set.

LGJan 19, 2017
Learning first-order definable concepts over structures of small degree

Martin Grohe, Martin Ritzert

We consider a declarative framework for machine learning where concepts and hypotheses are defined by formulas of a logic over some background structure. We show that within this framework, concepts defined by first-order formulas over a background structure of at most polylogarithmic degree can be learned in polylogarithmic time in the "probably approximately correct" learning sense.

DSJul 22, 2013
Dimension Reduction via Colour Refinement

Martin Grohe, Kristian Kersting, Martin Mladenov et al.

Colour refinement is a basic algorithmic routine for graph isomorphism testing, appearing as a subroutine in almost all practical isomorphism solvers. It partitions the vertices of a graph into "colour classes" in such a way that all vertices in the same colour class have the same number of neighbours in every colour class. Tinhofer (Disc. App. Math., 1991), Ramana, Scheinerman, and Ullman (Disc. Math., 1994) and Godsil (Lin. Alg. and its App., 1997) established a tight correspondence between colour refinement and fractional isomorphisms of graphs, which are solutions to the LP relaxation of a natural ILP formulation of graph isomorphism. We introduce a version of colour refinement for matrices and extend existing quasilinear algorithms for computing the colour classes. Then we generalise the correspondence between colour refinement and fractional automorphisms and develop a theory of fractional automorphisms and isomorphisms of matrices. We apply our results to reduce the dimensions of systems of linear equations and linear programs. Specifically, we show that any given LP L can efficiently be transformed into a (potentially) smaller LP L' whose number of variables and constraints is the number of colour classes of the colour refinement algorithm, applied to a matrix associated with the LP. The transformation is such that we can easily (by a linear mapping) map both feasible and optimal solutions back and forth between the two LPs. We demonstrate empirically that colour refinement can indeed greatly reduce the cost of solving linear programs.