ML AI LG · Nov 6, 2024

Graph neural networks and non-commuting operators

Mauricio Velasco, Kaiying O'Hare, Bernardo Rychtenberg, Soledad Villar

arXiv:2411.04265v19.26 citations · h-index: 2 · Has Code · NIPS

Originality: Incremental advance

AI Analysis

This work generalizes GNNs to multi-graph learning and strengthens the known theoretical guarantees on transferability. It is rated incremental because it builds on existing GNN frameworks.

The paper extends graph neural networks (GNNs) to handle multiple graphs with shared vertices, develops a theory of stability and transferability for non-commuting operators, and proves a universal transferability theorem for graph-tuple neural networks (GtNNs).

Graph neural networks (GNNs) provide state-of-the-art results in a wide variety of tasks which typically involve predicting features at the vertices of a graph. They are built from layers of graph convolutions which serve as a powerful inductive bias for describing the flow of information among the vertices. Often, more than one data modality is available. This work considers a setting in which several graphs have the same vertex set and a common vertex-level learning task. This generalizes standard GNN models to GNNs with several graph operators that do not commute. We may call this model graph-tuple neural networks (GtNN). In this work, we develop the mathematical theory to address the stability and transferability of GtNNs using properties of non-commuting non-expansive operators. We develop a limit theory of graphon-tuple neural networks and use it to prove a universal transferability theorem that guarantees that all graph-tuple neural networks are transferable on convergent graph-tuple sequences. In particular, there is no non-transferable energy under the convergence we consider here. Our theoretical results extend well-known transferability theorems for GNNs to the case of several simultaneous graphs (GtNNs) and provide a strict improvement on what is currently known even in the GNN case. We illustrate our theoretical results with simple experiments on synthetic and real-world data. To this end, we derive a training procedure that provably enforces the stability of the resulting model.
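To make the phrase "several graph operators that do not commute" concrete, here is a minimal NumPy sketch of a filter acting on a pair of graphs over the same vertex set. It assumes the filter is a polynomial in the operators whose terms are ordered words, so that S1 S2 and S2 S1 carry separate coefficients. The function names, the spectral-norm normalization, and the toy graphs are illustrative assumptions, not the authors' implementation.

```python
import numpy as np

def normalize(adj):
    """Scale a matrix so its spectral norm is at most 1 (non-expansive).
    Assumption: this is just one simple way to obtain a non-expansive operator."""
    norm = np.linalg.norm(adj, 2)
    return adj / norm if norm > 0 else adj

def tuple_filter(ops, coeffs, x):
    """Apply sum over words w of coeffs[w] * ops[w[0]] @ ... @ ops[w[-1]] @ x.

    ops    : list of (n, n) arrays sharing the same vertex set
    coeffs : dict mapping a word (tuple of operator indices) to a scalar;
             the empty word () stands for the identity
    x      : (n,) or (n, d) array of vertex features
    """
    out = np.zeros_like(x, dtype=float)
    for word, c in coeffs.items():
        y = np.asarray(x, dtype=float)
        for i in reversed(word):  # the rightmost operator acts first
            y = ops[i] @ y
        out = out + c * y
    return out

# Two toy graphs on vertices {0, 1, 2}: one has edge 0-1, the other edge 1-2.
A1 = np.array([[0, 1, 0], [1, 0, 0], [0, 0, 0]], dtype=float)
A2 = np.array([[0, 0, 0], [0, 0, 1], [0, 1, 0]], dtype=float)
S1, S2 = normalize(A1), normalize(A2)

x = np.array([0.0, 0.0, 1.0])               # signal concentrated on vertex 2
y12 = tuple_filter([S1, S2], {(0, 1): 1.0}, x)  # S1 S2 x
y21 = tuple_filter([S1, S2], {(1, 0): 1.0}, x)  # S2 S1 x
```

In this toy case the two operators do not commute, so the words (0, 1) and (1, 0) produce different outputs: applying S2 first moves the signal from vertex 2 to vertex 1, after which S1 moves it to vertex 0, whereas applying S1 first annihilates it. A filter for a single graph has no such ordering to track, which is the distinction the abstract draws between GNNs and GtNNs.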
