Tree Mover's Distance: Bridging Graph Metrics and Stability of Graph Neural Networks
This work addresses the problem of metric selection for graph data to improve generalization and robustness in graph neural networks, representing an incremental advancement in graph learning.
The authors tackled the challenge of defining an appropriate metric for non-Euclidean graph data by proposing the Tree Mover's Distance (TMD), a pseudometric that relates to graph neural network generalization, showing that a TMD-SVM performs competitively with standard GNNs and correlates well with performance drop under distribution shifts.
Understanding generalization and robustness of machine learning models fundamentally relies on assuming an appropriate metric on the data space. Identifying such a metric is particularly challenging for non-Euclidean data such as graphs. Here, we propose a pseudometric for attributed graphs, the Tree Mover's Distance (TMD), and study its relation to generalization. Via a hierarchical optimal transport problem, TMD reflects the local distribution of node attributes as well as the distribution of local computation trees, which are known to be decisive for the learning behavior of graph neural networks (GNNs). First, we show that TMD captures properties relevant to graph classification: a simple TMD-SVM performs competitively with standard GNNs. Second, we relate TMD to generalization of GNNs under distribution shifts, and show that it correlates well with performance drop under such shifts.