LGDec 24, 2021

GREED: A Neural Framework for Learning Graph Distance Functions

arXiv:2112.13143v366 citationsHas Code
Originality Highly original
AI Analysis

This work addresses a computational bottleneck in graph analysis for researchers and practitioners needing efficient and accurate distance measures, with incremental improvements in speed and property preservation.

The paper tackles the problem of efficiently approximating graph and subgraph edit distances (GED and SED), which are NP-hard to compute exactly, by proposing a neural framework called GREED that learns these distances in a property-preserving manner. The result is that GREED is more accurate than state-of-the-art methods, up to 3 orders of magnitude faster, and due to preserving the triangle inequality, it enables indexable embeddings, making it up to 50 times faster for retrieval tasks even on CPU-only environments.

Among various distance functions for graphs, graph and subgraph edit distances (GED and SED respectively) are two of the most popular and expressive measures. Unfortunately, exact computations for both are NP-hard. To overcome this computational bottleneck, neural approaches to learn and predict edit distance in polynomial time have received much interest. While considerable progress has been made, there exist limitations that need to be addressed. First, the efficacy of an approximate distance function lies not only in its approximation accuracy, but also in the preservation of its properties. To elaborate, although GED is a metric, its neural approximations do not provide such a guarantee. This prohibits their usage in higher order tasks that rely on metric distance functions, such as clustering or indexing. Second, several existing frameworks for GED do not extend to SED due to SED being asymmetric. In this work, we design a novel siamese graph neural network called GREED, which through a carefully crafted inductive bias, learns GED and SED in a property-preserving manner. Through extensive experiments across 10 real graph datasets containing up to 7 million edges, we establish that GREED is not only more accurate than the state of the art, but also up to 3 orders of magnitude faster. Even more significantly, due to preserving the triangle inequality, the generated embeddings are indexable and consequently, even in a CPU-only environment, GREED is up to 50 times faster than GPU-powered baselines for graph / subgraph retrieval.

Code Implementations2 repos
Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.

Your Notes