LG ML Sep 10, 2018

Learning-based Efficient Graph Similarity Computation via Multi-Scale Convolutional Set Matching

Yunsheng Bai, Hao Ding, Yizhou Sun, Wei Wang

arXiv:1809.04440v217.1131 citations Has Code

Originality: Incremental advance

AI Analysis

The paper addresses graph similarity computation, a core operation in graph-based applications such as similarity search and clustering, and offers an incremental improvement over existing neural network methods.

The paper tackles the problem of graph similarity computation by directly matching sets of node embeddings instead of using fixed-dimensional graph embeddings, achieving state-of-the-art performance on four real-world datasets in six out of eight settings.

Graph similarity computation is one of the core operations in many graph-based applications, such as graph similarity search, graph database analysis, and graph clustering. Since computing the exact distance/similarity between two graphs is typically NP-hard, a series of approximate methods have been proposed with a trade-off between accuracy and speed. Recently, several data-driven approaches based on neural networks have been proposed, most of which model the graph-graph similarity as the inner product of their graph-level representations, with different techniques proposed for generating one embedding per graph. However, using one fixed-dimensional embedding per graph may fail to fully capture graphs of varying sizes and link structures, a limitation that is especially problematic for the task of graph similarity computation, where the goal is to find the fine-grained difference between two graphs. In this paper, we address the problem of graph similarity computation from another perspective, by directly matching two sets of node embeddings without the need to use fixed-dimensional vectors to represent whole graphs for their similarity computation. The model, GraphSim, achieves state-of-the-art performance on four real-world graph datasets under six out of eight settings (here we count a specific dataset and metric combination as one setting), compared to existing popular methods for approximate Graph Edit Distance (GED) and Maximum Common Subgraph (MCS) computation.
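The contrast the abstract draws, between comparing one pooled embedding per graph and comparing the two sets of node embeddings directly, can be illustrated with a small sketch. This is not the GraphSim architecture: the 2-D node embeddings are made up, mean pooling stands in for whatever graph-level pooling a baseline might use, and the function names are hypothetical.

```python
# Illustrative sketch only; NOT the GraphSim model.
# Assumptions: toy 2-D node embeddings, mean pooling as the graph-level readout.

def dot(u, v):
    """Inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))

def mean_pool(nodes):
    """Collapse a set of node embeddings into one fixed-size graph embedding."""
    n = len(nodes)
    return [sum(col) / n for col in zip(*nodes)]

def graph_level_similarity(nodes_a, nodes_b):
    """Baseline view: inner product of two pooled graph-level embeddings."""
    return dot(mean_pool(nodes_a), mean_pool(nodes_b))

def pairwise_similarity_matrix(nodes_a, nodes_b):
    """Set-matching view: keep every node-to-node inner product."""
    return [[dot(u, v) for v in nodes_b] for u in nodes_a]

graph_a = [[1.0, 0.0], [0.0, 1.0]]
graph_b = [[0.5, 0.5], [0.5, 0.5]]  # different nodes, same mean as graph_a
graph_c = [[1.0, 0.0], [0.0, 1.0]]  # identical node embeddings to graph_a

# Pooling gives the same score for both pairs ...
print(graph_level_similarity(graph_a, graph_b))
print(graph_level_similarity(graph_a, graph_c))

# ... while the node-to-node matrices still tell the pairs apart.
print(pairwise_similarity_matrix(graph_a, graph_b))
print(pairwise_similarity_matrix(graph_a, graph_c))
```

In this toy case the pooled score cannot distinguish `graph_b` from `graph_c`, whereas the pairwise matrix can, which is the kind of fine-grained difference the abstract says a single fixed-dimensional embedding may miss.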
