GED-Consistent Disentanglement of Aligned and Unaligned Substructures for Graph Similarity Learning
This work improves graph similarity learning for applications like bioinformatics and social network analysis, but it is incremental as it refines existing GNN-based approaches.
The paper tackles the problem of Graph Similarity Computation (GSC) by addressing limitations in existing Graph Neural Network (GNN) methods that mismatch Graph Edit Distance (GED) principles, proposing GCGSim for graph-level matching and substructure-level edit costs, and achieves state-of-the-art performance on four benchmark datasets.
Graph Similarity Computation (GSC) is a fundamental graph related task where Graph Edit Distance (GED) serves as a prevalent metric. GED is determined by an optimal alignment between a pair of graphs that partitions each into aligned (zero-cost) and unaligned (cost-incurring) substructures. Due to NP-hard nature of exact GED computation, GED approximations based on Graph Neural Network(GNN) have emerged. Existing GNN-based GED approaches typically learn node embeddings for each graph and then aggregate pairwise node similarities to estimate the final similarity. Despite their effectiveness, we identify a mismatch between this prevalent node-centric matching paradigm and the core principles of GED. This discrepancy leads to two critical limitations: (1) a failure to capture the global structural correspondence for optimal alignment, and (2) a misattribution of edit costs driven by spurious node level signals. To address these limitations, we propose GCGSim, a GED-consistent graph similarity learning framework centering on graph-level matching and substructure-level edit costs. Specifically, we make three core technical contributions. Extensive experiments on four benchmark datasets show that GCGSim achieves state-of-the-art performance. Our comprehensive analyses further validate that the framework effectively learns disentangled and semantically meaningful substructure representations.