On a linear fused Gromov-Wasserstein distance for graph structured data
This work addresses the challenge of efficiently comparing graph-structured data for applications in machine learning, though it is incremental as it builds on existing optimal transport methods.
The paper tackles the problem of measuring similarity between graphs by proposing a novel distance called linearFGW, which embeds graphs into a vector space using optimal transport to account for node features and topology, resulting in faster computation for large-scale datasets and effectiveness in classification and clustering tasks.
We present a framework for embedding graph structured data into a vector space, taking into account node features and topology of a graph into the optimal transport (OT) problem. Then we propose a novel distance between two graphs, named linearFGW, defined as the Euclidean distance between their embeddings. The advantages of the proposed distance are twofold: 1) it can take into account node feature and structure of graphs for measuring the similarity between graphs in a kernel-based framework, 2) it can be much faster for computing kernel matrix than pairwise OT-based distances, particularly fused Gromov-Wasserstein, making it possible to deal with large-scale data sets. After discussing theoretical properties of linearFGW, we demonstrate experimental results on classification and clustering tasks, showing the effectiveness of the proposed linearFGW.