ME IR LG MLApr 17, 2014

A New Space for Comparing Graphs

arXiv:1404.4644v134 citations

Originality Incremental advance

AI Analysis

This work addresses the need for efficient and effective graph comparison methods, which is crucial for machine learning tasks like classification and clustering on graph datasets, though it is incremental as it builds on existing mathematical frameworks.

The authors tackled the problem of comparing different graph structures by proposing a new symmetric positive semidefinite matrix representation based on covariance between normalized vectors from adjacency matrix powers, which encodes spectral and substructural information. They showed that using a Bhattacharya similarity measure on this representation outperforms state-of-the-art methods in social network classification tasks, with computational efficiency linear in the number of edges.

Finding a new mathematical representations for graph, which allows direct comparison between different graph structures, is an open-ended research direction. Having such a representation is the first prerequisite for a variety of machine learning algorithms like classification, clustering, etc., over graph datasets. In this paper, we propose a symmetric positive semidefinite matrix with the $(i,j)$-{th} entry equal to the covariance between normalized vectors $A^ie$ and $A^je$ ($e$ being vector of all ones) as a representation for graph with adjacency matrix $A$. We show that the proposed matrix representation encodes the spectrum of the underlying adjacency matrix and it also contains information about the counts of small sub-structures present in the graph such as triangles and small paths. In addition, we show that this matrix is a \emph{"graph invariant"}. All these properties make the proposed matrix a suitable object for representing graphs. The representation, being a covariance matrix in a fixed dimensional metric space, gives a mathematical embedding for graphs. This naturally leads to a measure of similarity on graph objects. We define similarity between two given graphs as a Bhattacharya similarity measure between their corresponding covariance matrix representations. As shown in our experimental study on the task of social network classification, such a similarity measure outperforms other widely used state-of-the-art methodologies. Our proposed method is also computationally efficient. The computation of both the matrix representation and the similarity value can be performed in operations linear in the number of edges. This makes our method scalable in practice. We believe our theoretical and empirical results provide evidence for studying truncated power iterations, of the adjacency matrix, to characterize social networks.

View on arXiv PDF

Similar