LG AI DS MLMay 16, 2019

Scalable Graph Embeddings via Sparse Transpose Proximities

arXiv:1905.07245v19.962 citations

Originality Highly original

AI Analysis

This addresses scalability and accuracy limitations in graph embedding for applications requiring node representations in large networks.

The paper tackled three fundamental problems in graph embedding: failure to preserve out-degree distributions on directed graphs, conflicting optimization goals from random walk proximities on undirected graphs, and the inability to achieve both scalability and non-linearity simultaneously. They proposed STRAP, a factorization-based algorithm using sparse transpose proximities, which outperformed state-of-the-art methods in effectiveness and scalability.

Graph embedding learns low-dimensional representations for nodes in a graph and effectively preserves the graph structure. Recently, a significant amount of progress has been made toward this emerging research area. However, there are several fundamental problems that remain open. First, existing methods fail to preserve the out-degree distributions on directed graphs. Second, many existing methods employ random walk based proximities and thus suffer from conflicting optimization goals on undirected graphs. Finally, existing factorization methods are unable to achieve scalability and non-linearity simultaneously. This paper presents an in-depth study on graph embedding techniques on both directed and undirected graphs. We analyze the fundamental reasons that lead to the distortion of out-degree distributions and to the conflicting optimization goals. We propose {\em transpose proximity}, a unified approach that solves both problems. Based on the concept of transpose proximity, we design \strap, a factorization based graph embedding algorithm that achieves scalability and non-linearity simultaneously. \strap makes use of the {\em backward push} algorithm to efficiently compute the sparse {\em Personalized PageRank (PPR)} as its transpose proximities. By imposing the sparsity constraint, we are able to apply non-linear operations to the proximity matrix and perform efficient matrix factorization to derive the embedding vectors. Finally, we present an extensive experimental study that evaluates the effectiveness of various graph embedding algorithms, and we show that \strap outperforms the state-of-the-art methods in terms of effectiveness and scalability.

View on arXiv PDF

Similar