SI LG ML | Feb 22, 2017

Distributed Representation of Subgraphs

Bijaya Adhikari, Yao Zhang, Naren Ramakrishnan, B. Aditya Prakash

arXiv:1702.06921v1 | 6.69 citations

Originality: Highly original

AI Analysis

The paper addresses the need for better subgraph representations in network mining tasks, offering a novel approach for researchers in graph analysis.

The paper tackles the problem of learning feature representations for subgraphs in networks, a task for which node-based embeddings are ill-suited. It proposes sub2vec, an unsupervised, scalable algorithm that achieves significant gains over state-of-the-art methods on tasks such as community detection.

Network embeddings have become very popular for learning effective feature representations of networks. Motivated by the recent successes of embeddings in natural language processing, researchers have tried to find network embeddings in order to exploit machine learning algorithms for mining tasks such as node classification and edge prediction. However, most of this work focuses on finding distributed representations of nodes, which are inherently ill-suited to tasks such as community detection that intuitively depend on subgraphs. Here, we propose sub2vec, an unsupervised, scalable algorithm to learn feature representations of arbitrary subgraphs. We provide means to characterize similarities between subgraphs, give a theoretical analysis of sub2vec, and demonstrate that it preserves the so-called local proximity. We also highlight the usability of sub2vec by leveraging it for network mining tasks such as community detection. We show that sub2vec achieves significant gains over state-of-the-art methods and node-embedding methods. In particular, sub2vec offers an approach to generate a richer vocabulary of subgraph features to support representation and reasoning.
