SI, LG, ML · Jun 7, 2017

Inductive Representation Learning on Large Graphs

arXiv:1706.02216v4 · 19791 citations
Originality: Highly original
AI Analysis

This work addresses a known limitation of graph representation learning methods: they fail to generalize to nodes that were not present during training. Handling new nodes is crucial for applications such as recommendation systems and biological network analysis, and the paper proposes a new method for this bottleneck.

The paper tackles the problem of generating node embeddings for unseen nodes in large graphs, which previous transductive methods could not do. It introduces GraphSAGE, an inductive framework that uses node features and local neighborhood aggregation. GraphSAGE outperforms strong baselines on three inductive node-classification benchmarks: citation data, Reddit post data, and a multi-graph protein-protein interaction dataset, the last of which tests generalization to completely unseen graphs.

Low-dimensional embeddings of nodes in large graphs have proved extremely useful in a variety of prediction tasks, from content recommendation to identifying protein functions. However, most existing approaches require that all nodes in the graph are present during training of the embeddings; these previous approaches are inherently transductive and do not naturally generalize to unseen nodes. Here we present GraphSAGE, a general, inductive framework that leverages node feature information (e.g., text attributes) to efficiently generate node embeddings for previously unseen data. Instead of training individual embeddings for each node, we learn a function that generates embeddings by sampling and aggregating features from a node's local neighborhood. Our algorithm outperforms strong baselines on three inductive node-classification benchmarks: we classify the category of unseen nodes in evolving information graphs based on citation and Reddit post data, and we show that our algorithm generalizes to completely unseen graphs using a multi-graph dataset of protein-protein interactions.
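The sample-and-aggregate idea in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes a mean aggregator, a single layer, a ReLU nonlinearity, and L2 normalization of the output, and the names `sample_neighbors`, `mean_vector`, and `embed` are hypothetical helpers chosen for illustration. The weights here are random rather than trained; the point is only that the embedding of a node is computed from its features and its sampled neighbors' features, so a node that was absent during training can still be embedded.

```python
import math
import random


def sample_neighbors(adj, node, k, rng):
    """Draw k neighbors of `node` uniformly (with replacement if it has fewer than k)."""
    neighbors = adj[node]
    if not neighbors:
        return []
    if len(neighbors) >= k:
        return rng.sample(neighbors, k)
    return [rng.choice(neighbors) for _ in range(k)]


def mean_vector(vectors, dim):
    """Element-wise mean of a list of vectors; a zero vector if the list is empty."""
    if not vectors:
        return [0.0] * dim
    return [sum(v[i] for v in vectors) / len(vectors) for i in range(dim)]


def embed(adj, features, weights, node, k, rng):
    """One sample-and-aggregate step for `node` (assumed mean aggregator)."""
    dim = len(features[node])
    sampled = sample_neighbors(adj, node, k, rng)
    neigh_mean = mean_vector([features[u] for u in sampled], dim)
    # Concatenate the node's own features with the aggregated neighborhood.
    combined = features[node] + neigh_mean
    # Shared linear map followed by ReLU (assumed nonlinearity).
    hidden = [max(0.0, sum(w * x for w, x in zip(row, combined))) for row in weights]
    # L2-normalize the result; leave an all-zero vector unchanged.
    norm = math.sqrt(sum(h * h for h in hidden))
    return [h / norm for h in hidden] if norm > 0 else hidden


# Toy graph: "new" stands in for a node that was not seen during training.
rng = random.Random(0)
adj = {"a": ["b", "c"], "b": ["a"], "c": ["a"], "new": ["a", "b"]}
features = {"a": [1.0, 0.0], "b": [0.0, 1.0], "c": [1.0, 1.0], "new": [0.5, 0.5]}
# 3 output dimensions, 4 inputs (2 own features + 2 aggregated features).
weights = [[rng.uniform(-1, 1) for _ in range(4)] for _ in range(3)]
z_new = embed(adj, features, weights, "new", k=2, rng=rng)
print(z_new)
```

Because `weights` is shared across all nodes rather than stored per node, the same function applies to any node that has features and neighbors. That is what distinguishes this inductive setup from training an individual embedding for each node.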

Code Implementations (20 repos)

Data from Papers with Code (CC-BY-SA-4.0)

