LG · SI · ML · Jul 22, 2020

Self-Supervised Learning of Contextual Embeddings for Link Prediction in Heterogeneous Networks

arXiv:2007.11192v3 · 69 citations
AI Analysis

This paper addresses the need for task-specific contextual embeddings in link prediction for heterogeneous networks, offering an incremental improvement over static and contextual methods.

The paper tackled the problem of link prediction in heterogeneous networks by developing SLiCE, a framework that learns contextual node representations through self-supervised pre-training and fine-tuning, which significantly outperformed existing methods on benchmark datasets.

Representation learning methods for heterogeneous networks produce a low-dimensional vector embedding for each node that is typically fixed for all tasks involving the node. Many of the existing methods focus on obtaining a static vector representation for a node in a way that is agnostic to the downstream application where it is being used. In practice, however, downstream tasks such as link prediction require specific contextual information that can be extracted from the subgraphs related to the nodes provided as input to the task. To tackle this challenge, we develop SLiCE, a framework bridging static representation learning methods using global information from the entire graph with localized attention-driven mechanisms to learn contextual node representations. We first pre-train our model in a self-supervised manner by introducing higher-order semantic associations and masking nodes, and then fine-tune our model for a specific link prediction task. Instead of training node representations by aggregating information from all semantic neighbors connected via metapaths, we automatically learn the composition of different metapaths that characterize the context for a specific task without the need for any pre-defined metapaths. SLiCE significantly outperforms both static and contextual embedding learning methods on several publicly available benchmark network datasets. We also interpret the semantic association matrix and demonstrate its utility and relevance in making successful link predictions between heterogeneous nodes in the network.
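The pre-training idea in the abstract (mask a node in a context subgraph, then predict it from attention-contextualized representations) can be illustrated with a small NumPy sketch. This is not the SLiCE implementation: the toy graph, the use of a random walk to sample the context, the weight-free single-head attention, and every name below (`graph`, `MASK_ID`, `random_walk_context`, `mask_one_node`, `self_attention`) are assumptions made for illustration only.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy graph as an adjacency list (node ids are hypothetical).
graph = {0: [1, 2], 1: [0, 3], 2: [0, 3], 3: [1, 2, 4], 4: [3]}
num_nodes, dim = len(graph), 8
MASK_ID = num_nodes  # extra embedding row reserved for the mask token

# Stand-in for static, globally trained node embeddings (random here).
static_emb = rng.normal(size=(num_nodes + 1, dim))

def random_walk_context(start, length):
    """Sample a context subgraph as a node sequence via a random walk."""
    walk = [start]
    for _ in range(length - 1):
        walk.append(int(rng.choice(graph[walk[-1]])))
    return walk

def mask_one_node(context):
    """Replace one randomly chosen position with MASK_ID; return the target."""
    pos = int(rng.integers(len(context)))
    masked = list(context)
    target = masked[pos]
    masked[pos] = MASK_ID
    return masked, pos, target

def self_attention(x):
    """Single-head scaled dot-product self-attention without learned weights."""
    scores = x @ x.T / np.sqrt(x.shape[1])
    scores -= scores.max(axis=1, keepdims=True)  # numerical stability
    weights = np.exp(scores)
    weights /= weights.sum(axis=1, keepdims=True)
    return weights @ x, weights

context = random_walk_context(start=0, length=5)
masked, pos, target = mask_one_node(context)
contextual, attn = self_attention(static_emb[masked])

# Score every real node as a candidate for the masked position.
logits = static_emb[:num_nodes] @ contextual[pos]
probs = np.exp(logits - logits.max())
probs /= probs.sum()
loss = -np.log(probs[target])  # cross-entropy a training loop would minimize
```

In a trained model the attention layer would carry learned projection weights and the loss would be minimized over many sampled contexts. Here the code only shows how a static embedding becomes context-dependent and how the masked-node objective is formed.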

Code Implementations: 1 repo
