AI, LG, SI · Jul 10, 2023

Source-Aware Embedding Training on Heterogeneous Information Networks

arXiv:2307.04336v1 · 6 citations · h-index: 16
Originality: Incremental advance
AI Analysis

This paper addresses a specific bottleneck in graph embedding for heterogeneous information networks, with potential applications in recommendation systems and social networks. The advance is incremental: it builds on existing HIN representation learning methods.

The paper tackles distribution discrepancy among subgraphs that come from multiple sources within a heterogeneous information network, a discrepancy that hinders embedding effectiveness. It proposes SUMSHINE, a scalable unsupervised framework that aligns embedding distributions across sources and, according to the authors' experiments, outperforms state-of-the-art methods in downstream tasks.

Heterogeneous information networks (HINs) have been extensively applied to real-world tasks, such as recommendation systems, social networks, and citation networks. While existing HIN representation learning methods can effectively learn the semantic and structural features in the network, little awareness was given to the distribution discrepancy of subgraphs within a single HIN. However, we find that ignoring such distribution discrepancy among subgraphs from multiple sources would hinder the effectiveness of graph embedding learning algorithms. This motivates us to propose SUMSHINE (Scalable Unsupervised Multi-Source Heterogeneous Information Network Embedding) -- a scalable unsupervised framework to align the embedding distributions among multiple sources of an HIN. Experimental results on real-world datasets in a variety of downstream tasks validate the performance of our method over the state-of-the-art heterogeneous information network embedding algorithms.
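
To make "distribution discrepancy" between sources concrete, here is a minimal sketch and not the SUMSHINE method itself. It assumes two hypothetical sets of node embeddings, `source_a` and `source_b`, drawn from shifted Gaussians. It measures their mismatch with a squared maximum mean discrepancy (MMD) under an RBF kernel, then applies per-source mean-centering as a deliberately simple stand-in for an alignment step. The metric, the kernel bandwidth, and the centering step are all assumptions chosen for illustration.

```python
import numpy as np


def rbf_kernel(x, y, gamma):
    """Gaussian kernel matrix between the rows of x and the rows of y."""
    sq_dist = (x ** 2).sum(axis=1)[:, None] + (y ** 2).sum(axis=1)[None, :] - 2.0 * x @ y.T
    # Clip tiny negative values caused by floating-point cancellation.
    return np.exp(-gamma * np.maximum(sq_dist, 0.0))


def mmd2(x, y, gamma):
    """Biased estimate of the squared MMD between samples x and y."""
    return (
        rbf_kernel(x, x, gamma).mean()
        + rbf_kernel(y, y, gamma).mean()
        - 2.0 * rbf_kernel(x, y, gamma).mean()
    )


rng = np.random.default_rng(0)
dim = 8
gamma = 1.0 / (2 * dim)  # assumed bandwidth, not taken from the paper

# Hypothetical embeddings of nodes from two sources of the same network.
source_a = rng.normal(loc=0.0, scale=1.0, size=(200, dim))
source_b = rng.normal(loc=1.5, scale=1.0, size=(200, dim))  # shifted distribution

before = mmd2(source_a, source_b, gamma)

# Simplest possible "alignment": remove each source's mean offset.
aligned_a = source_a - source_a.mean(axis=0)
aligned_b = source_b - source_b.mean(axis=0)
after = mmd2(aligned_a, aligned_b, gamma)

print(f"MMD^2 before alignment: {before:.4f}")
print(f"MMD^2 after alignment:  {after:.4f}")
```

In this toy setting the discrepancy drops after centering because the two sources differ only by a mean shift. Real multi-source subgraphs can also differ in scale and shape, which is why a learned alignment objective would be needed in practice.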

