SILG · Nov 26, 2019

BHIN2vec: Balancing the Type of Relation in Heterogeneous Information Network

arXiv:1912.08925v1 · 20 citations
Originality: Incremental advance
AI Analysis

This work addresses a specific bottleneck in heterogeneous information networks for researchers and practitioners in network analysis, offering an incremental improvement over existing embedding techniques.

The paper tackles the imbalance issue in heterogeneous network embedding by proposing BHIN2vec, a method that balances training across relation types using a novel random-walk strategy based on relative training ratios, achieving superior performance in node classification and recommendation tasks compared to state-of-the-art methods.

The goal of network embedding is to transform the nodes of a network into low-dimensional embedding vectors. Recently, heterogeneous networks have been shown to be effective in representing diverse information in data. However, heterogeneous network embedding suffers from an imbalance issue: the sizes of the relation types (i.e., the number of edges of each type in the network) are imbalanced. In this paper, we devise a new heterogeneous network embedding method, called BHIN2vec, which considers the balance among all relation types in a network. We view heterogeneous network embedding as simultaneously solving multiple tasks, in which each task corresponds to one relation type in the network. After splitting the skip-gram loss into multiple losses corresponding to the different tasks, we propose a novel random-walk strategy that focuses on the tasks with high loss values by considering the relative training ratio. Unlike previous random-walk strategies, our strategy generates training samples according to the relative training ratio among the tasks, which results in balanced training of the node embedding. Our extensive experiments on node classification and recommendation demonstrate the superiority of BHIN2vec over state-of-the-art methods. In addition, based on the relative training ratio, we analyze how much each relation type is represented in the embedding space.
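To make the idea of a loss-driven random walk concrete, here is a minimal sketch in Python. It is not the authors' implementation. The abstract does not give the formula for the relative training ratio, so the sketch assumes it is each relation type's share of the total per-relation loss. The names `relative_ratios` and `biased_walk`, and the toy author/paper/venue graph, are hypothetical and chosen for illustration only.

```python
import random
from collections import defaultdict


def relative_ratios(losses):
    """Turn per-relation losses into shares that sum to 1.

    ASSUMPTION: the "relative training ratio" is modelled here as a simple
    loss share; the paper may define it differently.
    """
    total = sum(losses.values())
    if total == 0:
        # No signal yet: treat every relation type equally.
        return {rel: 1.0 / len(losses) for rel in losses}
    return {rel: loss / total for rel, loss in losses.items()}


def biased_walk(adj, node_type, start, length, ratios, rng):
    """Random walk that picks the next node *type* according to `ratios`.

    adj:       dict mapping node -> list of neighbour nodes
    node_type: dict mapping node -> type label
    ratios:    dict mapping (source type, target type) -> sampling weight
    """
    walk = [start]
    for _ in range(length - 1):
        cur = walk[-1]
        # Group the current node's neighbours by their type.
        by_type = defaultdict(list)
        for nb in adj[cur]:
            by_type[node_type[nb]].append(nb)
        if not by_type:
            break  # dead end
        types = sorted(by_type)
        weights = [ratios.get((node_type[cur], t), 0.0) for t in types]
        if sum(weights) == 0:
            weights = [1.0] * len(types)  # fall back to a uniform choice
        # First choose the relation type, then a neighbour of that type.
        next_type = rng.choices(types, weights=weights, k=1)[0]
        walk.append(rng.choice(by_type[next_type]))
    return walk


# Toy graph: author a1 -- paper p1 -- venue v1 (hypothetical data).
adj = {"a1": ["p1"], "p1": ["a1", "v1"], "v1": ["p1"]}
node_type = {"a1": "A", "p1": "P", "v1": "V"}

# Pretend the paper->venue task currently has all of the paper-side loss.
ratios = {("A", "P"): 0.5, ("V", "P"): 0.5, ("P", "A"): 0.0, ("P", "V"): 1.0}
walk = biased_walk(adj, node_type, "a1", 3, ratios, random.Random(0))
```

In this toy setting the walk leaving `p1` always moves to the venue, because the paper-to-author relation has weight zero. In a training loop, the ratios would presumably be recomputed from the current per-relation losses between batches of walks, so that under-trained relation types receive more samples.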

