Structure-Aware Hard Negative Mining for Heterogeneous Graph Contrastive Learning
This work addresses the need for effective self-supervised learning in heterogeneous graphs, offering an incremental improvement over existing contrastive learning methods by focusing on negative sample hardness.
The paper tackles the label scarcity problem in heterogeneous graph analysis by proposing a structure-aware hard negative mining scheme for contrastive learning, which consistently outperforms existing state-of-the-art methods and even surpasses several supervised counterparts on three real-world datasets.
Recently, heterogeneous Graph Neural Networks (GNNs) have become the de facto model for analyzing heterogeneous graphs (HGs), but most of them rely on a relatively large amount of labeled data. In this work, we investigate Contrastive Learning (CL), a key component of self-supervised approaches, on HGs to alleviate the label scarcity problem. We first generate multiple semantic views according to metapaths and network schemas. Then, by pulling the embeddings of the same node under different semantic views close to each other (positives) and pushing the embeddings of other nodes apart (negatives), one can obtain informative representations without human annotations. However, this CL approach ignores the relative hardness of negative samples, which may lead to suboptimal performance. Considering the complex graph structure and the smoothing nature of GNNs, we propose a structure-aware hard negative mining scheme that measures the hardness of negatives in HGs by their structural characteristics. By synthesizing additional negative nodes, we give larger weights to harder negatives with limited computational overhead, which further boosts performance. Empirical studies on three real-world datasets show the effectiveness of the proposed method: it consistently outperforms existing state-of-the-art methods and, notably, even surpasses several supervised counterparts.
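The idea of weighting harder negatives more heavily in a cross-view contrastive objective can be illustrated with a minimal sketch. This is not the paper's implementation: the abstract does not specify how structural hardness is computed or how negatives are synthesized, so the sketch takes a precomputed `hardness` matrix as a hypothetical input and only reweights existing negatives inside an InfoNCE-style loss. The function name `weighted_info_nce`, the temperature `tau`, and the weight normalization are all assumptions made for illustration.

```python
import numpy as np

def weighted_info_nce(z1, z2, hardness, tau=0.5):
    """Cross-view InfoNCE-style loss with per-negative weights (illustrative).

    z1, z2   : (n, d) embeddings of the same n nodes under two semantic views.
    hardness : (n, n) non-negative scores; hardness[i, j] says how hard node j
               is as a negative for node i. The diagonal is ignored.
               Hypothetical input: the structural measure is not given here.
    tau      : temperature (assumed value).
    """
    # L2-normalize so that dot products are cosine similarities.
    z1 = z1 / np.linalg.norm(z1, axis=1, keepdims=True)
    z2 = z2 / np.linalg.norm(z2, axis=1, keepdims=True)

    sim = np.exp(z1 @ z2.T / tau)   # (n, n) exponentiated similarities
    pos = np.diag(sim)              # same node, different view: positives

    # Off-diagonal weights, rescaled so each row sums to n - 1.
    # With uniform hardness this reduces to the unweighted loss.
    # Assumes every row has at least one positive off-diagonal score.
    n = z1.shape[0]
    w = np.asarray(hardness, dtype=float).copy()
    np.fill_diagonal(w, 0.0)
    w = w * (n - 1) / w.sum(axis=1, keepdims=True)

    neg = (w * sim).sum(axis=1)     # weighted sum over negatives
    return float(np.mean(-np.log(pos / (pos + neg))))
```

With uniform `hardness` the function reduces to a plain cross-view InfoNCE loss. When the weights grow with the similarity between anchor and negative, the negative term gets larger, so the loss penalizes hard negatives more strongly.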