Decoupled Self-supervised Learning for Non-Homophilous Graphs
The work addresses a limitation of existing methods, which assume homophily even though that assumption often fails in real-world graphs. The contribution is incremental but important for domain-specific applications.
The paper tackles self-supervised node representation learning on non-homophilous graphs, where linked nodes may not share similar features. It develops a decoupled self-supervised learning (DSSL) framework that outperforms competitive baselines in extensive experiments.
This paper studies self-supervised learning of node representations on graphs. Most existing self-supervised learning methods assume the graph is homophilous, meaning that linked nodes often belong to the same class or have similar features. However, this homophily assumption does not always hold in real-world graphs. We address this problem by developing a decoupled self-supervised learning (DSSL) framework for graph neural networks. DSSL imitates a generative process of nodes and links based on latent variable modeling of the semantic structure, which decouples the different underlying semantics of different neighborhoods during self-supervised learning. Our DSSL framework is agnostic to the encoder and does not need prefabricated augmentations, and is thus flexible across different graphs. To optimize the framework effectively, we derive the evidence lower bound of the self-supervised objective and develop a scalable training algorithm based on variational inference. We provide a theoretical analysis showing that DSSL enjoys better downstream performance. Extensive experiments on various types of graph benchmarks demonstrate that the proposed framework achieves better performance than competitive baselines.
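
To make the notion of homophily above concrete, the following is a minimal sketch that measures how often linked nodes share a class label. It assumes the edge homophily ratio (the fraction of edges whose endpoints have the same label), a measure commonly used for this purpose but not one the abstract itself specifies. The function name `edge_homophily` and the toy graphs are hypothetical and serve only as illustration.

```python
from typing import Sequence, Tuple


def edge_homophily(edges: Sequence[Tuple[int, int]], labels: Sequence[int]) -> float:
    """Return the fraction of edges whose two endpoints share a label.

    This is an assumed, illustrative measure; it is not part of DSSL.
    """
    if not edges:
        raise ValueError("graph has no edges")
    same = sum(1 for u, v in edges if labels[u] == labels[v])
    return same / len(edges)


# Toy data (hypothetical): 4 nodes split into 2 classes.
labels = [0, 0, 1, 1]

# Mostly same-class links: closer to the homophilous setting.
homophilous_edges = [(0, 1), (2, 3), (0, 2)]

# Only cross-class links: a strongly non-homophilous setting.
heterophilous_edges = [(0, 2), (0, 3), (1, 2), (1, 3)]

h_high = edge_homophily(homophilous_edges, labels)
h_low = edge_homophily(heterophilous_edges, labels)
print(f"homophilous toy graph:     {h_high:.3f}")
print(f"non-homophilous toy graph: {h_low:.3f}")
```

A ratio near 1 corresponds to the homophilous case that most existing methods assume, while a low ratio corresponds to the non-homophilous graphs this paper targets.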