LG · AI · BM · Jan 5, 2024

Graph-level Protein Representation Learning by Structure Knowledge Refinement

arXiv:2401.02713v1
h-index: 19
Originality: Incremental advance
AI Analysis

This work addresses challenges in graph-level representation learning for applications like protein structure feature extraction, offering an incremental improvement over existing GCL methods.

The paper tackles two problems in Graph Contrastive Learning (GCL) for unsupervised graph-level representation learning: false negative pairs, and augmentation strategies that adapt poorly to diverse graph datasets. It proposes a Structure Knowledge Refinement (SKR) framework that uses data structure to estimate the probability that a pair is positive or negative, together with an augmentation strategy compatible with that framework. Experiments on graph-level classification tasks show SKR outperforming most state-of-the-art baselines.

This paper focuses on learning representation on the whole graph level in an unsupervised manner. Learning graph-level representation plays an important role in a variety of real-world issues such as molecule property prediction, protein structure feature extraction, and social network analysis. The mainstream method is utilizing contrastive learning to facilitate graph feature extraction, known as Graph Contrastive Learning (GCL). GCL, although effective, suffers from some complications in contrastive learning, such as the effect of false negative pairs. Moreover, augmentation strategies in GCL are weakly adaptive to diverse graph datasets. Motivated by these problems, we propose a novel framework called Structure Knowledge Refinement (SKR) which uses data structure to determine the probability of whether a pair is positive or negative. Meanwhile, we propose an augmentation strategy that naturally preserves the semantic meaning of the original data and is compatible with our SKR framework. Furthermore, we illustrate the effectiveness of our SKR framework through intuition and experiments. The experimental results on the tasks of graph-level classification demonstrate that our SKR framework is superior to most state-of-the-art baselines.
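
To make the false negative issue concrete, here is a minimal sketch of a generic InfoNCE-style contrastive loss of the kind commonly used in GCL. It is not the SKR method, which the abstract does not specify in detail. The function names, the temperature value, and the toy embeddings are all assumptions chosen for illustration.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length, non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def info_nce(view_a, view_b, temperature=0.5):
    """Mean InfoNCE loss over a batch of graph-level embeddings.

    view_a[i] and view_b[i] are embeddings of two augmentations of graph i.
    For anchor view_a[i], view_b[i] is the positive and every view_b[j]
    with j != i is treated as a negative, even when graph j is semantically
    similar to graph i (a false negative).
    """
    n = len(view_a)
    total = 0.0
    for i in range(n):
        # Similarity of anchor i to every candidate in the other view.
        logits = [cosine(view_a[i], view_b[j]) / temperature for j in range(n)]
        log_denominator = math.log(sum(math.exp(x) for x in logits))
        # Negative log-probability of picking the true positive.
        total += -(logits[i] - log_denominator)
    return total / n


# Hypothetical toy batches (assumed values, not data from the paper).
separated = [[1.0, 0.0], [0.0, 1.0]]    # two clearly different graphs
duplicates = [[1.0, 0.0], [1.0, 0.0]]   # two graphs with identical embeddings

loss_separated = info_nce(separated, separated)
loss_duplicates = info_nce(duplicates, duplicates)
```

In this toy setting the batch of identical embeddings incurs the larger loss, because each graph is pushed away from an equally valid match. A method that assigns each pair a probability of being positive, as the abstract describes for SKR, would aim to soften that penalty.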

