LGAISep 13, 2024

Molecular Graph Representation Learning via Structural Similarity Information

arXiv:2409.08580v12 citationsh-index: 7Has Code
Originality Incremental advance
AI Analysis

This addresses the need for more accurate property prediction in molecular science by capturing global structural similarities, though it appears incremental as it builds on existing GNN methods.

The paper tackles the problem of enhancing molecular graph representation learning by incorporating structural similarity information between molecules, which is often overlooked, and demonstrates that their MSSM-GNN model consistently outperforms eleven state-of-the-art baselines on molecular datasets.

Graph Neural Networks (GNNs) have been widely employed for feature representation learning in molecular graphs. Therefore, it is crucial to enhance the expressiveness of feature representation to ensure the effectiveness of GNNs. However, a significant portion of current research primarily focuses on the structural features within individual molecules, often overlooking the structural similarity between molecules, which is a crucial aspect encapsulating rich information on the relationship between molecular properties and structural characteristics. Thus, these approaches fail to capture the rich semantic information at the molecular structure level. To bridge this gap, we introduce the \textbf{Molecular Structural Similarity Motif GNN (MSSM-GNN)}, a novel molecular graph representation learning method that can capture structural similarity information among molecules from a global perspective. In particular, we propose a specially designed graph that leverages graph kernel algorithms to represent the similarity between molecules quantitatively. Subsequently, we employ GNNs to learn feature representations from molecular graphs, aiming to enhance the accuracy of property prediction by incorporating additional molecular representation information. Finally, through a series of experiments conducted on both small-scale and large-scale molecular datasets, we demonstrate that our model consistently outperforms eleven state-of-the-art baselines. The codes are available at https://github.com/yaoyao-yaoyao-cell/MSSM-GNN.

Code Implementations1 repo
Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.

Your Notes