BM · LG · Mar 13, 2023

Molecular Property Prediction by Semantic-invariant Contrastive Learning

arXiv:2303.06902v1 · 10 citations · h-index: 52
Originality: Highly original
AI Analysis

This work addresses a key bottleneck in AI-aided drug design by improving molecular representation learning for more accurate property prediction, though the contribution is incremental in that it builds on existing contrastive learning frameworks.

The paper tackles the semantic inconsistency problem in contrastive learning for molecular property prediction by proposing a semantic-invariant view generation method based on fragment pairs. It reports state-of-the-art performance on various benchmark datasets while using the fewest pre-training samples among the compared models.

Contrastive learning has been widely used as a pretext task for self-supervised pre-training of molecular representation learning models in AI-aided drug design and discovery. However, existing methods that generate molecular views for contrastive learning by noise-adding operations may face the semantic inconsistency problem, which leads to false positive pairs and consequently poor prediction performance. To address this problem, in this paper we first propose a semantic-invariant view generation method that properly breaks molecular graphs into fragment pairs. Then, we develop a Fragment-based Semantic-Invariant Contrastive Learning (FraSICL) model based on this view generation method for molecular property prediction. The FraSICL model consists of two branches that generate representations of views for contrastive learning; in addition, a multi-view fusion and an auxiliary similarity loss are introduced to make better use of the information contained in different fragment-pair views. Extensive experiments on various benchmark datasets show that, with the fewest pre-training samples, FraSICL achieves state-of-the-art performance compared with major existing counterpart models.
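
To make the idea of fragment-pair views concrete, the following is a minimal sketch and not the authors' implementation. The abstract does not say which bonds are cut, so the sketch assumes that a view is produced by cutting one bond that does not lie on a ring, which splits the molecular graph into exactly two fragments. The function names `connected_component` and `fragment_pairs`, and the representation of a molecule as a list of atom-index pairs, are hypothetical choices made for illustration; bond orders and atom types are ignored.

```python
from collections import defaultdict


def connected_component(adjacency, start, removed_bond):
    """Return the atoms reachable from `start` once `removed_bond` is cut."""
    blocked = frozenset(removed_bond)
    seen = {start}
    stack = [start]
    while stack:
        atom = stack.pop()
        for neighbor in adjacency[atom]:
            if frozenset((atom, neighbor)) == blocked:
                continue  # skip the bond we pretend to have broken
            if neighbor not in seen:
                seen.add(neighbor)
                stack.append(neighbor)
    return seen


def fragment_pairs(bonds):
    """Enumerate fragment pairs obtained by cutting one non-ring bond.

    ASSUMPTION: the cutting rule (any non-ring bond) is illustrative only;
    the paper's actual fragmentation rule is not given in the abstract.
    `bonds` is a list of (atom_index, atom_index) tuples.
    """
    adjacency = defaultdict(set)
    for a, b in bonds:
        adjacency[a].add(b)
        adjacency[b].add(a)

    pairs = []
    for a, b in bonds:
        side_a = connected_component(adjacency, a, (a, b))
        if b in side_a:
            # The bond lies on a ring: cutting it does not split the graph.
            continue
        side_b = connected_component(adjacency, b, (a, b))
        pairs.append((sorted(side_a), sorted(side_b)))
    return pairs


# Ethylbenzene-like skeleton: a six-membered ring (atoms 0-5) with a
# two-atom chain (atoms 6 and 7) attached to atom 0.
ethylbenzene_bonds = [
    (0, 1), (1, 2), (2, 3), (3, 4), (4, 5), (5, 0),  # ring bonds
    (0, 6), (6, 7),                                  # chain bonds
]
views = fragment_pairs(ethylbenzene_bonds)
```

In this sketch every fragment pair still contains all atoms of the original molecule, with no atom or bond attribute perturbed. This is one plausible reading of "semantic-invariant" in contrast to noise-adding augmentations, though the abstract does not spell out the definition.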

