LGBM
Sep 1, 2023

Geometry-aware Line Graph Transformer Pre-training for Molecular Property Prediction

arXiv:2309.00483v1
6 citations
Originality Highly original
AI Analysis

This work addresses the scarcity of labeled data for molecular property prediction in chemistry and drug discovery. It is an incremental improvement that combines existing modalities with novel pre-training tasks.

The paper tackles molecular property prediction by proposing a self-supervised learning framework that integrates 2D topological and 3D geometric information. The resulting model, Galformer, outperforms six state-of-the-art baselines on twelve benchmarks.

Molecular property prediction with deep learning has gained much attention in recent years. Owing to the scarcity of labeled molecules, there has been growing interest in self-supervised learning methods that learn generalizable molecular representations from unlabeled data. Molecules are typically modeled as 2D topological graphs, but their 3D geometry has been found to be of great importance in determining molecular functionalities. In this paper, we propose the Geometry-aware line graph transformer (Galformer) pre-training, a novel self-supervised learning framework that aims to enhance molecular representation learning with 2D and 3D modalities. Specifically, we first design a dual-modality line graph transformer backbone to encode the topological and geometric information of a molecule. The designed backbone incorporates effective structural encodings to capture graph structures from both modalities. Then we devise two complementary pre-training tasks at the inter-modality and intra-modality levels. These tasks provide properly supervised information and extract discriminative 2D and 3D knowledge from unlabeled molecules. Finally, we evaluate Galformer against six state-of-the-art baselines on twelve property prediction benchmarks via downstream fine-tuning. Experimental results show that Galformer consistently outperforms all baselines on both classification and regression tasks, demonstrating its effectiveness.
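
To make the "line graph" idea in the abstract concrete, here is a minimal sketch of the general construction: each bond becomes a node, and two bonds are connected when they share an atom. Bond lengths and bond angles then serve as simple examples of 3D geometric quantities attached to that graph. This is not Galformer's implementation; the function names (`line_graph_edges`, `bond_length`, `bond_angle`) and the toy four-atom chain are assumptions chosen purely for illustration, and the abstract does not specify which structural encodings the backbone uses.

```python
import math
from itertools import combinations


def line_graph_edges(bonds):
    """Connect two bonds (line-graph nodes) whenever they share an atom."""
    edges = []
    for (i, bond_i), (j, bond_j) in combinations(enumerate(bonds), 2):
        if set(bond_i) & set(bond_j):
            edges.append((i, j))
    return edges


def bond_length(coords, bond):
    """Euclidean distance between the two atoms of a bond."""
    a, b = bond
    return math.dist(coords[a], coords[b])


def bond_angle(coords, bond_i, bond_j):
    """Angle in degrees between two bonds that meet at a shared atom."""
    shared = (set(bond_i) & set(bond_j)).pop()
    end_i = bond_i[0] if bond_i[1] == shared else bond_i[1]
    end_j = bond_j[0] if bond_j[1] == shared else bond_j[1]
    v_i = [p - q for p, q in zip(coords[end_i], coords[shared])]
    v_j = [p - q for p, q in zip(coords[end_j], coords[shared])]
    dot = sum(p * q for p, q in zip(v_i, v_j))
    cos = dot / (math.hypot(*v_i) * math.hypot(*v_j))
    # Clamp to guard against floating-point drift outside [-1, 1].
    return math.degrees(math.acos(max(-1.0, min(1.0, cos))))


# Hypothetical toy molecule: a four-atom chain with made-up coordinates.
bonds = [(0, 1), (1, 2), (2, 3)]
coords = [(1.0, 0.0, 0.0), (0.0, 0.0, 0.0), (0.0, 1.0, 0.0), (0.0, 1.0, 1.0)]

edges = line_graph_edges(bonds)                      # 2D: topology of the line graph
lengths = [bond_length(coords, b) for b in bonds]    # 3D: one value per line-graph node
angles = {(i, j): bond_angle(coords, bonds[i], bonds[j]) for i, j in edges}  # 3D: one per edge
```

In this sketch the 2D view contributes only the connectivity in `edges`, while the 3D view adds `lengths` on nodes and `angles` on edges. A real pipeline would typically obtain bonds and conformer coordinates from a cheminformatics toolkit rather than hand-written tuples.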

