LG | AI | BM | Dec 6, 2023

Molecule Joint Auto-Encoding: Trajectory Pretraining with 2D and 3D Diffusion

arXiv:2312.03475v1 | 14 citations | h-index: 4 | NIPS
Originality: Incremental advance
AI Analysis

This paper aims to improve machine learning for drug discovery by strengthening molecule geometry representation. The contribution appears incremental, as it builds on existing diffusion and auto-encoding techniques.

The paper tackles the bottleneck of molecule geometrical representation in drug discovery by proposing MoleculeJAE, a pretraining method that learns 2D and 3D information via diffusion. It reports state-of-the-art performance on 15 out of 20 tasks when compared against 12 baselines.

Recently, artificial intelligence for drug discovery has attracted increasing interest in both the machine learning and chemistry domains. The fundamental building block for drug discovery is molecule geometry, and thus the molecule's geometrical representation is the main bottleneck in making better use of machine learning techniques for drug discovery. In this work, we propose a pretraining method for molecule joint auto-encoding (MoleculeJAE). MoleculeJAE can learn both the 2D bond (topology) and 3D conformation (geometry) information. A diffusion process model is applied to mimic the augmented trajectories of these two modalities, from which MoleculeJAE learns the inherent chemical structure in a self-supervised manner. Thus, the pretrained geometrical representation in MoleculeJAE is expected to benefit downstream geometry-related tasks. Empirically, MoleculeJAE proves its effectiveness by reaching state-of-the-art performance on 15 out of 20 tasks in comparison with 12 competitive baselines.
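To make the idea of a joint 2D and 3D "augmented trajectory" more concrete, the following is a minimal sketch of a forward noising process applied to both modalities at once. It is not the authors' implementation: the function name `forward_trajectory`, the fixed noise level `beta`, the variance-preserving Gaussian step, and the treatment of the bond matrix as a continuous array are all assumptions made for illustration.

```python
import numpy as np


def forward_trajectory(coords, adjacency, num_steps=5, beta=0.1, seed=0):
    """Return a list of progressively noised (coords, adjacency) pairs.

    Hypothetical illustration only. Each step applies
        x_t = sqrt(1 - beta) * x_{t-1} + sqrt(beta) * eps,   eps ~ N(0, I)
    to the 3D coordinates and, with symmetrised noise, to the 2D bond matrix.
    """
    rng = np.random.default_rng(seed)
    x = np.asarray(coords, dtype=float)     # (num_atoms, 3) conformation
    a = np.asarray(adjacency, dtype=float)  # (num_atoms, num_atoms) bonds
    trajectory = [(x.copy(), a.copy())]     # step 0 is the clean molecule
    for _ in range(num_steps):
        # Noise the 3D geometry.
        x = np.sqrt(1.0 - beta) * x + np.sqrt(beta) * rng.standard_normal(x.shape)
        # Noise the 2D topology; symmetrise the noise so that the relaxed
        # bond matrix stays symmetric along the whole trajectory.
        noise = rng.standard_normal(a.shape)
        noise = (noise + noise.T) / np.sqrt(2.0)
        a = np.sqrt(1.0 - beta) * a + np.sqrt(beta) * noise
        trajectory.append((x.copy(), a.copy()))
    return trajectory


# Toy three-atom example (a water-like geometry, values are illustrative).
toy_coords = np.array([[0.0, 0.0, 0.0], [0.96, 0.0, 0.0], [-0.24, 0.93, 0.0]])
toy_bonds = np.array([[0, 1, 1], [1, 0, 0], [1, 0, 0]])
toy_traj = forward_trajectory(toy_coords, toy_bonds, num_steps=4)
```

In a pretraining setup of the kind the abstract describes, a model would be trained to recover the clean structure (or the noise) from the later, corrupted steps of such a trajectory; that training loop is omitted here.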

