LGBM
Jun 15, 2024

Lift Your Molecules: Molecular Graph Generation in Latent Euclidean Space

arXiv:2406.10513v1
6 citations
Originality: Highly original
AI Analysis

This work addresses molecular graph generation for drug discovery, a known bottleneck, with a novel method that reuses existing 3D molecular generative models.

The authors map molecular graphs to Euclidean point clouds and generate those point clouds with an equivariant diffusion model. The resulting model achieves state-of-the-art distribution learning, improving on the best non-autoregressive methods by more than 30% on ZINC250K and 16% on GuacaMol.

We introduce a new framework for molecular graph generation with 3D molecular generative models. Our Synthetic Coordinate Embedding (SyCo) framework maps molecular graphs to Euclidean point clouds via synthetic conformer coordinates and learns the inverse map using an E(n)-Equivariant Graph Neural Network (EGNN). The induced point cloud-structured latent space is well suited to existing 3D molecular generative models. This approach simplifies the graph generation problem, without relying on molecular fragments or autoregressive decoding, into a point cloud generation problem followed by node and edge classification tasks. Further, we propose a novel similarity-constrained optimization scheme for 3D diffusion models based on inpainting and guidance. As a concrete implementation of our framework, we develop EDM-SyCo based on the E(3) Equivariant Diffusion Model (EDM). EDM-SyCo achieves state-of-the-art performance in distribution learning of molecular graphs, outperforming the best non-autoregressive methods by more than 30% on ZINC250K and 16% on the large-scale GuacaMol dataset while improving conditional generation by up to 3.9 times.
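
The reduction described in the abstract (graph to point cloud, then node and edge classification back to a graph) can be illustrated with a toy round trip. This is only a minimal sketch under loud assumptions, not the authors' implementation: the coordinates are hand-written stand-ins for synthetic conformer coordinates, the bounded noise stands in for an imperfect generative model, and a distance cutoff plus an argmax replace the learned EGNN decoder. All names (`lift`, `perturb`, `decode`, `ATOM_TYPES`, `bond_cutoff`) are hypothetical, and bond orders are ignored.

```python
import math
import random

ATOM_TYPES = ["C", "N", "O"]  # hypothetical atom vocabulary for this toy example


def lift(atoms, coords):
    """Encode a molecule as a point cloud: one (xyz, one-hot type) pair per atom."""
    cloud = []
    for symbol, xyz in zip(atoms, coords):
        one_hot = [1.0 if symbol == t else 0.0 for t in ATOM_TYPES]
        cloud.append((tuple(xyz), one_hot))
    return cloud


def perturb(cloud, scale, rng):
    """Add bounded uniform noise, mimicking an imperfectly generated point cloud."""
    noisy = []
    for xyz, feats in cloud:
        new_xyz = tuple(c + rng.uniform(-scale, scale) for c in xyz)
        new_feats = [f + rng.uniform(-scale, scale) for f in feats]
        noisy.append((new_xyz, new_feats))
    return noisy


def classify_node(feats):
    """Node classification stand-in: pick the atom type with the largest feature."""
    best = max(range(len(ATOM_TYPES)), key=lambda k: feats[k])
    return ATOM_TYPES[best]


def decode(cloud, bond_cutoff=1.9):
    """Recover a graph: classify every node, then every pair as bond or no bond.

    The distance cutoff is an assumption standing in for a learned edge classifier.
    """
    atoms = [classify_node(feats) for _, feats in cloud]
    bonds = set()
    for i in range(len(cloud)):
        for j in range(i + 1, len(cloud)):
            if math.dist(cloud[i][0], cloud[j][0]) < bond_cutoff:
                bonds.add((i, j))
    return atoms, bonds


# Heavy atoms of ethanol (C-C-O) with hand-written, illustrative coordinates.
atoms = ["C", "C", "O"]
bonds = {(0, 1), (1, 2)}
coords = [(0.0, 0.0, 0.0), (1.5, 0.0, 0.0), (2.2, 1.2, 0.0)]

cloud = perturb(lift(atoms, coords), scale=0.05, rng=random.Random(0))
decoded_atoms, decoded_bonds = decode(cloud)
print(decoded_atoms, sorted(decoded_bonds))
```

Because the noise is bounded well below the gap between bonded and non-bonded distances in this toy molecule, decoding recovers the original graph. In the framework described above, that robustness would instead have to come from the learned inverse map.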

