CHEM-PH · LG · Nov 15, 2025

Chemistry-Enhanced Diffusion-Based Framework for Small-to-Large Molecular Conformation Generation

arXiv:2511.12182v1
h-index: 13
Originality: Highly original
AI Analysis

StoL addresses the high computational cost of predicting large-molecule structures, offering researchers in chemistry and drug discovery a scalable and transferable alternative.

The paper tackles the challenge of generating 3D conformations for large molecules by introducing StoL, a diffusion-based framework that assembles molecules from small fragments without large-molecule training data, achieving rapid generation and broad configurational coverage as confirmed against DFT calculations.

Obtaining 3D conformations of realistic polyatomic molecules at the quantum chemistry level remains challenging, and although recent machine learning advances offer promise, predicting large-molecule structures still requires substantial computational effort. Here, we introduce StoL, a diffusion model-based framework that enables rapid and knowledge-free generation of large molecular structures from small-molecule data. Remarkably, StoL assembles molecules in a LEGO-style fashion from scratch, without seeing the target molecules or any structures of comparable size during training. Given a SMILES input, it decomposes the molecule into chemically valid fragments, generates their 3D structures with a diffusion model trained on small molecules, and assembles them into diverse conformations. This fragment-based strategy eliminates the need for large-molecule training data while maintaining high scalability and transferability. By embedding chemical principles into key steps, StoL ensures faster convergence, chemically rational structures, and broad configurational coverage, as confirmed against DFT calculations.
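The decomposition step described above can be illustrated with a minimal sketch. The abstract does not state StoL's actual fragmentation rules, so the rule used here (cut every acyclic bond whose two atoms each have more than one neighbour) is only an assumption chosen for illustration, and the names `find_bridges` and `fragment` are hypothetical. The molecule is given as a plain heavy-atom bond list rather than parsed from SMILES, to keep the example dependency-free.

```python
from collections import defaultdict


def find_bridges(n_atoms, bonds):
    """Return the bonds that lie on no ring (graph bridges), via DFS low-link."""
    adj = defaultdict(list)
    for idx, (a, b) in enumerate(bonds):
        adj[a].append((b, idx))
        adj[b].append((a, idx))

    disc = [-1] * n_atoms   # discovery time of each atom
    low = [0] * n_atoms     # earliest discovery time reachable from the subtree
    bridges = []
    timer = 0

    def dfs(u, parent_edge):
        nonlocal timer
        disc[u] = low[u] = timer
        timer += 1
        for v, idx in adj[u]:
            if idx == parent_edge:
                continue
            if disc[v] == -1:
                dfs(v, idx)
                low[u] = min(low[u], low[v])
                if low[v] > disc[u]:      # no back edge crosses this bond
                    bridges.append(bonds[idx])
            else:
                low[u] = min(low[u], disc[v])

    for start in range(n_atoms):
        if disc[start] == -1:
            dfs(start, -1)
    return bridges


def fragment(n_atoms, bonds):
    """Split a molecular graph at acyclic bonds between non-terminal atoms.

    Illustrative rule only (an assumption, not the rule used by StoL).
    Returns (fragments, cut_bonds), both sorted.
    """
    degree = defaultdict(int)
    for a, b in bonds:
        degree[a] += 1
        degree[b] += 1

    cuts = {(a, b) for a, b in find_bridges(n_atoms, bonds)
            if degree[a] > 1 and degree[b] > 1}

    # Connected components of the graph once the cut bonds are removed.
    adj = defaultdict(list)
    for a, b in bonds:
        if (a, b) not in cuts:
            adj[a].append(b)
            adj[b].append(a)

    seen, fragments = set(), []
    for start in range(n_atoms):
        if start in seen:
            continue
        stack, comp = [start], []
        seen.add(start)
        while stack:
            u = stack.pop()
            comp.append(u)
            for v in adj[u]:
                if v not in seen:
                    seen.add(v)
                    stack.append(v)
        fragments.append(sorted(comp))
    return sorted(fragments), sorted(cuts)


# Biphenyl heavy-atom skeleton: two six-membered rings joined by one bond.
ring_a = [(i, (i + 1) % 6) for i in range(6)]
ring_b = [(6 + i, 6 + (i + 1) % 6) for i in range(6)]
biphenyl_bonds = ring_a + ring_b + [(0, 6)]

biphenyl_fragments, biphenyl_cuts = fragment(12, biphenyl_bonds)
```

In this toy case the only bond cut is the one joining the two rings, leaving two six-atom fragments. In a pipeline like the one the abstract describes, each fragment would then be passed to the small-molecule diffusion model, and the recorded cut bonds would tell the assembly step where to reconnect the generated 3D pieces.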
