EvoEGF-Mol: Evolving Exponential Geodesic Flow for Structure-based Drug Design
This work addresses the generation of bioactive ligands with high geometric precision for drug discovery, proposing a new method for a known bottleneck.
The paper tackles the mismatch between conventional probability paths and the underlying statistical manifolds in structure-based drug design. It models molecules as composite exponential-family distributions and defines generative flows along exponential geodesics, reaching a 93.4% PoseBusters passing rate on CrossDock and outperforming baselines on MolGenBench tasks.
Structure-Based Drug Design (SBDD) aims to discover bioactive ligands. Conventional approaches construct probability paths separately in Euclidean and probabilistic spaces for continuous atomic coordinates and discrete chemical categories, leading to a mismatch with the underlying statistical manifolds. We address this issue from an information-geometric perspective by modeling molecules as composite exponential-family distributions and defining generative flows along exponential geodesics under the Fisher-Rao metric. To avoid the instantaneous trajectory collapse induced by geodesics directly targeting Dirac distributions, we propose Evolving Exponential Geodesic Flow for SBDD (EvoEGF-Mol), which replaces static Dirac targets with dynamically concentrating distributions, ensuring stable training via a progressive-parameter-refinement architecture. Our model approaches a reference-level PoseBusters passing rate (93.4%) on CrossDock, demonstrating remarkable geometric precision and interaction fidelity, while outperforming baselines on real-world MolGenBench tasks by recovering bioactive scaffolds and generating candidates that meet established MedChem filters.
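To make the collapse issue mentioned above concrete, the following is a minimal sketch, not the authors' implementation. It assumes the standard information-geometric construction in which an exponential geodesic between two members of an exponential family is a straight line in natural parameters. The function names `e_geodesic_categorical` and `e_geodesic_gaussian` are hypothetical, and the abstract does not specify the paper's actual parameterization, network, or schedule.

```python
import math

def e_geodesic_categorical(p0, p1, t):
    """Point at time t on the e-geodesic between two categorical distributions.

    Assumes all probabilities are strictly positive. Log-probabilities
    (natural parameters up to a constant) are interpolated linearly and
    renormalized, i.e. p_t is proportional to p0^(1-t) * p1^t.
    """
    logits = [(1 - t) * math.log(a) + t * math.log(b) for a, b in zip(p0, p1)]
    m = max(logits)  # subtract the max for numerical stability
    weights = [math.exp(l - m) for l in logits]
    z = sum(weights)
    return [w / z for w in weights]

def e_geodesic_gaussian(mu0, var0, mu1, var1, t):
    """Point at time t on the e-geodesic between two 1-D Gaussians.

    Natural parameters (mu/var, -1/(2*var)) are interpolated linearly,
    which is equivalent to interpolating mu/var and the precision 1/var.
    Returns the (mean, variance) of the interpolated Gaussian.
    """
    precision = (1 - t) / var0 + t / var1
    eta1 = (1 - t) * mu0 / var0 + t * mu1 / var1
    return eta1 / precision, 1.0 / precision

# A near-Dirac target (tiny variance) dominates the path almost immediately.
mean_dirac, _ = e_geodesic_gaussian(0.0, 1.0, 2.0, 1e-12, 0.01)
# A broad target leaves the path close to its starting point at the same t.
mean_broad, _ = e_geodesic_gaussian(0.0, 1.0, 2.0, 0.5, 0.01)
```

In this toy setting, with a target variance of 1e-12 the mean at t = 0.01 is already essentially the target mean of 2, whereas with a target variance of 0.5 it has barely left 0. This is one way to read the motivation for replacing a static Dirac target with a distribution that concentrates gradually.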