HybridLinker: Topology-Guided Posterior Sampling for Enhanced Diversity and Validity in 3D Molecular Linker Generation
This addresses a critical problem in drug discovery for lead optimization and PROTAC design by enhancing molecular linker generation, though it is incremental as it builds on existing methods without new training.
The paper tackled the trade-off between diversity and validity in 3D molecular linker generation by proposing HybridLinker, which uses a diffusion posterior sampling method to combine point cloud-free and point cloud-aware models, resulting in significant improvements in both validity and diversity over baselines.
Linker generation is critical in drug discovery applications such as lead optimization and PROTAC design, where molecular fragments are assembled into diverse drug candidates via molecular linker. Existing methods fall into point cloud-free and point cloud-aware categories based on their use of fragments' 3D poses alongside their topologies in sampling the linker's topology. Point cloud-free models prioritize sample diversity but suffer from lower validity due to overlooking fragments' spatial constraints, while point cloud-aware models ensure higher validity but restrict diversity by enforcing strict spatial constraints. To overcome these trade-offs without additional training, we propose HybridLinker, a framework that enhances point cloud-aware inference by providing diverse bonding topologies from a pretrained point cloud-free model as guidance. At its core, we propose LinkerDPS, the first diffusion posterior sampling (DPS) method operating across point cloud-free and point cloud-aware spaces, bridging molecular topology with 3D point clouds via an energy-inspired function. By transferring the diverse sampling distribution of point cloud-free models into the point cloud-aware distribution, HybridLinker significantly surpasses baselines, improving both validity and diversity in foundational molecular design and applied drug optimization tasks, establishing a new DPS framework in the molecular domains beyond imaging.