LGOct 5, 2023
TacoGFN: Target-conditioned GFlowNet for Structure-based Drug DesignTony Shen, Seonghwan Seo, Grayson Lee et al.
Searching the vast chemical space for drug-like molecules that bind with a protein pocket is a challenging task in drug discovery. Recently, structure-based generative models have been introduced which promise to be more efficient by learning to generate molecules for any given protein structure. However, since they learn the distribution of a limited protein-ligand complex dataset, structure-based methods do not yet outperform optimization-based methods that generate binding molecules for just one pocket. To overcome limitations on data while leveraging learning across protein targets, we choose to model the reward distribution conditioned on pocket structure, instead of the training data distribution. We design TacoGFN, a novel GFlowNet-based approach for structure-based drug design, which can generate molecules conditioned on any protein pocket structure with probabilities proportional to its affinity and property rewards. In the generative setting for CrossDocked2020 benchmark, TacoGFN attains a state-of-the-art success rate of $56.0\%$ and $-8.44$ kcal/mol in median Vina Dock score while improving the generation time by multiple orders of magnitude. Fine-tuning TacoGFN further improves the median Vina Dock score to $-10.93$ kcal/mol and the success rate to $88.8\%$, outperforming all optimization-based methods.
69.1ROMar 14
Implicit Maximum Likelihood Estimation for Real-time Generative Model Predictive ControlGrayson Lee, Minh Bui, Shuzi Zhou et al.
Diffusion-based models have recently shown strong performance in trajectory planning, as they are capable of capturing diverse, multimodal distributions of complex behaviors. A key limitation of these models is their slow inference speed, which results from the iterative denoising process. This makes them less suitable for real-time applications such as closed-loop model predictive control (MPC), where plans must be generated quickly and adapted continuously to a changing environment. In this paper, we investigate Implicit Maximum Likelihood Estimation (IMLE) as an alternative generative modeling approach for planning. IMLE offers strong mode coverage while enabling inference that is two orders of magnitude faster, making it particularly well suited for real-time MPC tasks. Our results demonstrate that IMLE achieves competitive performance on standard offline reinforcement learning benchmarks compared to the standard diffusion-based planner, while substantially improving planning speed in both open-loop and closed-loop settings. We further validate IMLE in a closed-loop human navigation scenario, operating in real-time, demonstrating how it enables rapid and adaptive plan generation in dynamic environments.
LGJun 16, 2024
Geometric-informed GFlowNets for Structure-Based Drug DesignGrayson Lee, Tony Shen, Martin Ester
The rise of cost involved with drug discovery and current speed of which they are discover, underscore the need for more efficient structure-based drug design (SBDD) methods. We employ Generative Flow Networks (GFlowNets), to effectively explore the vast combinatorial space of drug-like molecules, which traditional virtual screening methods fail to cover. We introduce a novel modification to the GFlowNet framework by incorporating trigonometrically consistent embeddings, previously utilized in tasks involving protein conformation and protein-ligand interactions, to enhance the model's ability to generate molecules tailored to specific protein pockets. We have modified the existing protein conditioning used by GFlowNets, blending geometric information from both protein and ligand embeddings to achieve more geometrically consistent embeddings. Experiments conducted using CrossDocked2020 demonstrated an improvement in the binding affinity between generated molecules and protein pockets for both single and multi-objective tasks, compared to previous work. Additionally, we propose future work aimed at further increasing the geometric information captured in protein-ligand interactions.