TacoGFN: Target-conditioned GFlowNet for Structure-based Drug Design
TacoGFN addresses the problem of efficient structure-based drug design for pharmaceutical researchers, offering a method that, once fine-tuned, outperforms existing optimization-based approaches.
The paper tackles the challenge of generating drug-like molecules that bind to protein pockets by introducing TacoGFN, a target-conditioned GFlowNet approach. In the generative setting on the CrossDocked2020 benchmark, it achieves a state-of-the-art success rate of 56.0% and a median Vina Dock score of -8.44 kcal/mol while reducing generation time by multiple orders of magnitude; fine-tuning improves these to 88.8% and -10.93 kcal/mol.
Searching the vast chemical space for drug-like molecules that bind to a protein pocket is a challenging task in drug discovery. Recently, structure-based generative models have been introduced that promise to be more efficient by learning to generate molecules for any given protein structure. However, because they learn the distribution of a limited protein-ligand complex dataset, structure-based methods do not yet outperform optimization-based methods that generate binding molecules for just one pocket. To overcome data limitations while leveraging learning across protein targets, we choose to model the reward distribution conditioned on pocket structure, instead of the training data distribution. We design TacoGFN, a novel GFlowNet-based approach for structure-based drug design, which can generate molecules conditioned on any protein pocket structure with probabilities proportional to their affinity and property rewards. In the generative setting on the CrossDocked2020 benchmark, TacoGFN attains a state-of-the-art success rate of $56.0\%$ and a median Vina Dock score of $-8.44$ kcal/mol while reducing generation time by multiple orders of magnitude. Fine-tuning TacoGFN further improves the median Vina Dock score to $-10.93$ kcal/mol and the success rate to $88.8\%$, outperforming all optimization-based methods.
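The idea of generating molecules "with probabilities proportional to" a pocket-conditioned reward can be illustrated with a toy sketch. This is not the TacoGFN implementation: a real GFlowNet learns a sequential construction policy over a vast molecule space rather than enumerating candidates. The molecule names, pocket names, reward values, and both function names below are hypothetical placeholders chosen only to show the target distribution.

```python
import random


def reward_proportional_probs(rewards):
    """Normalize non-negative rewards into sampling probabilities.

    `rewards` maps a candidate molecule to its reward for ONE pocket.
    The result is the distribution p(x | pocket) = R(x, pocket) / Z.
    """
    total = sum(rewards.values())
    if total <= 0:
        raise ValueError("at least one reward must be positive")
    return {mol: r / total for mol, r in rewards.items()}


def sample_molecules(rewards, n, seed=0):
    """Draw n candidates with probability proportional to reward."""
    rng = random.Random(seed)
    probs = reward_proportional_probs(rewards)
    mols = list(probs)
    return rng.choices(mols, weights=[probs[m] for m in mols], k=n)


# Hypothetical rewards: the same three candidates score differently
# depending on which pocket they are conditioned on.
toy_rewards = {
    "pocket_A": {"mol_1": 1.0, "mol_2": 3.0, "mol_3": 6.0},
    "pocket_B": {"mol_1": 5.0, "mol_2": 4.0, "mol_3": 1.0},
}

probs_a = reward_proportional_probs(toy_rewards["pocket_A"])
probs_b = reward_proportional_probs(toy_rewards["pocket_B"])
samples_a = sample_molecules(toy_rewards["pocket_A"], n=5)
```

In this toy setting, the same candidate set yields a different sampling distribution for each pocket, which is the sense in which the model is target-conditioned. Unlike pure optimization, lower-reward candidates still keep nonzero probability, so sampling remains diverse.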