QM AI Sep 15, 2020

Scaffold-constrained molecular generation

arXiv:2009.07778v3, 57 citations
Originality: Incremental advance
AI Analysis

This work addresses a key bottleneck in lead optimization for drug discovery by enabling scaffold-constrained molecular generation. The advance is incremental: it builds on existing SMILES-based RNN methods.

The paper tackles the problem of generating molecules under scaffold constraints in drug discovery. It introduces a modified sampling procedure for a SMILES-based RNN and uses it to design novel molecules on tasks including the DRD2 target and the MMP-12 series, achieving practical generation rates.

One of the major applications of generative models for drug discovery targets the lead-optimization phase. During the optimization of a lead series, it is common to have scaffold constraints imposed on the structure of the molecules designed. Without enforcing such constraints, the probability of generating molecules with the required scaffold is extremely low, which hinders the practicality of generative models for de novo drug design. To tackle this issue, we introduce a new algorithm to perform scaffold-constrained in silico molecular design. We build on the well-known SMILES-based Recurrent Neural Network (RNN) generative model, with a modified sampling procedure to achieve scaffold-constrained generation. We directly benefit from the associated reinforcement learning methods, which allow us to design molecules optimized for different properties while exploring only the relevant chemical space. We showcase the method's ability to perform scaffold-constrained generation on various tasks: designing novel molecules around scaffolds extracted from SureChEMBL chemical series, generating novel active molecules on the Dopamine Receptor D2 (DRD2) target, and, finally, designing predicted actives on the MMP-12 series, an industrial lead-optimization project.
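
The idea of a modified sampling procedure can be illustrated with a toy sketch. It is not the paper's implementation. It assumes a scaffold written as a SMILES-like string in which `*` marks an open position, single-character tokens, and a uniform random `toy_policy` standing in for a trained RNN. The scaffold string, the vocabulary and all function names are hypothetical. The sketch only shows the control flow: fixed scaffold tokens are copied rather than sampled, and at each open position tokens are sampled while branch depth is tracked, so the decoration closes before the scaffold resumes. Ring closures, multi-character tokens and chemical validity are not handled.

```python
import random
import re

ATOMS = ["C", "N", "O"]  # toy vocabulary, an assumption for illustration


def toy_policy(context, decoration, rng):
    """Hypothetical stand-in for one sampling step of a trained RNN.

    A real model would condition on `context` (all tokens so far); this toy
    ignores it and draws uniformly. Branch tokens are only offered right
    after an atom, so every decoration starts with an atom.
    """
    choices = list(ATOMS)
    if decoration and decoration[-1] in ATOMS:
        choices += ["(", ")"]
    return rng.choice(choices)


def sample_decoration(context, policy, rng, max_tokens=12):
    """Sample tokens for one open position until the decoration closes."""
    decoration, depth = [], 0
    while len(decoration) < max_tokens:
        tok = policy(context + "".join(decoration), decoration, rng)
        if tok == ")":
            if depth == 0:
                break  # no open branch of our own: the decoration is done
            depth -= 1
        elif tok == "(":
            depth += 1
        decoration.append(tok)
    if decoration[-1] == "(":  # budget ran out on a just-opened branch
        decoration.pop()
        depth -= 1
    return "".join(decoration) + ")" * depth  # close any open branches


def sample_with_scaffold(scaffold, policy, rng):
    """Copy fixed scaffold tokens; sample only at '*' positions."""
    out = []
    for ch in scaffold:
        if ch == "*":
            out.append(sample_decoration("".join(out), policy, rng))
        else:
            out.append(ch)
    return "".join(out)


def keeps_scaffold(smiles, scaffold):
    """String-level check (not a substructure match) that the fixed parts survive."""
    pattern = ".+".join(re.escape(part) for part in scaffold.split("*"))
    return re.fullmatch(pattern, smiles) is not None


rng = random.Random(0)
scaffold = "c1ccc(*)cc1C(=O)N*"  # hypothetical scaffold with two open positions
samples = [sample_with_scaffold(scaffold, toy_policy, rng) for _ in range(5)]
for s in samples:
    print(s, keeps_scaffold(s, scaffold))
```

Because the scaffold tokens are never sampled, every output contains them by construction. In this toy setting that is the contrast with unconstrained sampling, where the abstract notes the required scaffold appears only with very low probability.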

