LG AI MN QM ML
May 31, 2019

MolecularRNN: Generating realistic molecular graphs with optimized properties

arXiv:1905.13372v1
183 citations
Originality: Highly original
AI Analysis

This paper addresses the need for de-novo molecular design in drug discovery. It reports a clear gain on a specific task, property optimization, with competitive benchmark results.

The authors tackle the problem of generating new molecules with specific properties for drug discovery. They develop MolecularRNN, a graph recurrent generative model that is first pretrained and then tuned with policy gradient. The tuned model significantly shifts the distributions of properties such as lipophilicity toward the desired ranges, and it achieves 100% validity with rejection sampling.

Designing new molecules with a set of predefined properties is a core problem in modern drug discovery and development. There is a growing need for de-novo design methods that address this problem. We present MolecularRNN, a graph recurrent generative model for molecular structures. Our model generates diverse, realistic molecular graphs after likelihood pretraining on a large database of molecules. We analyze our pretrained models on large-scale generated datasets of 1 million samples. Further, the model is tuned with a policy gradient algorithm, given a critic that estimates the reward for the property of interest. We show a significant distribution shift to the desired range for lipophilicity, drug-likeness, and melting point, outperforming state-of-the-art works. With the use of rejection sampling based on valency constraints, our model yields 100% validity. Moreover, we show that invalid molecules provide a rich signal to the model through the use of a structure penalty in our reinforcement learning pipeline.
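The abstract does not spell out how rejection sampling on valency constraints works, so the following is only a minimal sketch of the general idea, not the authors' implementation. It assumes a toy generator that adds atoms one at a time and proposes a bond order to each earlier atom; a proposal is resampled whenever it would push either atom past its maximum valence. The names `MAX_VALENCE`, `sample_bond_order`, `generate_molecule`, and `is_valid` are hypothetical, and a random draw stands in for the learned model's edge distribution.

```python
import random

# Assumed maximum valences for a few common atom types.
MAX_VALENCE = {"C": 4, "N": 3, "O": 2, "F": 1}


def sample_bond_order(rng, weights=(0.5, 0.3, 0.15, 0.05)):
    """Stand-in for a model's edge distribution (hypothetical).

    Returns 0 (no bond), 1 (single), 2 (double), or 3 (triple).
    """
    return rng.choices([0, 1, 2, 3], weights=weights, k=1)[0]


def generate_molecule(num_atoms, rng, max_tries=50):
    """Build a molecular graph atom by atom with valency-based rejection sampling."""
    atoms = []   # atom symbols, indexed by position
    bonds = {}   # (i, j) with i < j -> bond order
    used = []    # valence already consumed by each atom
    for j in range(num_atoms):
        atoms.append(rng.choice(list(MAX_VALENCE)))
        used.append(0)
        for i in range(j):
            # Resample the bond order until it respects both atoms' valence.
            for _ in range(max_tries):
                order = sample_bond_order(rng)
                fits_i = used[i] + order <= MAX_VALENCE[atoms[i]]
                fits_j = used[j] + order <= MAX_VALENCE[atoms[j]]
                if fits_i and fits_j:
                    break
            else:
                order = 0  # "no bond" never violates a valence limit
            if order:
                bonds[(i, j)] = order
                used[i] += order
                used[j] += order
    return atoms, bonds


def is_valid(atoms, bonds):
    """Check that no atom exceeds its maximum valence.

    Unused valence is assumed to be filled by implicit hydrogens.
    """
    used = [0] * len(atoms)
    for (i, j), order in bonds.items():
        used[i] += order
        used[j] += order
    return all(u <= MAX_VALENCE[a] for u, a in zip(used, atoms))


if __name__ == "__main__":
    rng = random.Random(0)
    samples = [generate_molecule(8, rng) for _ in range(1000)]
    validity = sum(is_valid(a, b) for a, b in samples) / len(samples)
    print(f"valence validity: {validity:.0%}")
```

Because every accepted bond is checked against the remaining valence of both atoms, each generated graph satisfies the valence check by construction. The sketch does not enforce connectivity or any other chemical constraint, and it says nothing about how the paper's structure penalty is computed.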

Code Implementations: 2 repos
