LG ML | Nov 23, 2019

CORE: Automatic Molecule Optimization Using Copy & Refine Strategy

arXiv:1912.05910v1 | 76 citations
Originality: Incremental advance
AI Analysis

This paper addresses molecule optimization for drug discovery, particularly for substructures that are infrequent in the training data. The advance is incremental because it builds on existing methods.

The paper tackles inaccurate iterative prediction in molecule optimization by proposing the CORE strategy, which decides at each step whether to copy a substructure from the input or generate a new one. It reports up to 21% relative improvement in success rate over baselines on the test subset with infrequent substructures.

Molecule optimization aims to generate a molecule $Y$ with more desirable properties based on an input molecule $X$. The state-of-the-art approaches partition the molecules into a large set of substructures $S$ and grow the new molecule structure by iteratively predicting which substructure from $S$ to add. However, since the set of available substructures $S$ is large, such an iterative prediction task is often inaccurate, especially for substructures that are infrequent in the training data. To address this challenge, we propose a new generating strategy called "Copy & Refine" (CORE), where at each step the generator first decides whether to copy an existing substructure from input $X$ or to generate a new substructure, and the most promising substructure is then added to the new molecule. Combined with scaffolding tree generation and adversarial training, CORE can significantly improve several recent molecule optimization methods on various measures including drug likeness (QED), dopamine receptor (DRD2) and penalized LogP. We tested CORE and baselines using the ZINC database, and CORE obtained up to 11% and 21% relative improvement over the baselines in success rate on the complete test set and on the subset with infrequent substructures, respectively.
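The copy-or-generate decision described in the abstract can be illustrated with a small sketch. This is not the authors' implementation: it assumes a pointer-generator-style mixture in which a scalar gate blends a distribution over the substructure vocabulary $S$ with a distribution over the substructures present in the input $X$. The function names, the `copy_gate` parameter, and the example substructure labels are all hypothetical.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of raw scores."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def copy_or_generate_step(vocab, gen_scores, input_substructures,
                          copy_scores, copy_gate):
    """One decoding step of a hypothetical copy-or-generate mixture.

    vocab:               substructure vocabulary S (list of labels)
    gen_scores:          raw generator scores, one per entry of vocab
    input_substructures: substructures found in the input molecule X
    copy_scores:         raw copy scores, one per input substructure
    copy_gate:           assumed probability in [0, 1] of copying

    Returns (best_substructure, mixed_distribution).
    """
    if not 0.0 <= copy_gate <= 1.0:
        raise ValueError("copy_gate must lie in [0, 1]")

    gen_probs = softmax(gen_scores)
    if not input_substructures:
        # Nothing to copy from: fall back to pure generation.
        mixed = dict(zip(vocab, gen_probs))
    else:
        # Generation branch, down-weighted by (1 - copy_gate).
        mixed = {s: (1.0 - copy_gate) * p for s, p in zip(vocab, gen_probs)}
        # Copy branch: mass goes only to substructures present in X.
        for s, p in zip(input_substructures, softmax(copy_scores)):
            mixed[s] = mixed.get(s, 0.0) + copy_gate * p

    # Add the most promising substructure under the mixed distribution.
    best = max(mixed, key=mixed.get)
    return best, mixed


if __name__ == "__main__":
    vocab = ["benzene", "pyridine", "rare_ring"]  # illustrative labels only
    gen_scores = [2.0, 1.0, -3.0]  # generator rarely predicts rare_ring
    best, _ = copy_or_generate_step(vocab, gen_scores, ["rare_ring"], [0.0], 0.9)
    print(best)
```

In this toy setup the generator alone would almost never pick the infrequent substructure, while a high copy gate lets it be taken directly from the input, which is the intuition the abstract gives for improved accuracy on infrequent substructures. How CORE actually parameterizes and trains this decision is not specified in the text above.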

Code Implementations: 1 repo