Molecular Attributes Transfer from Non-Parallel Data
This work addresses the problem of optimizing chemical molecules for drug development without requiring predefined attribute functions or parallel data, though its contribution is incremental in that it applies style transfer to a new domain.
The paper tackles molecular optimization by formulating it as a style transfer problem, using a novel generative model that learns from non-parallel data and outperforms state-of-the-art methods in tasks like toxicity modification and synthesizability improvement.
Optimizing chemical molecules for desired properties lies at the core of drug development. Despite the initial successes of deep generative models and reinforcement learning methods, these approaches are mostly limited by the requirement of predefined attribute functions or parallel data with manually pre-compiled pairs of original and optimized molecules. In this paper, for the first time, we formulate molecular optimization as a style transfer problem and present a novel generative model that can automatically learn internal differences between two groups of non-parallel data through adversarial training strategies. Our model further enables both preservation of molecular content and optimization of molecular properties by combining auxiliary guided-variational autoencoders and generative flow techniques. Experiments on two molecular optimization tasks, toxicity modification and synthesizability improvement, demonstrate that our model significantly outperforms several state-of-the-art methods.
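To make the distinction between parallel and non-parallel data concrete, the following minimal sketch shows how a non-parallel dataset could be laid out. It is an illustration only and not the paper's data pipeline. The record format, the function name `split_non_parallel`, and the attribute labels are all assumptions; the labels are arbitrary placeholders, not real toxicity or synthesizability measurements.

```python
from typing import List, Tuple

# Toy records: (SMILES string, has_desired_attribute).
# The boolean labels are arbitrary placeholders chosen for illustration.
Record = Tuple[str, bool]

records: List[Record] = [
    ("CCO", True),
    ("c1ccccc1", False),
    ("CC(=O)O", True),
    ("Nc1ccccc1", False),
    ("CCN", True),
]


def split_non_parallel(data: List[Record]) -> Tuple[List[str], List[str]]:
    """Split labeled molecules into two unpaired groups.

    Parallel data would be a list of (original, optimized) pairs.
    Non-parallel data is just two sets that differ in the attribute,
    with no correspondence between individual molecules.
    """
    source = [smiles for smiles, has_attr in data if not has_attr]
    target = [smiles for smiles, has_attr in data if has_attr]
    return source, target


source_group, target_group = split_non_parallel(records)
print("source (lacks attribute):", source_group)
print("target (has attribute):  ", target_group)
```

Note that the two groups need not have the same size and carry no pairing between molecules; a model trained on such data has to infer what distinguishes the groups rather than being shown explicit before-and-after examples.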