BM · LG · QM · May 10, 2023

Augmented Memory: Capitalizing on Experience Replay to Accelerate De Novo Molecular Design

arXiv:2305.16160v1 · 15 citations
Originality: Incremental advance
AI Analysis

The paper addresses the cost of oracle evaluations in molecular design and offers an incremental improvement in sample efficiency.

The paper tackles the sample-efficiency challenge in de novo molecular design by proposing Augmented Memory, which combines data augmentation with experience replay so that each oracle score can be reused for several model updates. It reports a new state of the art on the PMO benchmark, outperforming the previous best method on 19 of 23 tasks.

Sample efficiency is a fundamental challenge in de novo molecular design. Ideally, molecular generative models should learn to satisfy a desired objective under minimal oracle evaluations (computational prediction or wet-lab experiment). This problem becomes more apparent when using oracles that can provide increased predictive accuracy but impose a significant cost. Consequently, these oracles cannot be directly optimized under a practical budget. Molecular generative models have shown remarkable sample efficiency when coupled with reinforcement learning, as demonstrated in the Practical Molecular Optimization (PMO) benchmark. Here, we propose a novel algorithm called Augmented Memory that combines data augmentation with experience replay. We show that scores obtained from oracle calls can be reused to update the model multiple times. We compare Augmented Memory to previously proposed algorithms and show significantly enhanced sample efficiency in an exploitation task and a drug discovery case study requiring both exploration and exploitation. Our method achieves a new state-of-the-art in the PMO benchmark which enforces a computational budget, outperforming the previous best performing method on 19/23 tasks.
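The core idea in the abstract, reusing scores from a limited number of oracle calls for several model updates, can be illustrated with a minimal sketch. This is not the authors' implementation: the names `ReplayBuffer`, `augmented_memory_step`, `toy_oracle`, `toy_augment`, and `toy_update` are hypothetical, the model update is reduced to a logging stub, and string reversal stands in for real SMILES augmentation (which would normally come from a cheminformatics toolkit). The exact buffer policy and loss used in the paper may differ.

```python
from typing import Callable, Dict, List, Tuple

Scored = Tuple[str, float]


class ReplayBuffer:
    """Keeps the highest-scoring molecules seen so far (hypothetical helper)."""

    def __init__(self, capacity: int) -> None:
        self.capacity = capacity
        self.scores: Dict[str, float] = {}

    def add(self, smiles: str, score: float) -> None:
        self.scores[smiles] = score
        if len(self.scores) > self.capacity:
            # Evict the lowest-scoring entry once the buffer is full.
            worst = min(self.scores, key=self.scores.get)
            del self.scores[worst]

    def items(self) -> List[Scored]:
        # Highest score first.
        return sorted(self.scores.items(), key=lambda kv: -kv[1])


def augmented_memory_step(
    sampled: List[str],
    oracle: Callable[[str], float],
    buffer: ReplayBuffer,
    augment: Callable[[str], str],
    update: Callable[[List[Scored]], None],
    augmentation_rounds: int,
) -> None:
    # The oracle is called exactly once per newly sampled molecule.
    batch = [(s, oracle(s)) for s in sampled]
    for smiles, score in batch:
        buffer.add(smiles, score)
    update(batch)

    # Extra updates reuse stored scores: each buffered molecule is rewritten
    # by `augment`, while its score is carried over without a new oracle call.
    for _ in range(augmentation_rounds):
        replay = [(augment(s), score) for s, score in buffer.items()]
        update(replay)


# Toy demonstration (all stand-ins are assumptions, not part of the paper).
oracle_calls = 0
update_log: List[List[Scored]] = []


def toy_oracle(smiles: str) -> float:
    global oracle_calls
    oracle_calls += 1
    return smiles.count("C") / len(smiles)  # fraction of carbon characters


def toy_augment(smiles: str) -> str:
    return smiles[::-1]  # placeholder only; not a general SMILES randomizer


def toy_update(batch: List[Scored]) -> None:
    update_log.append(list(batch))  # a real model would take a gradient step


buffer = ReplayBuffer(capacity=2)
augmented_memory_step(
    ["CCO", "CCC", "CN"], toy_oracle, buffer, toy_augment, toy_update,
    augmentation_rounds=2,
)
print(oracle_calls, len(update_log))
```

In this toy run the model receives three update batches from only three oracle calls; raising `augmentation_rounds` adds updates without spending any more of the oracle budget, which is the sample-efficiency argument the abstract makes.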
