LG · Nov 18, 2024

Molecule Generation with Fragment Retrieval Augmentation

arXiv:2411.12078v1 · 28 citations · h-index: 26 · NIPS
Originality: Incremental advance
AI Analysis

This is an incremental improvement for drug discovery, enhancing molecule generation by expanding fragment diversity.

The paper tackles the limited exploration in fragment-based molecule generation by proposing Fragment Retrieval-Augmented Generation (f-RAG), which retrieves and integrates hard and soft fragments to generate new molecules, achieving an improved exploration-exploitation trade-off.

Fragment-based drug discovery, in which molecular fragments are assembled into new molecules with desirable biochemical properties, has achieved great success. However, many fragment-based molecule generation methods show limited exploration beyond the existing fragments in the database as they only reassemble or slightly modify the given ones. To tackle this problem, we propose a new fragment-based molecule generation framework with retrieval augmentation, namely Fragment Retrieval-Augmented Generation (f-RAG). f-RAG is based on a pre-trained molecular generative model that proposes additional fragments from input fragments to complete and generate a new molecule. Given a fragment vocabulary, f-RAG retrieves two types of fragments: (1) hard fragments, which serve as building blocks that will be explicitly included in the newly generated molecule, and (2) soft fragments, which serve as reference to guide the generation of new fragments through a trainable fragment injection module. To extrapolate beyond the existing fragments, f-RAG updates the fragment vocabulary with generated fragments via an iterative refinement process which is further enhanced with post-hoc genetic fragment modification. f-RAG can achieve an improved exploration-exploitation trade-off by maintaining a pool of fragments and expanding it with novel and high-quality fragments through a strong generative prior.
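The loop described in the abstract (retrieve hard and soft fragments, generate, then feed new fragments back into the vocabulary) can be illustrated with a toy sketch. This is not the authors' implementation: fragments are plain strings rather than chemical structures, `toy_generate` stands in for the pre-trained generative model and fragment injection module, `toy_score` is a made-up objective, and the post-hoc genetic fragment modification step is omitted. All function names and parameters here are hypothetical.

```python
import random
from typing import Dict, List, Tuple

ALPHABET = "CNOS"  # toy symbols only; not real chemistry


def toy_score(molecule: str) -> float:
    """Placeholder objective (assumption): rewards symbol diversity."""
    return len(set(molecule)) / len(ALPHABET)


def toy_generate(hard: List[str], soft: List[str],
                 rng: random.Random) -> Tuple[str, str]:
    """Stand-in for the generative model (assumption).

    Hard fragments are copied verbatim into the output molecule.
    A soft fragment is used only as a reference: a copy with one
    position resampled becomes the newly proposed fragment.
    """
    reference = rng.choice(soft)
    pos = rng.randrange(len(reference))
    new_fragment = reference[:pos] + rng.choice(ALPHABET) + reference[pos + 1:]
    molecule = "".join(hard) + new_fragment
    return molecule, new_fragment


def refine(vocabulary: Dict[str, float], steps: int,
           pool_size: int, seed: int = 0) -> Dict[str, float]:
    """Iteratively retrieve fragments, generate, and update the pool.

    Requires at least three fragments and pool_size >= 3, so that two
    hard fragments and at least one soft fragment can be retrieved.
    """
    rng = random.Random(seed)
    pool = dict(vocabulary)
    for _ in range(steps):
        ranked = sorted(pool, key=pool.get, reverse=True)
        hard = rng.sample(ranked, 2)                  # explicit building blocks
        soft = [f for f in ranked if f not in hard]   # guidance only
        molecule, new_fragment = toy_generate(hard, soft, rng)
        # In this toy, the new fragment inherits the score of its molecule.
        score = toy_score(molecule)
        pool[new_fragment] = max(pool.get(new_fragment, 0.0), score)
        # Keep only the best-scoring fragments in the vocabulary.
        best = sorted(pool, key=pool.get, reverse=True)[:pool_size]
        pool = {f: pool[f] for f in best}
    return pool
```

Calling `refine({"CCC": 0.25, "CCN": 0.5, "CCO": 0.5, "NCC": 0.5}, steps=20, pool_size=4)` returns an updated fragment pool. Because the pool can admit fragments that were never in the initial vocabulary, the sketch shows in miniature how generation can move beyond reassembling the given fragments while still favoring high-scoring ones.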

