CHEM-PH | LG | Nov 6, 2023

STRIDE: Structure-guided Generation for Inverse Design of Molecules

arXiv:2311.06297v1 | 2 citations | h-index: 3
Originality: Incremental advance
AI Analysis

STRIDE addresses data scarcity in industrial materials design by offering a practical way to generate novel derivatives without retraining. The advance is incremental because it builds on existing generative models.

The authors tackle the problem of generating novel molecules for materials such as catalysts and antioxidants, where large datasets are unavailable. They introduce STRIDE, a generative workflow that guides generation with known molecules. The generated molecules have on average 21.7% lower synthetic accessibility scores, and guiding reduces ionization potential by 5.9%.

Machine learning, and especially deep learning, has had an increasing impact on molecule and materials design. In particular, given the growing abundance of high-quality small-molecule data for generative modeling, results for drug discovery have been promising. However, for many important classes of materials, such as catalysts, antioxidants, and metal-organic frameworks, such large datasets are not available. Families of molecules with limited samples and structural similarities are especially prevalent in industrial applications. As is well known, retraining and even fine-tuning are challenging on such small datasets. Novel, practically applicable molecules are most often derivatives of well-known molecules, which suggests approaches to addressing data scarcity. To address this problem, we introduce STRIDE, a generative molecule workflow that generates novel molecules with an unconditional generative model guided by known molecules, without any retraining. We generate molecules outside of the training data from a highly specialized set of antioxidant molecules. On average, our generated molecules have 21.7% lower synthetic accessibility scores, and guiding reduces the ionization potential of generated molecules by 5.9%.
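The abstract reports its results as average percentage reductions but does not state how those averages are computed. As a rough illustration only, the sketch below assumes the figure is the relative drop in the mean score between an unguided and a guided set of molecules. The function name `percent_reduction` and all score values are hypothetical placeholders, not data from the paper.

```python
from statistics import mean


def percent_reduction(baseline, guided):
    """Relative drop of the guided mean versus the baseline mean, in percent.

    Assumption: the reduction is measured between set means; the paper
    may define it differently (for example, per molecule).
    """
    base_mean = mean(baseline)
    return 100.0 * (base_mean - mean(guided)) / base_mean


# Placeholder scores for illustration only, NOT values from the paper.
unguided_scores = [4.0, 5.0, 6.0]
guided_scores = [3.0, 4.0, 5.0]

reduction = percent_reduction(unguided_scores, guided_scores)
print(f"{reduction:.1f}% lower mean score")
```

The same helper would apply to either metric mentioned in the abstract, since both are reported as relative reductions. For synthetic accessibility, a lower score conventionally indicates a molecule that is easier to synthesize.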

