LG, AI | Feb 1, 2022

Scalable Fragment-Based 3D Molecular Design with Reinforcement Learning

arXiv:2202.00658v1 | 15 citations
Originality: Incremental advance
AI Analysis

This paper addresses the need for scalable 3D molecular design in fields such as drug discovery. The method is novel in building on established chemical knowledge (known fragments and functional groups), but the advance is incremental.

The paper tackles 3D molecular structure generation with a reinforcement learning framework that places molecular fragments rather than individual atoms. This shortens the construction sequence and enables efficient design of complex molecules with over 100 atoms drawn from several distributions, including drug-like molecules, organic LED molecules, and biomolecules.

Machine learning has the potential to automate molecular design and drastically accelerate the discovery of new functional compounds. Towards this goal, generative models and reinforcement learning (RL) using string and graph representations have been successfully used to search for novel molecules. However, these approaches are limited since their representations ignore the three-dimensional (3D) structure of molecules. In fact, geometry plays an important role in many applications in inverse molecular design, especially in drug discovery. Thus, it is important to build models that can generate molecular structures in 3D space based on property-oriented geometric constraints. To address this, one approach is to generate molecules as 3D point clouds by sequentially placing atoms at locations in space -- this allows the process to be guided by physical quantities such as energy or other properties. However, this approach is inefficient as placing individual atoms makes the exploration unnecessarily deep, limiting the complexity of molecules that can be generated. Moreover, when optimizing a molecule, organic and medicinal chemists use known fragments and functional groups, not single atoms. We introduce a novel RL framework for scalable 3D design that uses a hierarchical agent to build molecules by placing molecular substructures sequentially in 3D space, thus attempting to build on the existing human knowledge in the field of molecular design. In a variety of experiments with different substructures, we show that our agent, guided only by energy considerations, can efficiently learn to produce molecules with over 100 atoms from many distributions including drug-like molecules, organic LED molecules, and biomolecules.
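To make the idea of sequential fragment placement concrete, here is a minimal Python sketch. It is not the authors' implementation. Everything in it is an assumption made for illustration: the `FRAGMENTS` library and its geometries, the Lennard-Jones-style `toy_energy` standing in for a real energy calculation, the reward defined as the energy decrease after each placement, and a greedy random search standing in for the learned hierarchical policy. The two-level choice (first which fragment, then where to put it) only loosely mirrors the hierarchy described in the abstract.

```python
import math
import random

# Hypothetical fragment library (illustrative geometries, not from the paper).
# Each fragment is a list of (element, x, y, z) with coordinates in angstroms.
FRAGMENTS = {
    "single_atom": [("C", 0.0, 0.0, 0.0)],
    "pair": [("C", 0.0, 0.0, 0.0), ("C", 1.5, 0.0, 0.0)],
    "triangle": [("C", 0.0, 0.0, 0.0), ("C", 1.5, 0.0, 0.0), ("C", 0.75, 1.3, 0.0)],
}


def toy_energy(atoms, r0=1.5):
    """Sum of Lennard-Jones-style pair terms with a minimum of -1 at distance r0.

    This is a toy stand-in for a physical energy, assumed for illustration only.
    """
    total = 0.0
    for i in range(len(atoms)):
        for j in range(i + 1, len(atoms)):
            # Clamp the distance to avoid division by zero for overlapping atoms.
            d = max(math.dist(atoms[i][1:], atoms[j][1:]), 1e-3)
            s = (r0 / d) ** 6
            total += s * s - 2.0 * s
    return total


def translate(fragment, offset):
    """Return a copy of the fragment shifted by the given (x, y, z) offset."""
    ox, oy, oz = offset
    return [(el, x + ox, y + oy, z + oz) for el, x, y, z in fragment]


def random_unit_vector(rng):
    """Sample a direction uniformly on the sphere via normalized Gaussians."""
    v = [rng.gauss(0.0, 1.0) for _ in range(3)]
    norm = math.sqrt(sum(c * c for c in v)) or 1.0
    return [c / norm for c in v]


def build_molecule(n_steps, n_candidates=200, seed=0):
    """Place one fragment per step, keeping the lowest-energy candidate.

    The greedy random search here replaces a learned policy (an assumption).
    Returns the final atom list and the per-step rewards (energy decreases).
    """
    rng = random.Random(seed)
    atoms = list(FRAGMENTS["pair"])  # initial structure to build on
    names = sorted(FRAGMENTS)
    rewards = []
    for _ in range(n_steps):
        energy_before = toy_energy(atoms)
        best_energy, best_atoms = None, None
        for _ in range(n_candidates):
            # High-level decision: which fragment to add.
            fragment = FRAGMENTS[rng.choice(names)]
            # Low-level decision: where to place it, near an existing atom.
            _, ax, ay, az = rng.choice(atoms)
            ux, uy, uz = random_unit_vector(rng)
            dist = rng.uniform(1.2, 2.0)
            offset = (ax + dist * ux, ay + dist * uy, az + dist * uz)
            candidate = atoms + translate(fragment, offset)
            energy = toy_energy(candidate)
            if best_energy is None or energy < best_energy:
                best_energy, best_atoms = energy, candidate
        rewards.append(energy_before - best_energy)
        atoms = best_atoms
    return atoms, rewards


if __name__ == "__main__":
    molecule, step_rewards = build_molecule(n_steps=5)
    print(len(molecule), "atoms placed in", len(step_rewards), "steps")
```

Because each step can add several atoms at once, a structure of a given size is reached in fewer decisions than with atom-by-atom placement, which is the efficiency argument the abstract makes against single-atom approaches.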

