LGFeb 20, 2025

Generating $π$-Functional Molecules Using STGG+ with Active Learning

arXiv:2502.14842v1h-index: 21Has Code
Originality Incremental advance
AI Analysis

This addresses the challenge of molecular discovery for materials science, though it is incremental as it builds on existing supervised learning methods.

The paper tackles generating novel π-functional molecules with out-of-distribution properties by integrating STGG+ with active learning, achieving high oscillator strength in generated molecules compared to reinforcement learning methods.

Generating novel molecules with out-of-distribution properties is a major challenge in molecular discovery. While supervised learning methods generate high-quality molecules similar to those in a dataset, they struggle to generalize to out-of-distribution properties. Reinforcement learning can explore new chemical spaces but often conducts 'reward-hacking' and generates non-synthesizable molecules. In this work, we address this problem by integrating a state-of-the-art supervised learning method, STGG+, in an active learning loop. Our approach iteratively generates, evaluates, and fine-tunes STGG+ to continuously expand its knowledge. We denote this approach STGG+AL. We apply STGG+AL to the design of organic $π$-functional materials, specifically two challenging tasks: 1) generating highly absorptive molecules characterized by high oscillator strength and 2) designing absorptive molecules with reasonable oscillator strength in the near-infrared (NIR) range. The generated molecules are validated and rationalized in-silico with time-dependent density functional theory. Our results demonstrate that our method is highly effective in generating novel molecules with high oscillator strength, contrary to existing methods such as reinforcement learning (RL) methods. We open-source our active-learning code along with our Conjugated-xTB dataset containing 2.9 million $π$-conjugated molecules and the function for approximating the oscillator strength and absorption wavelength (based on sTDA-xTB).

Code Implementations1 repo
Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.

Your Notes