AI · May 29, 2025

Foundation Molecular Grammar: Multi-Modal Foundation Models Induce Interpretable Molecular Graph Languages

arXiv:2505.22948v1 · 5 citations · h-index: 6 · Has Code · ICML
Originality: Incremental advance
AI Analysis

This paper addresses the need for interpretable and efficient molecular discovery workflows, though the contribution appears incremental because it builds on prior grammar learning approaches.

The paper tackles the problem of unreliable grammar learning in molecular generation by proposing Foundation Molecular Grammar (FMG), which uses multi-modal foundation models to induce an interpretable molecular language, resulting in improved synthesizability, diversity, and data efficiency.

Recent data-efficient molecular generation approaches exploit graph grammars to introduce interpretability into the generative models. However, grammar learning therein relies on expert annotation or unreliable heuristics for algorithmic inference. We propose Foundation Molecular Grammar (FMG), which leverages multi-modal foundation models (MMFMs) to induce an interpretable molecular language. By exploiting the chemical knowledge of an MMFM, FMG renders molecules as images, describes them as text, and aligns information across modalities using prompt learning. FMG can be used as a drop-in replacement for the prior grammar learning approaches in molecular generation and property prediction. We show that FMG not only excels in synthesizability, diversity, and data efficiency but also offers built-in chemical interpretability for automated molecular discovery workflows. Code is available at https://github.com/shiningsunnyday/induction.
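To make the idea of an interpretable molecular grammar more concrete, the following is a minimal toy sketch. It is not FMG and not the graph grammar used in the paper: it swaps in a plain string-rewriting grammar over SMILES fragments, with hand-written, hypothetical rules (`RULES`, `<mol>`, `<ring>`, `<chain>` are all assumptions made for illustration). It only shows why a grammar-based generator is interpretable: every generated molecule comes with the sequence of rules that produced it.

```python
import random

# Hypothetical, hand-written production rules (NOT learned by FMG).
# Keys are nonterminals; any symbol that is not a key is a literal SMILES fragment.
RULES = {
    "<mol>": [["<ring>", "<chain>"], ["<chain>"]],
    "<ring>": [["c1ccccc1"], ["C1CCCCC1"]],
    "<chain>": [["C", "<chain>"], ["O"], ["N"]],
}


def derive(symbol, rng, trace, depth=0, max_depth=6):
    """Expand `symbol` into a SMILES string, recording each rule used in `trace`."""
    if symbol not in RULES:
        return symbol  # terminal: emit the fragment as-is
    options = RULES[symbol]
    if depth >= max_depth:
        # Force termination: prefer right-hand sides with no nonterminals.
        terminal_only = [o for o in options if all(s not in RULES for s in o)]
        options = terminal_only or options
    rhs = rng.choice(options)
    trace.append((symbol, rhs))  # the derivation is the "explanation"
    return "".join(derive(s, rng, trace, depth + 1, max_depth) for s in rhs)


if __name__ == "__main__":
    trace = []
    smiles = derive("<mol>", random.Random(0), trace)
    print(smiles)
    for lhs, rhs in trace:
        print(f"  {lhs} -> {' '.join(rhs)}")
```

In this toy setting the rules are fixed by hand; the abstract's point is that FMG induces its rules with a multi-modal foundation model rather than relying on expert annotation or heuristics.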

Code Implementations: 1 repo
