LG · AI · Nov 11, 2025

Hierarchical Structure-Property Alignment for Data-Efficient Molecular Generation and Editing

arXiv:2511.08080v1 · h-index: 7
AI Analysis

This paper addresses data efficiency and structure-property alignment in molecular generation and editing for drug discovery. It represents an incremental improvement over existing methods.

The paper tackles the challenge of capturing complex relationships between molecular structures and multiple properties in AI-driven drug discovery. It proposes HSPAG, a data-efficient framework that learns hierarchical structure-property alignment and reduces the amount of pre-training data required. Experiments show that it captures fine-grained structure-property relationships and supports controllable generation under multiple property constraints.

Property-constrained molecular generation and editing are crucial in AI-driven drug discovery but remain hindered by two factors: (i) capturing the complex relationships between molecular structures and multiple properties remains challenging, and (ii) the narrow coverage and incomplete annotations of molecular properties weaken the effectiveness of property-based models. To tackle these limitations, we propose HSPAG, a data-efficient framework featuring hierarchical structure-property alignment. By treating SMILES and molecular properties as complementary modalities, the model learns their relationships at atom, substructure, and whole-molecule levels. Moreover, we select representative samples through scaffold clustering and hard samples via an auxiliary variational auto-encoder (VAE), substantially reducing the required pre-training data. In addition, we incorporate a property relevance-aware masking mechanism and diversified perturbation strategies to enhance generation quality under sparse annotations. Experiments demonstrate that HSPAG captures fine-grained structure-property relationships and supports controllable generation under multiple property constraints. Two real-world case studies further validate the editing capabilities of HSPAG.
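
The abstract mentions selecting representative samples through scaffold clustering but does not say how samples are picked within or across clusters. The following is only a minimal sketch of one plausible reading, not the authors' procedure: it assumes each molecule already carries a precomputed scaffold key (for instance a Murcko scaffold string from a cheminformatics toolkit), groups molecules by that key, and draws from the clusters in round-robin order until a budget is reached. The function name `select_representatives`, the `budget` parameter, and the round-robin policy are all hypothetical.

```python
from collections import defaultdict


def select_representatives(records, budget):
    """Pick up to `budget` SMILES strings, spreading picks across scaffolds.

    records: iterable of (smiles, scaffold_key) pairs. The scaffold key is
    assumed to be precomputed; this sketch does not derive it from SMILES.
    """
    # Group molecules that share a scaffold into one cluster.
    clusters = defaultdict(list)
    for smiles, scaffold in records:
        clusters[scaffold].append(smiles)

    # Visit larger clusters first; ties keep their first-seen order.
    ordered = sorted(clusters.values(), key=len, reverse=True)

    selected = []
    depth = 0  # index of the member taken from each cluster in this pass
    while len(selected) < budget:
        added = False
        for members in ordered:
            if depth < len(members):
                selected.append(members[depth])
                added = True
                if len(selected) == budget:
                    break
        if not added:  # every cluster is exhausted
            break
        depth += 1
    return selected


if __name__ == "__main__":
    demo = [
        ("c1ccccc1O", "benzene"),
        ("c1ccccc1N", "benzene"),
        ("c1ccccc1C", "benzene"),
        ("C1CCCCC1O", "cyclohexane"),
        ("c1ccncc1C", "pyridine"),
    ]
    print(select_representatives(demo, budget=3))
```

Round-robin selection is chosen here only because it keeps rare scaffolds in the subset even when one scaffold dominates the data. The hard-sample selection via an auxiliary VAE described in the abstract is not covered by this sketch.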
