AI · Jun 4

Agentic Molecular Recovery via Molecule-Aware Exploration

arXiv:2606.05847 · 56.6
Predicted impact: top 60% in AI · last 90 days
Originality: Incremental advance
AI Analysis

For researchers in molecular generation, this work addresses the problem of invalid SMILES generation by proposing a molecule-aware recovery method that preserves target-relevant structures.

AMREC achieves the strongest overall recovery profile across structural, exact-match, and string-level metrics for text-guided molecular generation, outperforming existing correction strategies on invalid ChEBI-20 drafts.

Text-guided molecular generation with LLMs often yields invalid SMILES. We argue that invalid drafts should be addressed through a shift from validity-oriented repair to identity-preserving molecular recovery: the objective is not only to restore chemical validity, but also to preserve target-relevant structural cues and recover the molecular identity implied by the description. This perspective reveals the limitations of existing correction strategies. Post-hoc repair can recover validity while distorting key structures, LLM-only correction can introduce unintended global drift, and generic agentic correction remains constrained by greedy single-candidate trajectories even when equipped with executable RDKit edit tools. To address these limitations, we propose AMREC, which couples molecule-aware mismatch tracking with expanded candidate exploration and trajectory-level selection. On invalid ChEBI-20 drafts from three backbone models, AMREC achieves the strongest overall recovery profile across structural, exact-match, and string-level metrics.
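The contrast between a greedy single-candidate trajectory and expanded candidate exploration with trajectory-level selection can be illustrated with a toy sketch. This is not AMREC's algorithm. Everything here is an assumption made for illustration: the parenthesis check stands in for a real validity test (for example an RDKit parse), counting substring cues stands in for molecule-aware mismatch tracking, and the names `propose`, `mismatches`, and `explore`, the three edit operations, and the aspirin example are all hypothetical.

```python
from typing import Dict, List, Tuple

Trajectory = Tuple[str, ...]  # draft first, then one string per applied edit


def paren_balanced(s: str) -> bool:
    """Toy validity check (assumption); a real pipeline would parse with RDKit."""
    depth = 0
    for ch in s:
        if ch == "(":
            depth += 1
        elif ch == ")":
            depth -= 1
            if depth < 0:
                return False
    return depth == 0


def mismatches(s: str, cues: Dict[str, int]) -> int:
    """Missing cue occurrences, plus 1 if parentheses are unbalanced."""
    missing = sum(max(0, need - s.count(cue)) for cue, need in cues.items())
    return missing + (0 if paren_balanced(s) else 1)


def propose(s: str) -> List[str]:
    """Hypothetical edit tools: drop the last '(', append ')', or append 'O'."""
    out = []
    i = s.rfind("(")
    if i != -1:
        out.append(s[:i] + s[i + 1:])
    out.append(s + ")")
    out.append(s + "O")
    return out


def explore(draft: str, cues: Dict[str, int], width: int, depth: int) -> Trajectory:
    """Keep the `width` best trajectories per step; width=1 is the greedy case."""
    beam: List[Trajectory] = [(draft,)]
    for _ in range(depth):
        expanded = [t + (c,) for t in beam for c in propose(t[-1])]
        # sorted() is stable, so tied candidates keep their proposal order.
        beam = sorted(expanded, key=lambda t: mismatches(t[-1], cues))[:width]
    # Selection looks at where each whole trajectory ends up.
    return beam[0]


# Invalid draft of aspirin, CC(=O)Oc1ccccc1C(=O)O, with the tail cut off.
DRAFT = "CC(=O)Oc1ccccc1C(=O"
# Assumed cue: the description implies two C(=O)O motifs (ester and acid).
CUES = {"C(=O)O": 2}

greedy = explore(DRAFT, CUES, width=1, depth=2)
wide = explore(DRAFT, CUES, width=3, depth=2)
print(greedy[-1], mismatches(greedy[-1], CUES))
print(wide[-1], mismatches(wide[-1], CUES))
```

In this toy setup the greedy run commits to the first tied candidate, which rebalances the parentheses by deleting one and thereby loses the second C(=O)O motif, so one mismatch remains after two steps. The wider beam keeps the alternative that appends `)` and reaches the full aspirin string with no remaining mismatches. The tie-breaking by proposal order is a property of this sketch only.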

