MolLingo: Molecule-Native Representations for LLM-Powered Scientific Agents
For researchers in drug discovery and molecular design, MolLingo shows that LLMs can serve as effective design assistants when guided by chemically meaningful representations and structural context. It addresses the lack of multi-agent coordination and shared memory in existing approaches.
MolLingo is a multi-agent system that automates molecular design by coordinating a Literature Agent, a Chemist Agent, and an Orchestrator through shared memory. It uses a novel BRICS-based Fragment Enumeration (BFE) method to represent molecules as block-based SMILES. It achieves a fourfold docking score improvement over GPT-5.4 while using the same underlying model, and state-of-the-art results on TOMG-Bench, outperforming both frontier LLMs and the RL-based optimization method RePO.
We present MolLingo, a multi-agent system that emulates the reasoning process of a chemist to automate molecular design. Existing LLM-based approaches either operate as standalone generative models without access to external tools, or lack the multi-agent coordination and shared memory needed for iterative, evidence-driven reasoning across the molecular design pipeline. MolLingo addresses both gaps by coordinating a Literature Agent, a Chemist Agent, and an Orchestrator through a shared memory module, with each agent equipped with domain-specific tools. To enable effective molecular reasoning, we introduce BRICS-based Fragment Enumeration (BFE), a synthesis-aware molecular fragmentation method that decomposes molecules into chemically meaningful building blocks, represented as block-based SMILES paired with common chemical names. This representation bridges molecular structure and the LLM's semantic space, enabling block-level reasoning and editing that is difficult with raw SMILES alone. As a case study in early-stage therapeutic design, MolLingo further grounds the Chemist Agent's reasoning in binding site geometry and residue-level protein context derived from molecular docking, with the goal of optimizing molecules for stronger target binding. Across four benchmarks, MolLingo consistently outperforms frontier LLMs and specialized baselines. The results include a fourfold docking score improvement over GPT-5.4 despite using the same underlying model, consistent drug property optimization gains across multiple LLM backbones, and state-of-the-art results on TOMG-Bench, surpassing both frontier LLMs and the RL-based optimization method RePO. Our results suggest that LLMs are already capable molecular design assistants when guided through chemically meaningful representations and biologically grounded structural context. Code is available at: https://anonymous.4open.science/status/MolLingo-7450.
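The coordination pattern described in the abstract (a Literature Agent and a Chemist Agent driven by an Orchestrator through a shared memory module) can be illustrated with a minimal Python sketch. It is not the MolLingo implementation. The class `SharedMemory`, the functions `literature_agent`, `chemist_agent`, and `orchestrator`, and the placeholder strings they write are all hypothetical names chosen for illustration. The real agents would call an LLM and domain-specific tools, where this sketch only writes stub text.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class SharedMemory:
    """Append-only log readable and writable by every agent (hypothetical design)."""
    entries: List[Dict[str, str]] = field(default_factory=list)

    def write(self, author: str, content: str) -> None:
        self.entries.append({"author": author, "content": content})

    def read(self, author: Optional[str] = None) -> List[str]:
        # Return all entries, or only those written by one author.
        return [e["content"] for e in self.entries
                if author is None or e["author"] == author]


def literature_agent(memory: SharedMemory) -> None:
    # Stub: a real agent would retrieve and summarize literature here.
    note_index = len(memory.read("literature")) + 1
    memory.write("literature", f"evidence note {note_index}")


def chemist_agent(memory: SharedMemory) -> None:
    # Stub: a real agent would propose block-level molecular edits,
    # conditioned on the evidence accumulated in shared memory.
    evidence = memory.read("literature")
    memory.write("chemist", f"proposal grounded in {len(evidence)} evidence note(s)")


def orchestrator(memory: SharedMemory, rounds: int = 2) -> List[str]:
    # Assumed schedule: alternate the two agents for a fixed number of rounds.
    for _ in range(rounds):
        literature_agent(memory)
        chemist_agent(memory)
    return memory.read("chemist")


if __name__ == "__main__":
    for proposal in orchestrator(SharedMemory(), rounds=2):
        print(proposal)
```

Because both agents communicate only through `SharedMemory`, each later proposal can draw on all evidence gathered so far, which is the iterative, evidence-driven loop the abstract attributes to shared memory. The fixed alternating schedule is an assumption. The abstract does not specify how the Orchestrator actually sequences the agents.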