BM | LG | CHEM-PH | Mar 14, 2024

Leap: molecular synthesisability scoring with intermediates

arXiv:2403.13005v2
Originality: Incremental advance
AI Analysis

This work addresses a key bottleneck in drug discovery, filtering for viable compounds, by making synthesisability scores more accurate. It is an incremental advance: it builds on existing synthesisability scoring methods and adds a novel way of conditioning on available intermediates.

The paper tackles the problem of assessing molecular synthesisability dynamically, by conditioning on available intermediates. It shows that the proposed method, Leap, surpasses existing scoring methods by at least 5% in AUC score when identifying synthesisable molecules.

Assessing whether a molecule can be synthesised is a primary task in drug discovery. It enables computational chemists to filter for viable compounds or bias molecular generative models. The notion of synthesisability is dynamic: it changes with the availability of key compounds. A common approach in drug discovery is to explore the chemical space surrounding synthetically accessible intermediates. This strategy improves the synthesisability of the derived molecules because the key intermediates are already available. Existing synthesisability scoring methods, such as SAScore, SCScore and RAScore, cannot condition on intermediates dynamically. Our approach, Leap, is a GPT-2 model trained on the depth, or longest linear path, of predicted synthesis routes; it allows information on the availability of key intermediates to be included at inference time. We show that Leap surpasses all other scoring methods by at least 5% on AUC score when identifying synthesisable molecules, and can successfully adapt predicted scores when presented with a relevant intermediate compound.
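
To make the "depth, or longest linear path" quantity concrete, the sketch below computes it on a toy synthesis route and shows how the value drops when an intermediate is treated as available. This is not the Leap model (a GPT-2 network) and does not reproduce how the authors encode intermediates; the `RouteNode` class, the `route_depth` function, and the single-letter molecule labels are all hypothetical names introduced for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class RouteNode:
    """A molecule in a toy retrosynthesis tree (hypothetical structure).

    `precursors` holds the reactants of the single reaction step that makes
    this molecule; an empty list marks a purchasable starting material.
    """
    name: str
    precursors: list["RouteNode"] = field(default_factory=list)


def route_depth(node: RouteNode, available: frozenset = frozenset()) -> int:
    """Count reaction steps on the longest linear path below `node`.

    A molecule listed in `available` is treated like a starting material,
    so the search stops there and contributes zero further steps.
    """
    if node.name in available or not node.precursors:
        return 0
    return 1 + max(route_depth(p, available) for p in node.precursors)


# Toy route (assumed for illustration): T <- A + B, A <- C + D, C <- E
target = RouteNode("T", [
    RouteNode("A", [
        RouteNode("C", [RouteNode("E")]),
        RouteNode("D"),
    ]),
    RouteNode("B"),
])

print(route_depth(target))                              # 3
print(route_depth(target, available=frozenset({"C"})))  # 2
print(route_depth(target, available=frozenset({"A"})))  # 1
```

In this toy setting, having intermediate A in hand shortens the route from three steps to one, which is the kind of shift in score that the abstract describes when a relevant intermediate is supplied.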
