LG · Sep 27, 2022

MARS: A Motif-based Autoregressive Model for Retrosynthesis Prediction

arXiv:2209.13178v1 · 21 citations · h-index: 84
Originality: Incremental advance
AI Analysis

This work addresses a key bottleneck in drug discovery by improving retrosynthesis prediction, though it is incremental as it builds on existing graph-generation approaches.

The paper tackles retrosynthesis prediction for drug discovery by proposing a motif-based autoregressive model that sequentially identifies the reaction center, generates synthons, and adds motifs to the synthons to produce reactants. It reports significantly outperforming previous state-of-the-art algorithms on a benchmark dataset.

Retrosynthesis is a major task in drug discovery. Many existing approaches formulate it as a graph-generation problem. Specifically, these methods first identify the reaction center and break the target molecule accordingly to generate synthons. Reactants are then generated either by adding atoms sequentially to the synthon graphs or by directly adding suitable leaving groups. However, both strategies have drawbacks: adding atoms results in a long prediction sequence, which increases generation difficulty, while adding leaving groups can only consider those present in the training set, which results in poor generalization. In this paper, we propose a novel end-to-end graph generation model for retrosynthesis prediction, which sequentially identifies the reaction center, generates the synthons, and adds motifs to the synthons to generate reactants. Since chemically meaningful motifs are bigger than atoms and smaller than leaving groups, our method enjoys lower prediction complexity than adding atoms and better generalization than adding leaving groups. Experiments on a benchmark dataset show that the proposed model significantly outperforms previous state-of-the-art algorithms.
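The three stages named in the abstract (identify the reaction center, break the molecule into synthons, attach motifs) can be illustrated with a small toy sketch. This is not the paper's model: MARS learns each decision with a neural network, whereas here the reaction center and the motif choices are hard-coded. The `Mol` class, the `break_bond` and `attach_motif` helpers, and the `MOTIFS` vocabulary are all hypothetical names introduced for illustration, and the graph tracks heavy atoms only, with bond orders ignored. The product is reduced to two carbons standing in for the two aryl carbons of a biaryl bond.

```python
from dataclasses import dataclass, field


@dataclass
class Mol:
    """Toy molecular graph: heavy atoms only, bond orders ignored."""
    atoms: dict                                # atom id -> element symbol
    bonds: set = field(default_factory=set)    # each bond is frozenset({i, j})


def bond(i, j):
    """Build an undirected bond between atom ids i and j."""
    return frozenset((i, j))


def break_bond(mol, i, j):
    """Remove the reaction-center bond (i, j); return the resulting synthons."""
    remaining = mol.bonds - {bond(i, j)}
    neighbours = {a: set() for a in mol.atoms}
    for b in remaining:
        u, v = tuple(b)
        neighbours[u].add(v)
        neighbours[v].add(u)
    seen, synthons = set(), []
    for start in sorted(mol.atoms):
        if start in seen:
            continue
        # Depth-first search collects one connected component per synthon.
        component, stack = set(), [start]
        while stack:
            node = stack.pop()
            if node not in component:
                component.add(node)
                stack.extend(neighbours[node] - component)
        seen |= component
        synthons.append(Mol(
            {a: mol.atoms[a] for a in component},
            {b for b in remaining if b <= component},
        ))
    return synthons


def attach_motif(synthon, anchor, motif, motif_anchor=0):
    """Attach a whole motif to `anchor` in one generation step."""
    offset = max(synthon.atoms) + 1            # relabel motif atoms to avoid clashes
    atoms = dict(synthon.atoms)
    atoms.update({a + offset: el for a, el in motif.atoms.items()})
    bonds = set(synthon.bonds)
    bonds |= {frozenset(a + offset for a in b) for b in motif.bonds}
    bonds.add(bond(anchor, motif_anchor + offset))
    return Mol(atoms, bonds)


# Hypothetical two-entry motif vocabulary (assumption, not taken from the paper).
MOTIFS = {
    "boronic_acid": Mol({0: "B", 1: "O", 2: "O"}, {bond(0, 1), bond(0, 2)}),
    "bromide": Mol({0: "Br"}),
}

product = Mol({0: "C", 1: "C"}, {bond(0, 1)})
synthons = break_bond(product, 0, 1)           # reaction center chosen by hand
reactant_a = attach_motif(synthons[0], 0, MOTIFS["boronic_acid"])
reactant_b = attach_motif(synthons[1], 1, MOTIFS["bromide"])
print(reactant_a.atoms, reactant_b.atoms)
```

In this toy setting the boronic acid motif is added in a single step, where atom-by-atom generation would need three steps for the same three heavy atoms. That is the shorter prediction sequence the abstract refers to. Because a motif is a small reusable fragment rather than a complete leaving group, the same vocabulary entry can also be reused across different leaving groups.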

