Learning Graph Models for Retrosynthesis Prediction
This work addresses a fundamental problem in organic synthesis by giving chemists an interpretable and correctable model, though the contribution is incremental, since it builds on existing graph-based approaches.
The paper tackles retrosynthesis prediction with a graph-based model that decomposes the task into two steps: predicting graph edits that turn the target into synthons, and then expanding those synthons into complete molecules. It achieves a top-1 accuracy of 53.7%, outperforming previous template-free and semi-template-based methods.
Retrosynthesis prediction is a fundamental problem in organic synthesis, where the task is to identify precursor molecules that can be used to synthesize a target molecule. A key consideration in building neural models for this task is aligning model design with strategies adopted by chemists. Building on this viewpoint, this paper introduces a graph-based approach that capitalizes on the idea that the graph topology of precursor molecules is largely unaltered during a chemical reaction. The model first predicts the set of graph edits transforming the target into incomplete molecules called synthons. Next, the model learns to expand synthons into complete molecules by attaching relevant leaving groups. This decomposition simplifies the architecture, making its predictions more interpretable and amenable to manual correction. Our model achieves a top-1 accuracy of $53.7\%$, outperforming previous template-free and semi-template-based methods.
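The two-stage decomposition described above can be illustrated with a minimal sketch. This is not the paper's implementation: the learned predictors are replaced by hand-supplied inputs, bond orders and hydrogens are ignored, and the names `split_into_synthons`, `complete_synthon`, and the toy molecule encoding (atom index to element symbol, bonds as `frozenset` pairs) are hypothetical choices made for illustration only.

```python
from collections import defaultdict


def split_into_synthons(atoms, bonds, edit):
    """Stage 1 (toy): apply one bond-deletion edit and return the synthons.

    atoms: dict mapping atom index -> element symbol
    bonds: set of frozenset({i, j}) pairs (bond orders ignored; an assumption)
    edit:  (i, j) pair naming the bond to delete; in the paper this would be
           predicted by a model, here it is supplied by hand
    """
    remaining = {b for b in bonds if b != frozenset(edit)}
    adjacency = defaultdict(set)
    for bond in remaining:
        u, v = tuple(bond)
        adjacency[u].add(v)
        adjacency[v].add(u)

    # Each connected component of the edited graph is one synthon.
    seen, synthons = set(), []
    for start in sorted(atoms):
        if start in seen:
            continue
        component, stack = set(), [start]
        while stack:
            node = stack.pop()
            if node not in component:
                component.add(node)
                stack.extend(adjacency[node] - component)
        seen |= component
        synthons.append(component)
    return synthons, remaining


def complete_synthon(synthon, atoms, bonds, leaving_groups):
    """Stage 2 (toy): attach a single-atom leaving group to a synthon.

    leaving_groups: dict mapping attachment atom index -> element symbol;
    a hand-written stand-in for the learned leaving-group selection.
    """
    mol_atoms = {i: atoms[i] for i in synthon}
    mol_bonds = {b for b in bonds if b <= synthon}
    for i in sorted(synthon):
        group = leaving_groups.get(i)
        if group is not None:
            new_index = max(max(atoms), max(mol_atoms)) + 1
            mol_atoms[new_index] = group
            mol_bonds.add(frozenset((i, new_index)))
    return mol_atoms, mol_bonds


# Toy target: heavy-atom skeleton of N-methylacetamide, C-C(=O)-N-C.
atoms = {0: "C", 1: "C", 2: "O", 3: "N", 4: "C"}
bonds = {frozenset(p) for p in [(0, 1), (1, 2), (1, 3), (3, 4)]}

# Edit: break the C-N amide bond; leaving group: Cl on the acyl carbon.
synthons, remaining = split_into_synthons(atoms, bonds, edit=(1, 3))
reactants = [complete_synthon(s, atoms, remaining, {1: "Cl"}) for s in synthons]
```

In this toy case the two reactants correspond to the heavy-atom skeletons of acetyl chloride and methylamine. Because the intermediate synthons and the chosen leaving group are explicit, a user could inspect or override either stage, which is the kind of interpretability and manual correction the abstract refers to.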