MotifRetro: Exploring the Combinability-Consistency Trade-offs in retrosynthesis via Dynamic Motif Editing
This work addresses a key challenge in retrosynthesis prediction for chemistry and drug discovery, offering a unified framework that improves predictive accuracy.
The paper tackled the problem of balancing combinability and consistency in graph-based retrosynthesis prediction by proposing MotifRetro, a dynamic motif editing framework that explores the entire trade-off space and achieves state-of-the-art performance on the USPTO-50K dataset.
Is there a unified framework for graph-based retrosynthesis prediction? Through analysis of full-, semi-, and non-template retrosynthesis methods, we discovered that they strive to strike an optimal balance between combinability and consistency: \textit{Should atoms be combined as motifs to simplify the molecular editing process, or should motifs be broken down into atoms to reduce the vocabulary and improve predictive consistency?} Recent works have studied several specific cases, while none of them explores different combinability-consistency trade-offs. Therefore, we propose MotifRetro, a dynamic motif editing framework for retrosynthesis prediction that can explore the entire trade-off space and unify graph-based models. MotifRetro comprises two components: RetroBPE, which controls the combinability-consistency trade-off, and a motif editing model, where we introduce a novel LG-EGAT module to dynamiclly add motifs to the molecule. We conduct extensive experiments on USPTO-50K to explore how the trade-off affects the model performance and finally achieve state-of-the-art performance.