LG AI QM

Dec 6, 2024

Chemist-aligned retrosynthesis by ensembling diverse inductive bias models

Krzysztof Maziarz, Guoqing Liu, Hubert Misztela, Austin Tripp, Junren Li, Aleksei Kornev, Piotr Gaiński, Holger Hoefling, Mike Fortunato, Rishi Gupta, Marwin Segler

arXiv:2412.05269v24.62 citations

h-index: 24

Originality: Incremental advance

AI Analysis

This work addresses the bottleneck in chemical synthesis for drug discovery and manufacturing, offering a significant but incremental improvement over existing methods.

The paper tackles AI-based chemical synthesis planning, where current models struggle with infrequent reactions and with hallucinated, incorrect predictions. It proposes RetroChimera, a model that ensembles two components with complementary inductive biases. RetroChimera outperforms major models by a large margin, is robust outside the training data, aligns with chemists' preferences, and transfers zero-shot to an internal pharmaceutical dataset.

Chemical synthesis remains a critical bottleneck in the discovery and manufacture of functional small molecules. AI-based synthesis planning models could be a potential remedy to find effective syntheses, and have made progress in recent years. However, they still struggle with less frequent, yet critical reactions for synthetic strategy, as well as hallucinated, incorrect predictions. This hampers multi-step search algorithms that rely on models, and leads to misalignment with chemists' expectations. Here we propose RetroChimera: a frontier retrosynthesis model, built upon two newly developed components with complementary inductive biases, which we fuse together using a new framework for integrating predictions from multiple sources via a learning-based ensembling strategy. Through experiments across several orders of magnitude in data scale and splitting strategy, we show RetroChimera outperforms all major models by a large margin, demonstrating robustness outside the training data, as well as for the first time the ability to learn from even a very small number of examples per reaction class. Moreover, industrial organic chemists prefer predictions from RetroChimera over the reactions it was trained on in terms of quality, revealing high levels of alignment. Finally, we demonstrate zero-shot transfer to an internal dataset from a major pharmaceutical company, showing robust generalization under distribution shift. With the new dimension that our ensembling framework unlocks, we anticipate further acceleration in the development of even more accurate models.
