ChemiRise: a data-driven retrosynthesis engine
This work addresses retrosynthesis planning for organic chemistry and could boost productivity in real-life use cases, but it appears incremental: it builds on existing methods and improves them.
The authors tackled the problem of proposing complete retrosynthesis routes for organic compounds by developing ChemiRise, an end-to-end system trained on over 3 million reactions. Its atom-mapping algorithm and one-step reaction proposer outperformed previous studies on benchmarks, and human experts rated its proposed routes as functional.
We have developed an end-to-end retrosynthesis system, named ChemiRise, that can propose complete retrosynthesis routes for organic compounds rapidly and reliably. The system was trained on a processed patent database of over 3 million organic reactions. Experimental reactions were atom-mapped and clustered, and reaction templates were extracted from them. We then trained a one-step reaction proposer, based on a graph convolutional neural network, using template embeddings, and developed a guiding algorithm that operates on the directed acyclic graph (DAG) of chemical compounds to select the most promising candidate to explore next. The atom-mapping algorithm and the one-step reaction proposer were each benchmarked against previous studies and showed better results. The final product was demonstrated on retrosynthesis routes that were reviewed and rated by human experts, showing satisfactory functionality and a potential productivity boost in real-life use cases.
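To make the interplay between a one-step proposer and a guided search concrete, the following is a minimal sketch, not the ChemiRise implementation. It assumes a generic best-first search as a stand-in for the guiding algorithm, which the abstract does not specify. The names `TOY_PROPOSALS`, `BUILDING_BLOCKS`, `propose`, and `best_first_route` are hypothetical, compounds are plain strings rather than molecular graphs, and the scores are invented for illustration.

```python
import heapq
from itertools import count

# Hypothetical stand-in for a trained one-step proposer: each product maps to
# a list of (score, precursors) pairs. The values are invented toy data.
TOY_PROPOSALS = {
    "amide": [(0.9, ("acid", "amine")), (0.4, ("acyl_chloride", "amine"))],
    "acid": [(0.8, ("ester",))],
    "acyl_chloride": [(0.7, ("acid",))],
}

# Hypothetical set of purchasable compounds that need no further expansion.
BUILDING_BLOCKS = frozenset({"amine", "ester"})


def propose(compound):
    """Return scored one-step disconnections for a compound (toy lookup)."""
    return TOY_PROPOSALS.get(compound, [])


def best_first_route(target, max_expansions=100):
    """Search for a route that reduces the target to building blocks.

    Each search state is the set of compounds still to be made. States are
    expanded in order of the product of step scores (highest first).
    Returns (route, score) or None if no route is found within the budget.
    """
    tie = count()  # tie-breaker so the heap never compares sets
    start = frozenset([target]) - BUILDING_BLOCKS
    heap = [(-1.0, next(tie), start, ())]
    seen = set()
    for _step in range(max_expansions):
        if not heap:
            return None
        neg_score, _, open_set, route = heapq.heappop(heap)
        if not open_set:
            return list(route), -neg_score  # every leaf is a building block
        if open_set in seen:
            continue  # already expanded this state with a better score
        seen.add(open_set)
        compound = min(open_set)  # deterministic choice of what to expand
        for prob, precursors in propose(compound):
            remaining = open_set - {compound}
            new_open = remaining | (frozenset(precursors) - BUILDING_BLOCKS)
            new_route = route + ((compound, precursors),)
            heapq.heappush(heap, (neg_score * prob, next(tie), new_open, new_route))
    return None


if __name__ == "__main__":
    result = best_first_route("amide")
    if result is not None:
        route, score = result
        for product, precursors in route:
            print(product, "<=", " + ".join(precursors))
```

In this toy setting the search prefers the higher-scoring disconnection of `amide` first and stops once every remaining compound is in `BUILDING_BLOCKS`. A real system would replace `propose` with a learned model and would likely use a more elaborate scoring rule than a product of step scores.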