LG, Dec 29, 2016

Linking the Neural Machine Translation and the Prediction of Organic Chemistry Reactions

arXiv:1612.09529v19.387 citations

Originality: Synthesis-oriented

AI Analysis

This paper addresses the challenge of automating reaction prediction in organic chemistry and offers a data-driven alternative to rule-based methods. It appears incremental, however, because it adapts existing neural machine translation techniques to a new domain.

The paper tackles the problem of predicting the main product of an organic chemical reaction by applying a neural machine translation model that translates reactants and reagents into products. The resulting method learns from training sets without manual rule encoding.

Finding the main product of a chemical reaction is one of the important problems of organic chemistry. This paper describes a method of applying a neural machine translation model to the prediction of organic chemical reactions. To translate 'reactants and reagents' to 'products', the authors built a gated recurrent unit (GRU) based sequence-to-sequence model and a parser that generates input tokens for the model from reaction SMILES strings. The training sets are composed of reactions from patent databases and reactions generated manually by applying the elementary reactions in Wade's organic chemistry textbook. The trained models were tested on examples and problems from the textbook. The prediction process does not need manually encoded rules (e.g., SMARTS transformations) to predict products; it only needs sufficient training reaction sets to learn new types of reactions.
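To make the 'reactants and reagents' to 'products' framing concrete, the following is a minimal sketch of how a reaction SMILES string might be split into source and target token sequences for a sequence-to-sequence model. It is not the parser from the paper: the regular expression, the function names `tokenize_smiles` and `split_reaction`, and the choice to keep `>` as a separator token are all assumptions made for illustration, and the pattern covers only common organic-subset atoms.

```python
import re

# Assumed token pattern (not from the paper): bracket atoms, two-letter
# halogens, two-digit ring closures, single-letter atoms, bonds and
# branches, and single-digit ring closures.
SMILES_TOKEN = re.compile(
    r"\[[^\]]+\]|Br|Cl|%\d{2}|[BCNOSPFIbcnosp]|[()=#\-+\\/:~@?*$.]|\d"
)


def tokenize_smiles(smiles):
    """Split one SMILES string into a list of tokens."""
    tokens = SMILES_TOKEN.findall(smiles)
    # Refuse inputs containing characters the pattern does not cover,
    # rather than silently dropping them.
    if "".join(tokens) != smiles:
        raise ValueError(f"unrecognized characters in SMILES: {smiles!r}")
    return tokens


def split_reaction(reaction_smiles):
    """Turn 'reactants>reagents>products' into (source, target) tokens."""
    parts = reaction_smiles.split(">")
    if len(parts) != 3:
        raise ValueError("expected the form reactants>reagents>products")
    reactants, reagents, products = parts
    # Source side: reactants and reagents, separated by a '>' token.
    source = tokenize_smiles(reactants) + [">"] + tokenize_smiles(reagents)
    # Target side: the product the model is trained to emit.
    target = tokenize_smiles(products)
    return source, target


if __name__ == "__main__":
    src, tgt = split_reaction("CCBr.[OH-]>O>CCO")
    print(src)
    print(tgt)
```

In this sketch the source sequence would be fed to the encoder and the target sequence used as the decoder's training output. Treating `Br` and bracket atoms such as `[OH-]` as single tokens keeps the vocabulary aligned with chemical units instead of raw characters.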
