Translating 'reagents and reactants' to 'products'.
[ArXiv] Linking the Neural Machine Translation and the Prediction of Organic Chemistry Reactions
Finding the main product of a chemical reaction is one of the important problems of organic chemistry. This paper describes a method of applying a neural machine translation model to the prediction of organic chemical reactions. In order to translate 'reactants and reagents' to 'products', a gated recurrent unit based sequence-to-sequence model and a parser to generate input tokens for model from reaction SMILES strings were built. Training sets are composed of reactions from the patent databases, and reactions manually generated applying the elementary reactions in an organic chemistry textbook of Wade. The trained models were tested by examples and problems in the textbook. The prediction process does not need manual encoding of rules (e.g., SMARTS transformations) to predict products, hence it only needs sufficient training reaction sets to learn new types of reactions.
The code in this repository is partly based on: