Transformer for Retrosynthesis

The repository contains the source code and trained models for the paper "A Transformer model for retrosynthesis" https://link.springer.com/chapter/10.1007/978-3-030-30493-5_78. The retrosynthesis task is treated as a machine translation problem where the source language is a molecule one wants to synthesize, and the target language is the set of reactants suitable for the synthesis.

Dependencies

The code has been tested on Ubuntu and openSUSE Linux with the following versions of the major components:

  1. python v.3.4.6 or higher
  2. tensorflow v.1.12
  3. rdkit v.2018.09.2
  4. additional Python libraries: yaml, h5py, matplotlib, argparse, tqdm.
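
Before training, it can help to confirm that the listed components are importable. The following is a minimal sketch, not part of the repository: the helper name missing_modules and the list REQUIRED are hypothetical, and the import names are assumed to match the package names above. It only checks presence, not versions.

```python
import importlib.util

# Import names assumed for the components listed above (hypothetical list).
REQUIRED = ["tensorflow", "rdkit", "yaml", "h5py", "matplotlib", "tqdm"]


def missing_modules(names):
    """Return the module names that cannot be found on the current Python path."""
    return [name for name in names if importlib.util.find_spec(name) is None]


# An empty list means every listed module can be located.
print("missing:", missing_modules(REQUIRED))
```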

How to train models from scratch

To train a new model use the command:

python3 transformer.py --train=file_with_reactions

The format of the file is as follows. Each line contains a single reaction with one product. The product is written first, followed by all reactants and reagents. Sample files are located in the data subdirectory of the project. The model will be trained for one thousand epochs with a cyclic learning rate schedule (see the article). After training, the weights at epochs 600, 700, 800, 900, and 999 will be averaged, and the final model will be stored in the file final.h5 in the current directory. If your reaction dataset contains uncommon elements, you have to extend the model's vocabulary on line 53 of transformer.py.
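
The idea of averaging weights from several epochs can be illustrated with a short sketch. This is not the code from transformer.py: the function name average_checkpoints is hypothetical, and plain Python lists stand in for the weight tensors of each saved snapshot. Every snapshot is assumed to have the same layers with the same shapes.

```python
def average_checkpoints(checkpoints):
    """Element-wise mean of several weight snapshots.

    Each snapshot is a list of layers; each layer is a flat list of floats.
    """
    n = len(checkpoints)
    if n == 0:
        raise ValueError("need at least one checkpoint")
    averaged = []
    # zip(*checkpoints) groups the same layer across all snapshots.
    for layers in zip(*checkpoints):
        # zip(*layers) groups the same weight position across snapshots.
        averaged.append([sum(values) / n for values in zip(*layers)])
    return averaged


# Two toy snapshots with two "layers" each (made-up numbers).
snapshot_a = [[1.0, 2.0], [4.0]]
snapshot_b = [[3.0, 6.0], [0.0]]
print(average_checkpoints([snapshot_a, snapshot_b]))
```

With a Keras model, the same averaging would be applied to the arrays returned by get_weights() for each saved epoch.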

Using the trained models

To obtain predictions from a trained model, use the command:

python3 transformer.py --model=final.h5 --predict=file_with_products.smi --beam=5 --temperature=1.0

Setting the beam size to 1 gives greedy search. In our experience, beam sizes larger than 5 bring no noticeable improvement.
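
To show how the beam size and temperature options interact, here is a minimal beam search sketch over a toy next-token table. It is an illustration under stated assumptions, not the decoder in transformer.py: the names beam_search, apply_temperature, toy_next_probs, and the end symbol "$" are hypothetical, and temperature is assumed to rescale probabilities as p^(1/T) before renormalising.

```python
import math


def apply_temperature(probs, temperature):
    """Rescale a distribution as p ** (1 / T) and renormalise (assumed form)."""
    scaled = {tok: p ** (1.0 / temperature) for tok, p in probs.items()}
    total = sum(scaled.values())
    return {tok: p / total for tok, p in scaled.items()}


def beam_search(next_probs, beam, temperature=1.0, max_len=10, eos="$"):
    """Return finished sequences, best first; beam=1 is greedy search."""
    beams = [((), 0.0)]  # (tokens so far, log-probability)
    finished = []
    for _ in range(max_len):
        candidates = []
        for tokens, score in beams:
            probs = apply_temperature(next_probs(tokens), temperature)
            for tok, p in probs.items():
                if p > 0.0:
                    candidates.append((tokens + (tok,), score + math.log(p)))
        candidates.sort(key=lambda c: c[1], reverse=True)
        beams = []
        for tokens, score in candidates[:beam]:  # keep only the top `beam`
            if tokens[-1] == eos:
                finished.append((tokens, score))
            else:
                beams.append((tokens, score))
        if not beams:
            break
    finished.sort(key=lambda c: c[1], reverse=True)
    return ["".join(tokens[:-1]) for tokens, _ in finished]


# Toy model: the locally best first token "A" does not lead to the best sequence.
TOY = {
    (): {"A": 0.6, "B": 0.4},
    ("A",): {"C": 0.55, "D": 0.45},
    ("B",): {"E": 0.9, "F": 0.1},
}


def toy_next_probs(tokens):
    return TOY.get(tokens, {"$": 1.0})


print(beam_search(toy_next_probs, beam=1))
print(beam_search(toy_next_probs, beam=2))
```

In this toy case greedy search commits to "A" and ends with "AC" (probability 0.33), while a beam of 2 also keeps "B" alive and ranks "BE" (probability 0.36) first.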

Retrain the models

To retrain a model for a particular reaction dataset use the command:

python3 transformer.py --retrain=original.h5 --train=file_with_reactions

The new model will be trained for 100 epochs with a decreasing learning rate, and the weights of the last 10 epochs will be averaged to give the final weights. As before, the final model will be saved to final.h5 in the current directory.

Contributors

bigchem, carpovpv
