Mol-CycleGAN - a generative model for molecular optimization

Official implementation of Mol-CycleGAN for molecular optimization.

The Keras CycleGAN implementation is based on [tjwei/GANotebooks].

Requirements

We highly recommend using conda for package management; an environment.yml file is provided.

The environment can be created by running:

conda env create -f environment.yml

We use the Junction Tree Variational Autoencoder (JT-VAE) implementation as a submodule in the Mol-CycleGAN code. After cloning this repo, execute the following script before running the code:

./scripts/init_repo.sh 

Datasets

We provide the user with all datasets needed to reproduce the aromatic rings experiments.

To download all the input data (the ZINC 250k dataset and the related JT-VAE encodings), run:

./scripts/download_input_data.sh

To download all the data from the aromatic rings experiments (train / test splits of the datasets, molecules returned by Mol-CycleGAN, and the related SMILES), run:

./scripts/download_ar_data.sh

Basic use

This code is an implementation of CycleGAN for molecular optimization.
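As background, the core idea of any CycleGAN is cycle consistency: translating a sample to the other domain and back should return something close to the original. The following is a minimal pure-Python sketch of that general loss term on toy vectors. It is not the Keras code in this repository; the function names, the toy mappings `g` and `f`, and the vector sizes are all hypothetical and chosen only for illustration.

```python
from typing import Callable, List, Sequence

Vector = List[float]
Mapping = Callable[[Vector], Vector]


def l1_distance(a: Vector, b: Vector) -> float:
    """Mean absolute difference between two vectors of equal length."""
    return sum(abs(p - q) for p, q in zip(a, b)) / len(a)


def cycle_consistency_loss(
    xs: Sequence[Vector], ys: Sequence[Vector], g: Mapping, f: Mapping
) -> float:
    """Generic CycleGAN cycle term: distance after X -> Y -> X plus Y -> X -> Y."""
    forward = sum(l1_distance(f(g(x)), x) for x in xs) / len(xs)
    backward = sum(l1_distance(g(f(y)), y) for y in ys) / len(ys)
    return forward + backward


# Toy data standing in for encodings from the two domains (hypothetical values).
xs = [[0.0, 1.0], [2.0, 3.0]]
ys = [[4.0, 5.0], [6.0, 7.0]]


def g(v: Vector) -> Vector:
    """Toy generator X -> Y: shift every coordinate by +1."""
    return [c + 1.0 for c in v]


def f(v: Vector) -> Vector:
    """Toy generator Y -> X: the exact inverse of g."""
    return [c - 1.0 for c in v]


def f_identity(v: Vector) -> Vector:
    """A poor Y -> X generator that does not undo g."""
    return list(v)


perfect = cycle_consistency_loss(xs, ys, g, f)
imperfect = cycle_consistency_loss(xs, ys, g, f_identity)
```

When `f` exactly inverts `g`, the round trip reproduces the input and the loss is zero; when it does not, the loss grows with the reconstruction error.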

To train the model, run:

python train.py

with specified training parameters.

After the model is trained and the test-set translation is generated, the molecules must be decoded with the JT-VAE code. To do this, run:

python decode.py

with specified decoding parameters.

Experiments

We provide all the data and code needed to reproduce the aromatic rings experiment.

  1. The notebook data/input_data/aromatic_rings/datasets_generator_aromatic_rings.ipynb contains the data factory: the code that creates the train and test sets used in the experiment.

  2. To train the model, run ./scripts/run_aromatic_rings_training.sh. It calls the train.py script with base parameters set for the aromatic rings data.

  3. To decode the molecules, run ./scripts/run_aromatic_rings_decoding.sh. It calls the decode.py script with base parameters set for the aromatic rings data.

  4. The analysis of the output is provided in the notebook experiments/aromatic_rings.ipynb.
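The scripted parts of this workflow could be chained from Python, roughly as sketched below. The script paths are the ones named in this README; the ordering of the setup and download scripts relative to training, and the helper names `REPRODUCTION_STEPS` and `run_steps`, are assumptions made for illustration. The two notebooks (steps 1 and 4) are not included and would still be run by hand.

```python
import subprocess
from typing import List

# Script paths as named in this README. The order is an assumption:
# set up the submodule, fetch the data, then train, then decode.
REPRODUCTION_STEPS: List[str] = [
    "./scripts/init_repo.sh",
    "./scripts/download_input_data.sh",
    "./scripts/download_ar_data.sh",
    "./scripts/run_aromatic_rings_training.sh",
    "./scripts/run_aromatic_rings_decoding.sh",
]


def run_steps(steps: List[str], dry_run: bool = True) -> List[str]:
    """Run each script in order and stop at the first failure.

    With dry_run=True nothing is executed; the commands are only printed.
    Returns the list of scripts that were run (or would have been run).
    """
    handled: List[str] = []
    for script in steps:
        if dry_run:
            print("would run:", script)
        else:
            # check=True raises CalledProcessError if the script exits non-zero.
            subprocess.run([script], check=True)
        handled.append(script)
    return handled


if __name__ == "__main__":
    # Default to a dry run; pass dry_run=False from the repository root to execute.
    run_steps(REPRODUCTION_STEPS, dry_run=True)
```

A dry run is the default so that the sequence can be inspected before any download or training starts.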

Disclaimer

The code for Mol-CycleGAN was originally written in Python 3, but the JT-VAE package is written in Python 2. For ease of use, we downgraded package versions so that the entire experiment can be run in a single environment. Since many of those packages are outdated, we strongly recommend using the provided environment.yml file to construct the working environment.

mol-cycle-gan's Issues

How to create train data with another dataset

Hi,

Could your team explain how to create training data (the related JT-VAE encodings) from another dataset (e.g. Tox21)?
I want to understand how to associate JT-VAE encodings with each molecule.
