DeepDrugCoder (DDC): Heteroencoder for molecular encoding and de novo generation

License: GPL v3 | Python 3.6 | Code style: black

[UPDATE] 30-10-2019: The code now only supports tensorflow-gpu >= 2.0.


Code accompanying the manuscript "Direct Steering of de novo Molecular Generation using Descriptor Conditional Recurrent Neural Networks (cRNNs)".

Cheers if you were brought here by this blog post. If not, give it a read :)


Deep learning has acquired considerable momentum over the past couple of years in the domain of de novo drug design. In particular, transfer and reinforcement learning have demonstrated the capability of steering the generative process towards chemical regions of interest. In this work, we propose a simple approach to the focused generative task by constructing a conditional recurrent neural network (cRNN). For this purpose, we aggregate selected molecular descriptors along with a QSAR-based bioactivity label and transform them into initial LSTM states before starting the generation of SMILES strings that are focused towards the desired properties. We thus tackle the inverse QSAR problem directly by training on molecular descriptors, instead of iteratively optimizing around a set of candidate molecules. The trained cRNNs are able to generate molecules near multiple specified conditions, while maintaining an output that is more focused than traditional RNNs yet less focused than autoencoders. The method shows promise for applications in both scaffold hopping and ligand series generation, depending on whether the cRNN is trained on calculated scalar molecular properties or structural fingerprints. This also demonstrates that fingerprint-to-molecule decoding is feasible, leading to molecules that are similar, if not identical, to the ones the fingerprints originated from. Additionally, the cRNN is able to generate a larger fraction of predicted active compounds against the DRD2 receptor when compared to an RNN trained with the transfer learning model.
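
The conditioning step described above (descriptors plus a bioactivity label turned into initial LSTM states) can be pictured with a minimal, standard-library-only sketch. It is not the DDC implementation: the layer size, the tanh activation, the random weights, the example condition values, and the helper names `dense`, `random_layer`, and `initial_states` are all assumptions made for illustration.

```python
import math
import random


def dense(inputs, weights, biases):
    """Fully connected layer with an assumed tanh activation: tanh(W @ x + b)."""
    return [
        math.tanh(sum(w * x for w, x in zip(row, inputs)) + b)
        for row, b in zip(weights, biases)
    ]


def random_layer(n_out, n_in, rng):
    """Hypothetical untrained parameters, used only to make the sketch runnable."""
    weights = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)] for _ in range(n_out)]
    biases = [0.0] * n_out
    return weights, biases


def initial_states(conditions, h_layer, c_layer):
    """Map one condition vector to the (h0, c0) pair that would seed a decoder LSTM."""
    h0 = dense(conditions, *h_layer)
    c0 = dense(conditions, *c_layer)
    return h0, c0


rng = random.Random(0)
lstm_dim = 8  # assumed size, not taken from the DDC code

# Assumed, already-scaled conditions: three scalar descriptors followed by
# a QSAR-predicted activity probability.
conditions = [0.3, -1.2, 0.8, 0.95]

h_layer = random_layer(lstm_dim, len(conditions), rng)
c_layer = random_layer(lstm_dim, len(conditions), rng)
h0, c0 = initial_states(conditions, h_layer, c_layer)

print(len(h0), len(c0))
```

In a trained model the two layers would be learned jointly with the decoder, so that changing the condition vector shifts which SMILES strings the decoder is likely to produce.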

Currently, only the GPU version of the model is supported, so you need access to a GPU to use it.

More detailed instructions will be added soon. In the meantime, please refer to the demo notebooks for usage details.

Figure from manuscript


Custom Dependencies

Installation

  • Install git-lfs as instructed here. This is necessary in order to download the datasets.
  • Clone the repo and navigate to it.
  • Create the predefined Python 3.6 conda environment with conda env create -f env/ddc_env.yml. This ensures that you have the correct versions of RDKit and cudatoolkit.
  • Run pip install . to install remaining dependencies and add the package to the Python path.
  • Add the environment to the Jupyter kernel drop-down list with python -m ipykernel install --user --name ddc_env --display-name "ddc_env (python_3.6.7)".

Usage

Activate the environment in a shell:

conda activate ddc_env

Then import the package in Python:

from ddc_pub import ddc_v3 as ddc

API

  • fit(): Fit a DDC model to the dataset.
  • vectorize(): Convert a binary RDKit molecule to its one-hot encoded representation.
  • transform(): Encode a vectorized molecule to its latent representation.
  • predict(): Decode a latent representation into a SMILES string and return its Negative Log Likelihood (NLL).
  • predict_batch(): Decode a list of latent representations into SMILES strings and return their NLLs.
  • get_smiles_nll(): Back-calculate the NLL of a known SMILES string, as if it were sampled by the biased decoder.
  • get_smiles_nll_batch(): Back-calculate the NLLs of a batch of known SMILES strings, as if they were sampled by the biased decoder.
  • summary(): Display essential architectural parameters.
  • get_graphs(): Export model graphs to .png files using pydot and graphviz (might fail).
  • save(): Save the model as a .zip archive.
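
Two ideas behind the list above, one-hot encoding of SMILES and the negative log likelihood (NLL) of a decoded string, can be sketched without the package itself. The snippet below does not call the DDC API. The character set, the start and end markers `^` and `$`, the padding scheme, and the helper names `one_hot_smiles` and `sequence_nll` are hypothetical, and the per-step probabilities are made-up numbers standing in for decoder outputs.

```python
import math

# Hypothetical, tiny character set; a real model derives its own from the training data.
CHARSET = ["^", "$", "C", "O", "N", "(", ")", "=", "1", "c"]
CHAR_TO_INT = {ch: i for i, ch in enumerate(CHARSET)}


def one_hot_smiles(smiles, maxlen):
    """Encode '^' + smiles + '$' as a maxlen x len(CHARSET) one-hot matrix.

    Characters are treated one at a time (so two-letter atoms such as Cl
    would need a smarter tokenizer) and short strings are padded with '$'.
    """
    tokens = ["^"] + list(smiles) + ["$"]
    if len(tokens) > maxlen:
        raise ValueError("SMILES string is longer than maxlen allows")
    tokens += ["$"] * (maxlen - len(tokens))
    matrix = []
    for token in tokens:
        row = [0] * len(CHARSET)
        row[CHAR_TO_INT[token]] = 1
        matrix.append(row)
    return matrix


def sequence_nll(step_probabilities):
    """NLL of a sequence: minus the sum of the log probability of each emitted token."""
    return -sum(math.log(p) for p in step_probabilities)


encoded = one_hot_smiles("CCO", maxlen=6)

# Made-up probabilities the decoder might assign to 'C', 'C', 'O', '$' in turn.
nll = sequence_nll([0.9, 0.8, 0.95, 0.99])

print(len(encoded), "rows of", len(encoded[0]), "columns")
print("NLL:", nll)
```

A lower NLL means the decoder considers the string more likely under the given conditions, which is why the value is useful for ranking sampled or known SMILES strings.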

Issues

Please report all installation / usage issues by opening an issue at this repo.

  • We have noticed erroneous behavior of some functions with numpy.__version__==1.17.2. Please stick to 1.16.5 for now.
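
The version constraints mentioned in this README (NumPy 1.16.5 recommended, tensorflow-gpu >= 2.0) can be checked before running anything heavier. The following is a minimal sketch, not part of the package: the helpers `version_tuple` and `check_versions` are hypothetical names, and the version strings are passed in by hand so the snippet runs without either library installed.

```python
def version_tuple(version):
    """Turn '1.16.5' into (1, 16, 5), stopping at the first non-numeric part."""
    parts = []
    for piece in version.split("."):
        digits = ""
        for ch in piece:
            if not ch.isdigit():
                break
            digits += ch
        if not digits:
            break
        parts.append(int(digits))
    return tuple(parts)


def check_versions(numpy_version, tensorflow_version):
    """Return a list of human-readable problems; an empty list means no complaints."""
    problems = []
    if version_tuple(numpy_version) != (1, 16, 5):
        problems.append(
            "numpy %s found; this README recommends 1.16.5" % numpy_version
        )
    if version_tuple(tensorflow_version) < (2, 0):
        problems.append(
            "tensorflow %s found; the code supports only >= 2.0" % tensorflow_version
        )
    return problems


# In a real environment the strings would come from numpy.__version__
# and tensorflow.__version__.
for problem in check_versions("1.17.2", "2.0.0"):
    print(problem)
```

Returning a list rather than raising keeps the check usable both in a notebook cell and in a setup script that decides for itself whether to stop.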

Contributors

ebjerrum, pcko1
