molgrad

Supporting code for: Jiménez-Luna et al.'s "Coloring molecules with explainable artificial intelligence for preclinical relevance assessment", available as a preprint in ChemRxiv

Installation

The recommended way to use this code is through the Anaconda Python distribution. The repository provides a conda environment file, which should work on most *nix systems:

conda env create -f environment.yml

To use the graph neural-network models that were trained for the manuscript (plasma protein binding, Caco-2 passive permeability, hERG & CYP3A4 inhibition), you need to download them from:

wget https://polybox.ethz.ch/index.php/s/dDDMzi3rTbqkWOV/download -O models.tar.gz
tar -xf models.tar.gz

Then activate the environment and prepend the folder to your PYTHONPATH environment variable:

conda activate molgrad
export PYTHONPATH=/path_to_repo_root/:$PYTHONPATH
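To confirm that the PYTHONPATH change took effect, a quick check from Python can help. This is a minimal sketch rather than part of the repository: `is_importable` is a hypothetical helper, and the importable package name `molgrad` is an assumption based on the repository layout.

```python
import importlib.util


def is_importable(package_name):
    """Return True if Python can locate the named top-level package."""
    return importlib.util.find_spec(package_name) is not None


# Assumption: the package is importable as "molgrad" once the repository
# root is on PYTHONPATH.
print("molgrad importable:", is_importable("molgrad"))
```

If this prints False, the exported path most likely does not point at the repository root.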

(Optional) Download datasets

All the training data used in this study can be freely downloaded from:

wget https://polybox.ethz.ch/index.php/s/K0orABbeJmwOUEh/download -O data.tar.gz
tar -xf data.tar.gz

Usage

To generate explanations for a particular molecule with a trained model, call the main.py script. A CUDA-capable GPU is recommended but not required:

python molgrad/main.py -model_path model_weights.pt -smi SMILES -output_f RESULT_DIR

For instance, to obtain feature colorings for nicotine with the pre-trained hERG inhibition model and store them in a subfolder of your home directory named results, run:

python molgrad/main.py -model_path molgrad/models/hERG_noHs.pt -smi "CN1CCCC1C2=CN=CC=C2" -output_f $HOME/results/
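When coloring several molecules from a Python session, the same call can be wrapped with the standard library. The sketch below is not part of molgrad: `build_command` and `color_molecule` are hypothetical helper names, only the three flags shown above are used, and it assumes the working directory is the repository root with the molgrad environment active.

```python
import subprocess
import sys


def build_command(model_path, smiles, output_dir, script="molgrad/main.py"):
    """Assemble the argument list for one main.py call (hypothetical helper)."""
    return [
        sys.executable, str(script),
        "-model_path", str(model_path),
        "-smi", smiles,
        "-output_f", str(output_dir),
    ]


def color_molecule(model_path, smiles, output_dir):
    """Run main.py for a single SMILES string; raises if the script fails."""
    subprocess.run(build_command(model_path, smiles, output_dir), check=True)


# Example call (nicotine, pre-trained hERG model), left commented out so that
# importing this snippet has no side effects:
# color_molecule("molgrad/models/hERG_noHs.pt", "CN1CCCC1C2=CN=CC=C2", "results/")
```

Passing the arguments as a list rather than a shell string avoids quoting problems, since SMILES strings often contain characters such as parentheses and `#` that a shell would interpret.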

This will create a comma-separated file global.csv in that folder, containing feature attributions for the global variables (i.e. molecular weight, log P, TPSA, and number of hydrogen-bond donors). A subfolder named svg will also be created, containing the generated feature colorings.
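The global attributions can then be inspected with a few lines of Python. This is a minimal sketch that assumes the first row of global.csv is a header; the exact column names are not documented here, so the code simply returns whatever columns it finds. `read_global_attributions` is a hypothetical helper name.

```python
import csv
from pathlib import Path


def read_global_attributions(output_dir):
    """Return the rows of global.csv as a list of dicts keyed by column name."""
    csv_path = Path(output_dir) / "global.csv"
    with csv_path.open(newline="") as handle:
        # Assumption: the file starts with a header row.
        return list(csv.DictReader(handle))
```

For the nicotine example, `read_global_attributions(Path.home() / "results")` would return one dictionary per row of the file, with all values as strings.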

Further parameters (such as passing an entire .smi file for batch prediction and coloring) are listed in the built-in help:

python molgrad/main.py --help

(Optional) Train your own models:

The current framework also provides functionality for model training using custom data with the train_ext.py script. It assumes training data comes in a comma-separated (.csv) file, with one column carrying SMILES and another the target value, whose names need to be specified. For instance:

python molgrad/train_ext.py -data CSV_FILE -smiles_col "SMILES_COL" -target_col "TARGET_COL" -output path_to_weights.pt
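A training file in the layout described above can be produced with the standard library. The following is a sketch under stated assumptions: `write_training_csv` is a hypothetical helper, the column names `smiles` and `target` are arbitrary choices, and the values in the usage comment are placeholders rather than real measurements.

```python
import csv


def write_training_csv(path, records, smiles_col="smiles", target_col="target"):
    """Write (SMILES, value) pairs to a two-column CSV with a header row."""
    with open(path, "w", newline="") as handle:
        writer = csv.writer(handle)
        writer.writerow([smiles_col, target_col])
        for smiles, value in records:
            # Coerce to float so non-numeric targets fail early.
            writer.writerow([smiles, float(value)])


# Placeholder usage with made-up target values:
# write_training_csv("train.csv", [("CCO", 1.0), ("c1ccccc1", 0.5)])
```

The column names chosen here would then be passed to train_ext.py through -smiles_col and -target_col.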

The trained model can then be used to color molecules via the main.py routine as described above. Additional training options are listed with:

python molgrad/train_ext.py --help

Data collection for XAI model validation

A comma-separated file with examples drawn from the literature to validate this and other XAI approaches can be downloaded from here.

Citation

If you use this code (or parts thereof), please use the following BibTeX entry:

@article{jimenez2020color,
author = "Jose Jimenez-Luna and Miha Skalic and Nils Weskamp and Gisbert Schneider",
title = "{Coloring Molecules with Explainable Artificial Intelligence for Preclinical Relevance Assessment}",
year = "2020",
month = "11",
url = "https://chemrxiv.org/articles/preprint/Coloring_Molecules_with_Explainable_Artificial_Intelligence_for_Preclinical_Relevance_Assessment/13252286",
doi = "10.26434/chemrxiv.13252286.v1"
}
