explainable-deep-dt-representations's Introduction

Explainable Deep Drug-Target Representations for Binding Affinity Prediction

We explore how reliably Convolutional Neural Networks (CNNs) identify the regions that are important for binding, and we assess the significance of the deep representations by explaining the model's decisions in terms of the input regions that contributed most to the prediction. We also implement an end-to-end deep learning architecture for binding affinity prediction, in which CNNs automatically learn and extract discriminating deep representations from 1D sequential and structural data.

End-to-End Deep Learning Architecture: Convolutional Neural Networks + Feed-Forward Fully Connected Neural Network

Chemogenomic Representative K-Fold

Regression Discriminative Localization Map

3D Docking Visualization

  • Potential Binding Sites (≤ 5 Å) : Green

  • L-Grad-RAM Hits : Blue

  • Matched Binding - L-Grad-RAM Hits : Red

ABL1(E255K)-phosphorylated - SKI-606

DDR1 - Foretinib

Binding Affinity Prediction Model

  • Two Parallel Convolutional Neural Networks + Fully Connected Neural Network

Gradient-Weighted Regression Activation Mapping (Grad-RAM)

  • Global Max Pooling + Guided Gradients
  • Global Max Pooling + Non-Guided Gradients
  • Global Average Pooling + Guided Gradients
  • Global Average Pooling + Non-Guided Gradients
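
The four variants above combine a pooling choice with a gradient choice. As a rough illustration only, and assuming the method follows the familiar Grad-CAM recipe applied to a regression output, the sketch below reduces the gradients of each filter to a single weight (by average or max pooling over positions), takes the weighted sum of the feature maps at every sequence position, and applies a ReLU. The function name, the toy inputs, and the reading of "pooling" as acting on the gradients are assumptions, not the repository's implementation. Guided gradients are not shown.

```python
from typing import List


def grad_ram_map(activations: List[List[float]],
                 gradients: List[List[float]],
                 pooling: str = "average") -> List[float]:
    """Hypothetical Grad-CAM-style map for a 1D convolutional layer.

    Both inputs have shape [positions][filters]: the layer's feature maps
    and the gradients of the predicted affinity with respect to them.
    """
    n_pos = len(activations)
    n_filters = len(activations[0])

    # One weight per filter, obtained by pooling its gradients over positions.
    weights = []
    for k in range(n_filters):
        column = [gradients[i][k] for i in range(n_pos)]
        if pooling == "average":
            weights.append(sum(column) / n_pos)
        elif pooling == "max":
            weights.append(max(column))
        else:
            raise ValueError("pooling must be 'average' or 'max'")

    # Weighted sum of feature maps per position, followed by a ReLU.
    return [
        max(0.0, sum(w * a for w, a in zip(weights, activations[i])))
        for i in range(n_pos)
    ]
```

Positions with larger values in the returned list would be the ones the sketch treats as contributing most to the prediction.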

Davis Kinase Binding Affinity

Dataset

  • davis_original_dataset: original dataset
  • davis_dataset_processed: processed dataset: protein sequences + RDKit SMILES strings + pKd values
  • deep_features_dataset: CNN deep representations: protein + SMILES deep representations
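
For reference, pKd is the negative base-10 logarithm of the dissociation constant Kd expressed in molar units. The sketch below assumes Kd is given in nM, which is how the Davis data are usually distributed; the function name is hypothetical, and the exact processing used to build davis_dataset_processed may differ.

```python
import math


def kd_nm_to_pkd(kd_nm: float) -> float:
    """Convert a dissociation constant in nM to pKd = -log10(Kd in M)."""
    if kd_nm <= 0:
        raise ValueError("Kd must be positive")
    return -math.log10(kd_nm / 1e9)
```

Under this convention a Kd of 10000 nM corresponds to a pKd of 5, and stronger binders (smaller Kd) get larger pKd values.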

Clusters

  • test_cluster: independent test set indices
  • train_cluster_X: train indices
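
One way such cluster files could be turned into cross-validation folds is sketched below. It assumes each train_cluster_X holds a list of row indices and that every cluster is held out once for validation while the others are used for training; the function name and the in-memory dictionary are hypothetical, and the repository may organise its folds differently.

```python
from typing import Dict, List, Tuple


def cluster_folds(clusters: Dict[str, List[int]]) -> List[Tuple[List[int], List[int]]]:
    """Build (train_indices, validation_indices) pairs, one per cluster."""
    folds = []
    for held_out in sorted(clusters):
        # The held-out cluster becomes the validation fold.
        val_idx = sorted(clusters[held_out])
        # All remaining clusters are merged into the training fold.
        train_idx = sorted(
            i for name, idx in clusters.items() if name != held_out for i in idx
        )
        folds.append((train_idx, val_idx))
    return folds
```

The independent test set (test_cluster) stays outside this loop and is only used for the final evaluation.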

Similarity

  • protein_sw_score: protein Smith-Waterman similarity scores
  • protein_sw_score_norm: protein Smith-Waterman similarity normalized scores
  • smiles_ecfp6_tanimoto_sim: SMILES Morgan radius 3 similarity scores
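
The two similarity measures can be illustrated without any cheminformatics dependency. In the sketch below, fingerprints are represented as plain sets of on-bit indices (a toy stand-in for Morgan radius 3 fingerprints), and the Smith-Waterman normalization divides a pairwise score by the geometric mean of the two self-alignment scores. That normalization is a common convention and is assumed here; the source does not state which one was used. Function names are hypothetical.

```python
import math
from typing import Set


def tanimoto(bits_a: Set[int], bits_b: Set[int]) -> float:
    """Tanimoto similarity: shared on-bits divided by the union of on-bits."""
    if not bits_a and not bits_b:
        return 0.0  # convention chosen for two empty fingerprints
    return len(bits_a & bits_b) / len(bits_a | bits_b)


def normalize_sw(score_ab: float, score_aa: float, score_bb: float) -> float:
    """Assumed normalization: SW(a, b) / sqrt(SW(a, a) * SW(b, b))."""
    return score_ab / math.sqrt(score_aa * score_bb)
```

With this normalization a protein compared with itself scores 1, which puts the protein scores on the same 0 to 1 scale as the Tanimoto similarity.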

Binding

  • davis_scpdb_binding: davis-scpdb matching pairs binding information

PSSM

  • pssm_X: davis-scpdb matching pairs PSSM

sc-PDB Pairs

Binding

  • scpdb_binding: scpdb pairs binding information

PSSM

  • pssm_X: scpdb pairs PSSM

Dictionaries

  • davis_prot_dictionary: amino acid (AA) char-integer dictionary
  • davis_smiles_dictionary: SMILES char-integer dictionary
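
A char-integer dictionary of this kind is typically used to turn each sequence into a fixed-length vector of integers before it reaches the CNN. The sketch below shows the idea with a toy three-letter dictionary; the function name, the dictionary values, the zero padding, and the truncation to max_len are all assumptions rather than the repository's exact preprocessing.

```python
from typing import Dict, List


def encode_sequence(seq: str,
                    char_to_int: Dict[str, int],
                    max_len: int,
                    pad_value: int = 0) -> List[int]:
    """Map each character to its integer code, then truncate or pad to max_len."""
    # A character missing from the dictionary raises KeyError on purpose.
    encoded = [char_to_int[ch] for ch in seq[:max_len]]
    return encoded + [pad_value] * (max_len - len(encoded))


# Toy dictionary for illustration only; the real one covers every amino acid.
toy_prot_dictionary = {"M": 1, "L": 2, "E": 3}
padded = encode_sequence("MLE", toy_prot_dictionary, max_len=5)
```

The same function would apply to SMILES strings with the SMILES dictionary in place of the protein one.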

State-of-the-Art Baselines Data

Davis Kinase Binding Affinity Dataset + Clusters in the SOTA method format

Docking

  • abl1_pymol.pse: ABL1(E255K)-phosphorylated - SKI-606 PyMOL Session
  • ddr1_pymol.pse: DDR1 - Foretinib PyMOL Session

Requirements:

  • Python 3.7.9
  • TensorFlow 2.4.1
  • NumPy
  • Pandas
  • Scikit-learn
  • itertools (Python standard library)
  • Matplotlib
  • Seaborn
  • glob (Python standard library)
  • json (Python standard library)

Usage:

Binding Affinity Prediction

Training

python cnn_fcnn_model.py --option Training --num_cnn_layers_prot 3 --prot_filters 64 64 128 --prot_filters_w 4 4 5 --num_cnn_layers_smiles 3 --smiles_filters 64 64 128 --smiles_filters_w 4 4 5 --num_fcnn_layers 3 --fcnn_units 1024 512 1024 --drop_rate 0.5 0.1 --lr_rate 0.0001 

Validation

python cnn_fcnn_model.py --option Validation --num_cnn_layers_prot 3 --prot_filters 64 64 128 --prot_filters_w 4 4 5 --num_cnn_layers_smiles 3 --smiles_filters 64 64 128 --smiles_filters_w 4 4 5 --num_fcnn_layers 3 --fcnn_units 1024 512 1024 --drop_rate 0.5 0.1 --lr_rate 0.0001 

Evaluation

python cnn_fcnn_model.py --option Evaluation
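
The training and validation commands above pass one value per layer to flags such as --prot_filters and --prot_filters_w, so the number of values should match the declared layer count. The sketch below shows how flags of this shape could be parsed and checked with argparse. It covers only a subset of the flags, and the parser layout and the check_layer_counts helper are hypothetical; the actual parsing code in cnn_fcnn_model.py may differ.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Hypothetical parser mirroring a subset of the flags shown above."""
    parser = argparse.ArgumentParser(description="Sketch of the model CLI")
    parser.add_argument("--option", required=True,
                        choices=["Training", "Validation", "Evaluation"])
    parser.add_argument("--num_cnn_layers_prot", type=int)
    # nargs="+" collects the space-separated values into a list.
    parser.add_argument("--prot_filters", type=int, nargs="+")
    parser.add_argument("--prot_filters_w", type=int, nargs="+")
    parser.add_argument("--drop_rate", type=float, nargs="+")
    parser.add_argument("--lr_rate", type=float)
    return parser


def check_layer_counts(args: argparse.Namespace) -> None:
    """Fail early if a per-layer list does not match the layer count."""
    for name in ("prot_filters", "prot_filters_w"):
        values = getattr(args, name)
        if len(values) != args.num_cnn_layers_prot:
            raise ValueError(
                f"--{name} expects {args.num_cnn_layers_prot} values, "
                f"got {len(values)}"
            )
```

For example, --num_cnn_layers_prot 3 with --prot_filters 64 64 128 passes the check, while supplying only two filter values would raise an error before any training starts.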

Gradient-weighted Regression Activation Mapping (L-Grad-RAM)

Example

  • Protein Sequence : MLEICLKLVG...
  • SMILES String : Cc1cn(...
  • Window Length : 0 1 2 ...
  • Feature Importance Threshold : 0.3 0.4 0.5 ...
  • Binding Sites Positions : 5 10 50 ...

python gradram_testing.py --protein_sequence MLEICLKLVG... --smiles_string Cc1cn(... --window 0 1 2 ... --thresholds 0.3 0.4 0.5 ... --sites 5 10 50 ...
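
To make the roles of the three numeric arguments concrete, the sketch below shows one plausible way they could interact. It assumes that a position counts as a hit when its importance score reaches the threshold, and that a known binding site counts as matched when some hit lies within the window length of it. Both rules, the function names, and the 0-based positions are assumptions; gradram_testing.py may define hits and matches differently.

```python
from typing import List, Set


def find_hits(importance: List[float], threshold: float) -> Set[int]:
    """Positions whose importance score is at least the threshold."""
    return {i for i, value in enumerate(importance) if value >= threshold}


def matched_sites(hits: Set[int], sites: List[int], window: int) -> List[int]:
    """Binding sites that have at least one hit within `window` positions."""
    return [s for s in sites if any(abs(s - h) <= window for h in hits)]


# Toy importance profile over six positions (0-based), for illustration only.
toy_importance = [0.1, 0.6, 0.2, 0.05, 0.9, 0.3]
toy_hits = find_hits(toy_importance, threshold=0.5)
```

Under these assumptions, a larger window or a lower threshold can only increase the number of matched sites, which is why sweeping several values of each is informative.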
