gsoc19's Introduction

CNN Scoring for Flexible Docking

Powered by MDAnalysis Powered by RDKit

Abstract

Molecular docking, the prediction of the binding modes and binding affinity of a molecule to a target of known structure, is a powerful computational tool for structure-based drug design. However, docking scoring functions are mostly empirical or knowledge-based, and most docking studies neglect receptor flexibility entirely. Recent advances have shown that scoring functions can be learnt effectively by convolutional neural networks (CNNs). Here we build on these findings to develop a CNN scoring function for flexible docking by extending the capabilities of gnina, a state-of-the-art deep learning framework for molecular docking, and by building a high-quality training dataset for flexible docking.

Project Description

This Google Summer of Code 2019 project aims to extend the capabilities of gnina, the deep learning framework for molecular docking developed in David Koes's group, to build a CNN-based scoring function for docking with flexible side chains.

The main stages of the project are the following:

  • Build a high-quality training dataset of docking with flexible side chains
  • Enable optimisation of flexible side chains (see PR #73)
    • Split ligand and receptor movable atoms in the correct channels
    • Combine ligand and receptor gradients for geometry optimisation
  • Train a new CNN-based scoring function for docking with flexible side chains (see mltraining/README.md)
    • Evaluate the performance of pose prediction
    • Evaluate the performance of pose optimisation
  • Iterate training on datasets augmented with CNN-optimized poses
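As a rough illustration of the "combine ligand and receptor gradients" stage above, the following sketch stacks per-atom gradients for the ligand and the flexible side-chain atoms and takes one steepest-descent step on both sets of coordinates. It is a toy Cartesian example written for this README, not gnina's implementation: the function names, the array layout (ligand atoms first, then receptor atoms) and the step size are all assumptions made for illustration.

```python
import numpy as np


def combine_gradients(lig_grad, flex_grad):
    """Stack ligand and flexible-residue gradients into one array.

    Both inputs are (n_atoms, 3) arrays of dE/dx. The output has shape
    (n_lig + n_flex, 3), ligand atoms first (an assumed convention).
    """
    lig_grad = np.asarray(lig_grad, dtype=float).reshape(-1, 3)
    flex_grad = np.asarray(flex_grad, dtype=float).reshape(-1, 3)
    return np.vstack([lig_grad, flex_grad])


def descent_step(lig_xyz, flex_xyz, lig_grad, flex_grad, step=0.01):
    """Take one steepest-descent step on all movable atoms at once."""
    lig_xyz = np.asarray(lig_xyz, dtype=float).reshape(-1, 3)
    flex_xyz = np.asarray(flex_xyz, dtype=float).reshape(-1, 3)

    coords = np.vstack([lig_xyz, flex_xyz])
    grad = combine_gradients(lig_grad, flex_grad)

    # Move against the gradient to lower the score/energy.
    new_coords = coords - step * grad

    # Split the result back into ligand and receptor parts.
    n_lig = lig_xyz.shape[0]
    return new_coords[:n_lig], new_coords[n_lig:]
```

Treating both sets of movable atoms as a single coordinate vector keeps the update consistent: ligand and side chains move in the same step under the same score, rather than being optimised one after the other.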

This repository collects the different pipelines built to achieve the project goals. A list of contributions and fixes to openbabel, smina and gnina (OpenChemistry organisation) and MDAnalysis (NumFocus organisation) is given below.

The datasets related to this project will be released on Zenodo in due time.

Contributions

GNINA

List of contributions to gnina and gnina-scripts:

  • Optimisation of flexible side chains (PR #73)
  • Added option to pymol_arrows.py (PR #31)
  • Low-memory and faster substitute for combine_rows.py (PR #30)
  • Attempt to decrease memory usage of combine_rows.py (PR #29)
  • Added serialization of struct residue (PR #74)
  • Small fixes to gninavis for gradients (PR #72)
  • Fixed Python3 pickle in clustering pipeline (PR #26)
  • Added insertion code support to makeflex.py (PR #65)
  • Improved makeflex.py script to deal with PDB files without atom types (PR #64)
  • Added test support for newer versions of Boost (PR #62)
  • Provided documentation and PDB standardization for makeflex.py script (PR #61)
  • Provided fixes for the makeflex.py script (PR #60)
  • Raised issue about gnina parallel compilation without libmolgrid installed (Issue #57)
  • Updated PDBQTUtilities.cpp to latest OpenBabel version (PR #59)

LibMolGrid

List of contributions to libmolgrid:

  • Fixed issue with unsupported CUDA architecture (PR #5)

SMINA

List of contributions to smina:

  • Fixed a problem with proline residues, broken by flexible docking (MR #3)

OpenBabel

List of contributions to openbabel:

  • Fixed various problems with PDB and PDBQT insertion codes (PR #1998)
  • Fixed CMake when compiling without RapidJSON (PR #1988)

MDAnalysis

List of contributions to MDAnalysis:

  • Improved mass guess (PR #2331)
  • Fixed issues with PDB HEADER field in PDBReader and PDBWriter (PR #2325)
  • Allowed MOL2 parser to ignore status bit strings (PR #2319)

Mentors

  • Dr. David Ryan Koes, Assistant Professor, Department of Computational and Systems Biology, University of Pittsburgh
  • Jocelyn Sunseri, Computational Biology Doctoral Candidate, Carnegie Mellon and University of Pittsburgh

gsoc19's Issues

How to modify the bash (or partially use these bash) to create PDBbind2019 Refined types file (Crystal&Docked)?

Dear @RMeli, I have noticed that you are an expert in this area of the gnina program.
I was pleasantly surprised to find that you developed this gsoc19 repository, which lets us quickly extract and generate files for PDBbind; this will be very helpful.
I am trying to use gnina to train a model on docked poses of the PDBbind2019 refined set, which I previously generated with gnina. How can I create the Crystal and Docked types files, respectively, with your scripts? I couldn't find how to assign the affinity, RMSD and label in the types file...
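
One possible way to assemble such lines is sketched below. It is not taken from the scripts in this repository. It assumes the column order `label affinity rmsd receptor ligand`, a 2 Å RMSD cutoff for labelling a pose as good, and hypothetical file names; all of these should be checked against the training configuration actually in use.

```python
def label_from_rmsd(rmsd, threshold=2.0):
    """Return 1 for a pose within `threshold` angstrom of the crystal pose.

    The 2.0 angstrom default is an assumed cutoff, not a value from this repo.
    """
    return 1 if rmsd < threshold else 0


def types_line(affinity, rmsd, receptor, ligand, threshold=2.0):
    """Format one line of a types file (assumed column order)."""
    label = label_from_rmsd(rmsd, threshold)
    return f"{label} {affinity:.4f} {rmsd:.4f} {receptor} {ligand}"


def write_types(path, records, threshold=2.0):
    """Write (affinity, rmsd, receptor, ligand) records, one per line."""
    with open(path, "w") as handle:
        for affinity, rmsd, receptor, ligand in records:
            handle.write(types_line(affinity, rmsd, receptor, ligand, threshold))
            handle.write("\n")


# Hypothetical records: a crystal pose has RMSD 0 by definition, while
# docked poses carry their RMSD to the crystal pose.
crystal = [(6.5, 0.0, "1abc_rec.gninatypes", "1abc_lig.gninatypes")]
docked = [
    (6.5, 1.2, "1abc_rec.gninatypes", "1abc_lig_dock_0.gninatypes"),
    (6.5, 4.7, "1abc_rec.gninatypes", "1abc_lig_dock_1.gninatypes"),
]
```

With this layout the Crystal and Docked files differ only in which records are passed to `write_types`: crystal poses always receive label 1, whereas docked poses are labelled by their RMSD.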
