rvanasa / deep-antibody
COVID-19 monoclonal antibody screening research.
Docking simulators give a useful, although rather janky, metric for how an antibody might bind to the target antigen. Most of these tools generate thousands of possible configurations and then leave the user to do some post-processing, such as energy minimization, to extract useful information from the simulation.
We are focusing on three different docking tools: Hex, Frodock, and ZDock.
Each of these has its own benefits and drawbacks. Hex is very fast and relatively easy to use; Frodock is state-of-the-art but somewhat slower; and ZDock is well-established but relatively inaccurate and slow.
The goal is to create a pipeline where we can automatically dock antibodies to our target and evaluate how much the three simulators agree with each other. Configurations with the most agreement across simulations have been demonstrated to be much more reliable in their contact predictions.
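One way to put a number on "agreement" is the average pairwise overlap between the contact sets each simulator predicts. The sketch below uses the Jaccard index for that; the metric, the function names, and the example residue pairs are all assumptions made for illustration, not something taken from the project code.

```python
from itertools import combinations


def jaccard(a, b):
    """Jaccard overlap between two contact sets (1.0 means identical)."""
    a, b = set(a), set(b)
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def mean_pairwise_agreement(contacts_by_tool):
    """Average Jaccard overlap across every pair of simulators."""
    pairs = list(combinations(contacts_by_tool.values(), 2))
    if not pairs:
        raise ValueError("need contact sets from at least two tools")
    return sum(jaccard(a, b) for a, b in pairs) / len(pairs)


# Hypothetical (antibody residue, antigen residue) contacts per tool.
example = {
    "hex": {("H:52", "C:378"), ("H:100", "C:380"), ("L:32", "C:385")},
    "frodock": {("H:52", "C:378"), ("H:100", "C:380")},
    "zdock": {("H:52", "C:378"), ("L:91", "C:412")},
}
score = mean_pairwise_agreement(example)
```

A score near 1.0 would mean the three tools predict nearly the same contacts; a score near 0.0 would mean they mostly disagree.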
The idea is that we feed the contact points from this pipeline into the deep learning model, which returns a score based on how strong the docking configuration would be in real life.
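As a rough illustration of what "contact points" could mean here, the sketch below treats two residues as being in contact when any pair of their atoms lies within a distance cutoff. The 5.0 angstrom cutoff, the function name, and the toy coordinates are assumptions; the project may define contacts differently.

```python
import math


def find_contacts(receptor_atoms, ligand_atoms, cutoff=5.0):
    """Return sorted (receptor residue, ligand residue) pairs that have
    at least one atom pair within `cutoff` angstroms of each other."""
    contacts = set()
    for r_res, r_xyz in receptor_atoms:
        for l_res, l_xyz in ligand_atoms:
            if math.dist(r_xyz, l_xyz) <= cutoff:
                contacts.add((r_res, l_res))
    return sorted(contacts)


# Toy atoms: (residue label, (x, y, z)). Coordinates are made up.
receptor = [("H:52", (0.0, 0.0, 0.0)), ("H:53", (10.0, 0.0, 0.0))]
ligand = [("C:378", (3.0, 0.0, 0.0)), ("C:400", (30.0, 0.0, 0.0))]
contacts = find_contacts(receptor, ligand)
```

The nested loop is quadratic in the number of atoms, which is fine for a sketch but would likely need a spatial index for thousands of docked configurations.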
Fine-tuning the docking simulations is going to require learning how proteins and ligands interact with each other, so this is a perfect entry point for anyone who wants to dig into the molecular biology side of the project.
How to start:
$ pip install pdb-tools
$ pdb_selchain -H,L 6w41.pdb > receptor.pdb
$ pdb_selchain -C 6w41.pdb > ligand.pdb
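To show what the chain selection above is doing, here is a minimal Python sketch that filters coordinate records by the chain ID in column 22 of the PDB format. It is not a replacement for pdb_selchain: it drops every non-coordinate record, and `select_chains` and `toy_atom` are hypothetical helpers written for this example.

```python
COORD_RECORDS = ("ATOM", "HETATM", "TER", "ANISOU")


def select_chains(pdb_lines, chains):
    """Keep coordinate records whose chain ID (column 22) is in `chains`.
    All other records are dropped in this sketch."""
    wanted = set(chains)
    return [
        line for line in pdb_lines
        if line.startswith(COORD_RECORDS) and len(line) > 21 and line[21] in wanted
    ]


def toy_atom(serial, chain):
    """Build a toy ATOM record (made-up coordinates) with the chain ID in column 22."""
    return f"ATOM  {serial:>5}  CA  GLY {chain}{serial:>4}       0.000   0.000   0.000"


toy = [toy_atom(1, "H"), toy_atom(2, "L"), toy_atom(3, "C"), "END"]
receptor = select_chains(toy, "HL")  # antibody heavy and light chains
ligand = select_chains(toy, "C")     # antigen chain

# With a real file, something like:
# with open("6w41.pdb") as handle:
#     receptor = select_chains(handle.read().splitlines(), "HL")
```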
Common terminology:
Note that in most simulators, the antibody is called the "receptor" while the antigen is called the "ligand." If you need any other clarification on terminology, be sure to let me know since other people will probably run into it as well.
Here's a bit of documentation for Hex: http://hex.loria.fr/manual800/hex_manual.pdf
An extremely important challenge in antibody design is making "no-go" predictions to filter out antibodies that would not be able to bind to the target antigen. Since machine learning has been shown to improve the precision of these guesses, we can use a neural network or other machine learning model to significantly improve the results of the antibody screening process.
The dataset we have created has 512 columns and about 125,000 rows with boolean labels. The setup resembles the famous MNIST handwritten digit classification task, except that the labels are binary rather than ten-class.
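A simple baseline is useful before trying a neural network on data of this shape. The sketch below trains a logistic regression with plain stochastic gradient descent on synthetic data; the 8 features stand in for the 512 real columns, and the data, labels, and hyperparameters are all made up for illustration. It is a starting point under those assumptions, not the project's model.

```python
import math
import random


def sigmoid(z):
    """Numerically safe logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def train_logistic(rows, labels, epochs=30, lr=0.1, seed=0):
    """Fit weights and bias with per-sample gradient descent on log loss."""
    rng = random.Random(seed)
    weights = [0.0] * len(rows[0])
    bias = 0.0
    order = list(range(len(rows)))
    for _ in range(epochs):
        rng.shuffle(order)
        for i in order:
            z = bias + sum(w * x for w, x in zip(weights, rows[i]))
            error = sigmoid(z) - labels[i]
            for j, x in enumerate(rows[i]):
                weights[j] -= lr * error * x
            bias -= lr * error
    return weights, bias


def predict(weights, bias, row):
    """Return True when the predicted probability is at least 0.5."""
    return sigmoid(bias + sum(w * x for w, x in zip(weights, row))) >= 0.5


# Synthetic stand-in data: the label depends only on the first two features.
data_rng = random.Random(1)
rows = [[data_rng.uniform(-1, 1) for _ in range(8)] for _ in range(200)]
labels = [1 if r[0] + r[1] > 0 else 0 for r in rows]

weights, bias = train_logistic(rows, labels)
accuracy = sum(predict(weights, bias, r) == bool(y) for r, y in zip(rows, labels)) / len(rows)
```

On the real dataset, accuracy should be measured on a held-out split rather than on the training rows as done here.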
Because this task has lots of possible approaches, this is a perfect entry point if you want to learn how to design neural networks and/or have a clever idea for how to tackle this challenge.
Recommended Python packages:
Relevant papers: