Giter Club home page Giter Club logo

prosettac's Introduction

Installation requirements for PRosettaC:

1. Load python 3 by using:
module load python/3.6.4

2. Install RDKit, NumPy and pymol python packages. If you're using conda, use:
conda install -c rdkit rdkit
conda install -c anaconda numpy
conda install -c schrodinger pymol

3. Download and Install OpenBabel (http://openbabel.org/wiki/Main_Page), PatchDock (https://bioinfo3d.cs.tau.ac.il/PatchDock/) and Rosetta (https://www.rosettacommons.org/). The Rosetta versino that was used in this study is: 2019.27.post.dev+132.master.966c9eb 966c9eb6b3ab993de7aa3af5988125b7c2e464af [email protected]:RosettaCommons/main.git 2019-07-18T11:43:25.

4. For job scheduling our scripts rely on PBS (https://en.wikipedia.org/wiki/Portable_Batch_System) or Sun Grid Engine/Open Grid Engine (https://en.wikipedia.org/wiki/Oracle_Grid_Engine), Slurm Workload Manager (https://slurm.schedmd.com).

5. Set the following environment parameters: export PATCHDOCK=”/Path_to_PatchDock/”, export OB=”/Path_to_OpenBabel/”, export SCRIPTS_FOL=”/Path_to_the_git_folder/”, export ROSETTA_FOL=”/Path_to_Rosetta/”. Additionally, set either PBS_HOME=”/Path_to_PBS_bin_folder/”, or SGE_HOME=”/Path_to_SGE_bin_folder/”, depending on the scheduler you are using.

6. When working with PBS, the environment parameter PBS_O_WORKDIR=/Path_to_the_git_folder/Patchdok_Results has to be set whereever a job is executed. You can achieve this by appending

    export PBS_O_WORKDIR=/Path_to_the_git_folder/Patchdok_Results

to /etc/pbs.conf (on all cluster nodes). Another option is to write this line to a file, e.g. env.txt, and then set the enrionment parameter SCHEDULER_PARAMS=/Path_to_env.txt before starting PRosettaC (see below "Additional parameters").
When using SGE, do the same except with /etc/sge.conf. Also, use /etc/slurm.conf when using SLURM



Usage: python main.py/auto.py Protac_params.txt

In order to send the main job also to the job scheduler, use:
python extended.py/short.py Protac_params.txt

Protac_param.txt will include the input files and should look like this:

/////////////////////////////
For main.py / extended.py:
Structures: StructA.pdb StructB.pdb
Chains: A B
Heads: HeadA.sdf HeadB.sdf
Anchor atoms: 11 23
Protac: protac.smi
Full: True
////////////////////////////

Details of files for the extended version:
StructA.pdb is the structure of the E3 ligase (e.g. CRBN)
StructB.pdb is the structure of the target protein (e.g. BRD4)
A and B are the chain IDs for the E3 ligase and target. If your structures have more than one chain, write them like this: Chains: AC B. There should not be identical chain IDs for the E3 ligase and the target.
HeadA.sdf is the E3 ligase binder (e.g. thalidomide)
HeadB.sdf is the target binder (e.g. JQ1)
Both HeadA and HeadB should be positioned in their appropriate binding modes within StructA and StructB.
Anchor atoms is the number of the anchor atoms within HeadA and HeadB. An easy way to find the number of a desired atom is to open HeadA.sdf in pymol, press the L button in the toolbar, then choose "atom identifiers", and "ID". Just make sure that the anchor should be >= 1, so in case the numbering in pymol starts from 0, add one to the number of the chosen atom. The anchor atoms must be uniquely defined in the smiles representation of the molecule.

/////////////////////////////
For auto.py / short.py:
PDB: PDB_ID1 PDB_ID2
LIG: LIG_ID1 LIG_ID2
PROTAC: protac.smi
Full: True
/////////////////////////////

For a more flexible run with .pdb or .sdf file instead of the PDB or LIG IDs, for either one or both proteins, use auto.py / short.py:

/////////////////////////////
PDB: [PDB_ID1 ; pdb1.pdb] [PDB_ID2 ; pdb2.pdb]
LIG: [LIG_ID1 ; lig1.sdf] [LIG_ID2 ; lig2.sdf]
PROTAC: [SMILES_STRING ; protac.smi]
Full: True
/////////////////////////////

NOTES: 
[X ; Y] means that either one could be chosen. The only requirement is that .sdf file cannot be chosen along with PDB_ID.
protac.smi is a file with a single line which is the smiles representation of the full PROTAC.
Due to a reserved name, the params file must not be called params.txt.
Use "Full: True" for a full and long run, and "Full: False" for a shorter run of lower quality, with less global and local docking runs
Use "ClusterName: [PBS|SGE|SLURM]" to set the scheduler (if you use PBS you don't need to specify it at all, because that's the default)


Additional parameters:
In the config file you can also specify the optional parameters RosettaDockMemory and ProtacModelMemory. These set the
amount of memory assigned per job at step 3 and 4 described in the paper and default to 8000mb and 4000mb, respectively.

You can also set the environment variable SCHEDULER_PARAMS to a file, whose content is prepended to each job file sent to the scheduler. This is useful for scheduler-specific job-level configuration (e.g. number of dedicated cpus) or for setup like activating a conda environement to run the job in etc.


How to implement a scheduler
To add support for a specific scheduler simply implement the abstract class Cluster in cluster/Cluster.py like cluster.PBS.PBS and cluster.SGE.SGE do. Then add the scheduler as entry in cluster/__init__.py.

Authors:
Daniel Zaidman, Nir London

prosettac's People

Contributors

danielzaidman avatar pharmcadd-hong avatar londonlab avatar limsande avatar sahikat avatar

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. 📊📈🎉

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google ❤️ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.