EPoCS

This package implements EPoCS, an ESM-based Pocket Cross-Similarity metric for comparing and contextualising protein binding sites and for systematically debiasing train-test splits for pocket-centric machine-learning models. EPoCS combines protein language models (specifically, ESM-2) with real-space tessellation to generate vector embeddings for protein binding sites. These embeddings are the basis of the EPoCS similarity metric, which gives rise to the pocket atlas. See the bioRxiv preprint for details.

Installation

For Python 3.9:

conda create --prefix ./venv python=3.9
conda activate ./venv
conda install -c tmap tmap  # requires python>=3.9,<3.10.0a0

For Python > 3.9, tmap has to be built from source. Note also that ESM is not officially supported for Python > 3.9.

conda create --prefix ./venv python=3.12 gcc_linux-64=13.2 gxx_linux-64=13.2
conda activate ./venv
conda install ogdf -c tmap
conda install cmake pillow numpy scipy matplotlib  # some tmap dependencies
pip install git+https://github.com/reymond-group/tmap.git

Then, for any Python 3 version:

conda install conda-forge::pymol-open-source
pip3 install torch torchvision torchaudio
conda install biopandas pytest scipy tqdm -c conda-forge
pip install fair-esm
pip install faerun
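
Before running the tests, it can help to check that the packages installed above are visible to the active environment. The following is a minimal sketch, not part of the EPoCS package: the helper name find_missing is hypothetical, and the import names are assumed to match the installed packages (for example, fair-esm is imported as esm).

```python
import importlib.util

# Import names assumed to correspond to the packages installed above
# (for example, the fair-esm distribution is imported as "esm").
REQUIRED_MODULES = [
    "tmap", "pymol", "torch", "biopandas",
    "scipy", "tqdm", "esm", "faerun", "pytest",
]


def find_missing(modules):
    """Return the top-level module names that cannot be located.

    find_spec only looks the module up; it does not import it, so this
    check is fast but cannot detect a broken installation.
    """
    return [name for name in modules if importlib.util.find_spec(name) is None]


missing = find_missing(REQUIRED_MODULES)
if missing:
    print("Missing modules:", ", ".join(missing))
else:
    print("All required modules were found.")
```

A module that is found here can still fail at import time (for example, because of a compiler or CUDA mismatch), so the pytest run remains the authoritative check.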

Tests

To verify your installation, run

pytest tests/ --esm_parameters_path /path/to/esm2_t36_3B_UR50D.pt

Usage

The input file should include the paths to the protein and ligand files in .cif or .pdb format. Instead of an explicit ligand input, you can provide a 3D point (x,y,z) as the reference point for the tessellation. You can set custom paths for file output; see run_epocs.py --help. If you run EPoCS for thousands of pockets, consider the disk space requirements in advance.
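
One way to gauge the disk space requirement up front is to run a handful of pockets, measure the output size per pocket, and extrapolate. The sketch below assumes exactly that. The function enough_disk_space, the 5 MiB per-pocket figure, and the safety factor are illustrative placeholders, not values taken from EPoCS.

```python
import shutil


def enough_disk_space(n_pockets, bytes_per_pocket, output_dir=".", safety_factor=1.5):
    """Compare a rough output-size estimate with the free space at output_dir.

    bytes_per_pocket is a placeholder: measure it from a small trial run.
    Returns (required_bytes, free_bytes, fits).
    """
    required = int(n_pockets * bytes_per_pocket * safety_factor)
    free = shutil.disk_usage(output_dir).free
    return required, free, required <= free


# Hypothetical numbers: 10,000 pockets at about 5 MiB of output each.
required, free, fits = enough_disk_space(n_pockets=10_000, bytes_per_pocket=5 * 1024**2)
gib = 1024**3
print(f"estimated need: {required / gib:.1f} GiB, free: {free / gib:.1f} GiB, fits: {fits}")
```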

If downloading the ESM model weights from the server is not possible or desirable, you can point to an existing parameter file using --esm_parameters_path /path/to/esm/weights/esm2_t36_3B_UR50D.pt.

Example:

python run_epocs.py -f ./example/pocket_list -pp /path/to/esm2_t36_3B_UR50D.pt -np 8
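
For scripted or batch runs, the same command line can be assembled from Python. This is a minimal sketch that uses only the flags shown in this README. The helper build_epocs_command is hypothetical, and reading -np as the number of parallel processes is an assumption; run_epocs.py --help is the authoritative list of options.

```python
import shlex


def build_epocs_command(pocket_list, esm_weights=None, n_proc=None):
    """Assemble the run_epocs.py command line as a list of arguments."""
    cmd = ["python", "run_epocs.py", "-f", str(pocket_list)]
    if esm_weights is not None:
        # Use a local parameter file instead of downloading the weights.
        cmd += ["--esm_parameters_path", str(esm_weights)]
    if n_proc is not None:
        # Assumption: -np sets the number of parallel processes.
        cmd += ["-np", str(n_proc)]
    return cmd


cmd = build_epocs_command(
    "./example/pocket_list",
    esm_weights="/path/to/esm2_t36_3B_UR50D.pt",
    n_proc=8,
)
# Print the command for inspection; to execute it, pass the list to
# subprocess.run(cmd, check=True) from the repository root.
print(shlex.join(cmd))
```

Building the command as a list rather than a single string avoids shell-quoting problems when paths contain spaces.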

Citation

The manuscript is available on bioRxiv. Please cite it if you find the method and/or code useful:

@article{oruc_epocs_2024,
    doi = {...},
    url = {...},
    author = {...},
}
