The protein is represented with a multiple sequence alignment and the ligand as a SMILES string, allowing for unconstrained flexibility in the protein-ligand interface. At a high accuracy threshold, unseen protein-ligand complexes are predicted more accurately than with RoseTTAFold-AA, and at medium accuracy Umol surpasses even classical docking methods that use known protein structures as input.
Umol is available under the Apache License, Version 2.0.
The Umol parameters are made available under the terms of the CC BY 4.0 license.
The entire installation takes <1 hour on a standard computer.
We assume you have CUDA 12. For CUDA 11, you will have to change the installation of some packages.
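Before installing, it can help to confirm which CUDA version the machine reports. The snippet below is a minimal sketch and not part of Umol: `cuda_major` is a hypothetical helper that reads the "CUDA Version" field from the `nvidia-smi` banner. Note that this field gives the highest CUDA version the installed driver supports, which is not necessarily the version of any installed toolkit.

```shell
# Hypothetical helper (not part of Umol): read text on stdin and print the
# major number from the first "CUDA Version: X.Y" field it finds.
cuda_major() {
  sed -n 's/.*CUDA Version: \([0-9][0-9]*\)\..*/\1/p' | head -n 1
}

if command -v nvidia-smi >/dev/null 2>&1; then
  major=$(nvidia-smi 2>/dev/null | cuda_major)
  if [ -n "$major" ] && [ "$major" -ge 12 ]; then
    echo "Driver supports CUDA $major: the default installation should apply."
  else
    echo "CUDA 12 not detected (found: ${major:-none}); the package installation may need changes."
  fi
else
  echo "nvidia-smi not found; cannot determine the CUDA version."
fi
```

If the check reports CUDA 11, the note above about changing the installation of some packages applies.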
The runtime will depend on the GPU you have available and the size of the protein-ligand complex you are predicting.
On an NVIDIA A100 GPU, the prediction time is a few minutes on average.
First install Miniconda; see https://docs.conda.io/projects/miniconda/en/latest/miniconda-install.html or https://docs.conda.io/projects/miniconda/en/latest/miniconda-other-installer-links.html
bash install_dependencies.sh
conda activate umol
bash predict.sh
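The three commands above must run in order, and the prediction step relies on the umol environment being active. As a rough sketch, a small guard can catch a forgotten activation before the prediction starts. `require_conda_env` is a hypothetical helper, not something shipped with Umol; it relies on the `CONDA_DEFAULT_ENV` variable that conda sets when an environment is activated.

```shell
# Hypothetical helper (not part of Umol): return 0 only if the conda
# environment named in $1 is the currently active one.
require_conda_env() {
  if [ "${CONDA_DEFAULT_ENV:-}" != "$1" ]; then
    echo "conda environment '$1' is not active; run: conda activate $1" >&2
    return 1
  fi
}

# Usage (left commented out so that sourcing this file has no side effects):
# require_conda_env umol && bash predict.sh
```

Defining the guard as a function keeps it separate from predict.sh, so the script itself stays unchanged.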
Structure prediction of protein-ligand complexes from sequence information with Umol. Patrick Bryant, Atharva Kelkar, Andrea Guljas, Cecilia Clementi, Frank Noé. bioRxiv 2023.11.03.565471; doi: https://doi.org/10.1101/2023.11.03.565471