
alphafold_non_docker's People

Contributors

avilella, mpvenkatesh, old-shatterhand, pasqm, sanjaysrikakulam, sarahbeecroft


alphafold_non_docker's Issues

RuntimeError: HHblits failed

Dear author:
I followed the steps to configure the environment (CPU-only), but at the end run_alphafold.sh reported an error:

/lustre/user/lulab/gaojd/whr/software/miniconda3/envs/alphafold/lib/python3.8/site-packages/absl/flags/validators.py:203: UserWarning: Flag --preset has a non-None default value; therefore, mark_flag_as_required will pass even if flag is not specified in the command line!
warnings.warn(
I0811 03:41:17.217879 140270426367808 templates.py:837] Using precomputed obsolete pdbs ./DOWNLOAD_DIR/pdb_mmcif/obsolete.dat.
I0811 03:41:18.356215 140270426367808 tpu_client.py:54] Starting the local TPU driver.
I0811 03:41:18.357059 140270426367808 xla_bridge.py:214] Unable to initialize backend 'tpu_driver': Not found: Unable to find driver in registry given worker: local://
2021-08-11 03:41:18.358851: W external/org_tensorflow/tensorflow/stream_executor/platform/default/dso_loader.cc:64] Could not load dynamic library 'libcuda.so.1'; dlerror: libcuda.so.1: cannot open shared object file: No such file or directory
2021-08-11 03:41:18.358934: W external/org_tensorflow/tensorflow/stream_executor/cuda/cuda_driver.cc:269] failed call to cuInit: UNKNOWN ERROR (303)
I0811 03:41:18.359137 140270426367808 xla_bridge.py:214] Unable to initialize backend 'gpu': Failed precondition: No visible GPU devices.
I0811 03:41:18.359338 140270426367808 xla_bridge.py:214] Unable to initialize backend 'tpu': Invalid argument: TpuPlatform is not available.
W0811 03:41:18.359463 140270426367808 xla_bridge.py:217] No GPU/TPU found, falling back to CPU. (Set TF_CPP_MIN_LOG_LEVEL=0 and rerun for more info.)
I0811 03:41:19.314316 140270426367808 run_alphafold.py:260] Have 1 models: ['model_1']
I0811 03:41:19.314738 140270426367808 run_alphafold.py:273] Using random seed 4975129475860990710 for the data pipeline
I0811 03:41:19.336503 140270426367808 jackhmmer.py:130] Launching subprocess "/lustre/user/lulab/gaojd/whr/software/miniconda3/envs/alphafold/bin/jackhmmer -o /dev/null -A /tmp/tmp_1s6fhn/output.sto --noali --F1 0.0005 --F2 5e-05 --F3 5e-07 --incE 0.0001 -E 0.0001 --cpu 8 -N 1 ./example/query.fasta ./DOWNLOAD_DIR/uniref90/uniref90.fasta"
I0811 03:41:19.413159 140270426367808 utils.py:36] Started Jackhmmer (uniref90.fasta) query
I0811 03:50:05.036837 140270426367808 utils.py:40] Finished Jackhmmer (uniref90.fasta) query in 525.623 seconds
I0811 03:50:05.040448 140270426367808 jackhmmer.py:130] Launching subprocess "/lustre/user/lulab/gaojd/whr/software/miniconda3/envs/alphafold/bin/jackhmmer -o /dev/null -A /tmp/tmpqrhjvvgw/output.sto --noali --F1 0.0005 --F2 5e-05 --F3 5e-07 --incE 0.0001 -E 0.0001 --cpu 8 -N 1 ./example/query.fasta ./DOWNLOAD_DIR/mgnify/mgy_clusters.fa"
I0811 03:50:05.241362 140270426367808 utils.py:36] Started Jackhmmer (mgy_clusters.fa) query
I0811 04:01:40.166194 140270426367808 utils.py:40] Finished Jackhmmer (mgy_clusters.fa) query in 694.879 seconds
I0811 04:01:40.621608 140270426367808 hhsearch.py:76] Launching subprocess "/lustre/user/lulab/gaojd/whr/software/miniconda3/envs/alphafold/bin/hhsearch -i /tmp/tmp996yhgxj/query.a3m -o /tmp/tmp996yhgxj/output.hhr -maxseq 1000000 -d ./DOWNLOAD_DIR/pdb70/pdb70"
I0811 04:01:40.838742 140270426367808 utils.py:36] Started HHsearch query
I0811 04:12:59.336633 140270426367808 utils.py:40] Finished HHsearch query in 678.436 seconds
I0811 04:12:59.917971 140270426367808 hhblits.py:128] Launching subprocess "/lustre/user/lulab/gaojd/whr/software/miniconda3/envs/alphafold/bin/hhblits -i ./example/query.fasta -cpu 4 -oa3m /tmp/tmpyemalf6z/output.a3m -o /dev/null -n 3 -e 0.001 -maxseq 1000000 -realign_max 100000 -maxfilt 100000 -min_prefilter_hits 1000 -d ./DOWNLOAD_DIR/bfd/bfd_metaclust_clu_complete_id30_c90_final_seq.sorted_opt -d ./DOWNLOAD_DIR/uniclust30/uniclust30_2018_08/uniclust30_2018_08"
I0811 04:13:00.089679 140270426367808 utils.py:36] Started HHblits query
I0811 04:13:23.778954 140270426367808 utils.py:40] Finished HHblits query in 23.689 seconds
E0811 04:13:23.779619 140270426367808 hhblits.py:138] HHblits failed. HHblits stderr begin:
E0811 04:13:23.779794 140270426367808 hhblits.py:141] - 04:13:23.681 ERROR: Could find neither hhm_db nor a3m_db!
E0811 04:13:23.779950 140270426367808 hhblits.py:142] HHblits stderr end
Traceback (most recent call last):
  File "/lustre/user/lulab/gaojd/whr/alphafold/run_alphafold.py", line 303, in <module>
    app.run(main)
  File "/lustre/user/lulab/gaojd/whr/software/miniconda3/envs/alphafold/lib/python3.8/site-packages/absl/app.py", line 312, in run
    _run_main(main, args)
  File "/lustre/user/lulab/gaojd/whr/software/miniconda3/envs/alphafold/lib/python3.8/site-packages/absl/app.py", line 258, in _run_main
    sys.exit(main(argv))
  File "/lustre/user/lulab/gaojd/whr/alphafold/run_alphafold.py", line 277, in main
    predict_structure(
  File "/lustre/user/lulab/gaojd/whr/alphafold/run_alphafold.py", line 127, in predict_structure
    feature_dict = data_pipeline.process(
  File "/lustre/user/lulab/gaojd/whr/alphafold/alphafold/data/pipeline.py", line 170, in process
    hhblits_bfd_uniclust_result = self.hhblits_bfd_uniclust_runner.query(
  File "/lustre/user/lulab/gaojd/whr/alphafold/alphafold/data/tools/hhblits.py", line 143, in query
    raise RuntimeError('HHblits failed\nstdout:\n%s\n\nstderr:\n%s\n' % (
RuntimeError: HHblits failed
stdout:

stderr:

  • 04:13:23.681 ERROR: Could find neither hhm_db nor a3m_db!

I don't know what caused this, so I would appreciate some help.
Thanks!
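For what it's worth, hhblits treats each -d argument as a database file prefix, not a single file, and the "Could find neither hhm_db nor a3m_db!" error means the HH-suite data files belonging to that prefix are missing. A hedged sanity-check sketch (the prefix path below is just the one from the log; the suffix list reflects the usual HH-suite 3 layout):

```python
from pathlib import Path

# HH-suite databases are several files sharing one prefix; hhblits fails with
# "Could find neither hhm_db nor a3m_db!" when they are absent or half-downloaded.
SUFFIXES = ("_a3m.ffdata", "_a3m.ffindex",
            "_hhm.ffdata", "_hhm.ffindex",
            "_cs219.ffdata", "_cs219.ffindex")

def missing_hhsuite_files(prefix: str) -> list:
    """Return the database files hhblits would fail to find for this -d prefix."""
    return [prefix + s for s in SUFFIXES if not Path(prefix + s).is_file()]

# Example: check the BFD prefix passed via -d (path taken from the log above)
missing = missing_hhsuite_files(
    "./DOWNLOAD_DIR/bfd/bfd_metaclust_clu_complete_id30_c90_final_seq.sorted_opt")
```

If the returned list is non-empty, the BFD download is incomplete or the -d path points at the wrong prefix (e.g. at a directory rather than the shared file prefix inside it).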

Update to Alphafold 2.3.0?

Hi,

There was a recent release of AlphaFold, v2.3.0, which, amongst other things, improves the GPU memory efficiency of some parts of the computation.

https://github.com/deepmind/alphafold/releases/tag/v2.3.0

There don't seem to be any (?) changes to the CLI parameters, so would a simple 'git pull' suffice to update the alphafold_non_docker installation?

The only parameter update I can see is:

  • the number of recycling iterations can now be controlled, and an option was added to run single chains on the multimer model.

Thanks in advance,

fasta_path syntax error

I am receiving the following error when trying to run the example. Do you have any idea? ./example/query.fasta is definitely a string.

alphafold$ bash run_alphafold.sh -d ./alphafold_data/ -o ./dummy_test/ -m model_1 -f ./example/query.fasta -t 2021-07-27
  File "/home/user/alphafold/run_alphafold.py", line 96
    fasta_path: str,
              ^
SyntaxError: invalid syntax
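This particular SyntaxError is what Python 2 raises on a function-signature type annotation (`fasta_path: str` is Python 3 syntax), so the problem is usually not the string itself but the wrong interpreter picking up the script. A hedged guard one could drop near the top of a wrapper script:

```python
import sys

def python_ok(min_version=(3, 8)) -> bool:
    """AlphaFold's code uses Python 3 syntax (e.g. `fasta_path: str` annotations)."""
    return sys.version_info >= min_version

if not python_ok():
    raise SystemExit("Run with the conda env's python3, not the system python2.")
```

In practice this means activating the conda environment first (so `python` resolves inside it) rather than letting a system python2 interpret run_alphafold.py.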

Could not find HHBlits database

Hi! I installed AlphaFold following the non-docker option using the reduced version of the databases (reduced_dbs mode), and I get this error:

bash run_alphafold.sh -d /home/k.ruiz/alphafold_data -o /home/k.ruiz/rnaseq/alphafold/output -f /home/k.ruiz/rnaseq/alphafold/input/MSTRG.4643.1_3_RBP3.fasta -t 2020-05-14

I0725 12:53:28.340466 140062189004608 templates.py:857] Using precomputed obsolete pdbs /home/k.ruiz/alphafold_data/pdb_mmcif/obsolete.dat.
E0725 12:53:28.343733 140062189004608 hhblits.py:82] Could not find HHBlits database /home/k.ruiz/alphafold_data/bfd/bfd_metaclust_clu_complete_id30_c90_final_seq.sorted_opt
Traceback (most recent call last):
  File "/home/k.ruiz/alphafold-2.2.0/run_alphafold.py", line 422, in <module>
    app.run(main)
  File "/home/k.ruiz/miniconda3/envs/alphafold/lib/python3.8/site-packages/absl/app.py", line 312, in run
    _run_main(main, args)
  File "/home/k.ruiz/miniconda3/envs/alphafold/lib/python3.8/site-packages/absl/app.py", line 258, in _run_main
    sys.exit(main(argv))
  File "/home/k.ruiz/alphafold-2.2.0/run_alphafold.py", line 338, in main
    monomer_data_pipeline = pipeline.DataPipeline(
  File "/home/k.ruiz/alphafold-2.2.0/alphafold/data/pipeline.py", line 138, in __init__
    self.hhblits_bfd_uniclust_runner = hhblits.HHBlits(
  File "/home/k.ruiz/alphafold-2.2.0/alphafold/data/tools/hhblits.py", line 83, in __init__
    raise ValueError(f'Could not find HHBlits database {database_path}')
ValueError: Could not find HHBlits database /home/k.ruiz/alphafold_data/bfd/bfd_metaclust_clu_complete_id30_c90_final_seq.sorted_opt

However, when I try to edit the run_alphafold.py file following this thread, lines 76 and 77 look different from the ones mentioned there; my run_alphafold.py looks like this:

flags.DEFINE_string('uniclust30_database_path', None, 'Path to the Uniclust30 '
                    'database for use by HHblits.')

Is there any other solution?

Thanks!
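As a hedged sketch of the moving parts: with --db_preset=reduced_dbs, run_alphafold.py replaces the HHblits search over the full BFD + Uniclust30 with a jackhmmer search over the small BFD (--small_bfd_database_path), so the full-BFD path should not be required at all; if it is still demanded, the preset is probably not reaching run_alphafold.py. The flag selection can be pictured roughly as:

```python
def required_db_flags(db_preset: str) -> list:
    """Rough sketch of which database flags each AlphaFold db_preset expects."""
    common = ["--uniref90_database_path", "--mgnify_database_path"]
    if db_preset == "reduced_dbs":
        # jackhmmer on the small BFD replaces hhblits on BFD + Uniclust30
        return common + ["--small_bfd_database_path"]
    if db_preset == "full_dbs":
        return common + ["--bfd_database_path", "--uniclust30_database_path"]
    raise ValueError(f"unknown db_preset: {db_preset}")
```

So rather than editing the flag definitions, check that the wrapper script actually forwards the reduced_dbs preset (and the small BFD path) to run_alphafold.py.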

missing CIFS in pdb_mmcif?

Hello, Kalininalab support member!

I am trying AlphaFold 2.1.1 with this non-docker version of the setup. Thank you very much for providing it; it helps us set up on an HPC cluster.

The problem I ran into while testing is that it can't find the CIFs:

(alphafold-2.1.1) [user@cdragon096 alphafold]$ bash run_alphafold.sh -d $HOME/alphafold/alphafold_data -o ./dummy_test/ -m model_1 -f $HOME/alphafold/alphafold_non_docker/example/query.fasta -t 2020-05-14 -g False
Unknown model preset! Using default ('monomer')
E1118 12:13:41.684854 46912496434880 templates.py:837] Could not find CIFs in $HOME/alphafold/alphafold_data/pdb_mmcif/mmcif_files
Traceback (most recent call last):
  File "/home/ryao/alphafold/run_alphafold.py", line 427, in <module>
    app.run(main)
  File "/risapps/rhel7/python/3.7.3/envs/alphafold-2.1.1/lib/python3.8/site-packages/absl/app.py", line 312, in run
    _run_main(main, args)
  File "/risapps/rhel7/python/3.7.3/envs/alphafold-2.1.1/lib/python3.8/site-packages/absl/app.py", line 258, in _run_main
    sys.exit(main(argv))
  File "/home/ryao/alphafold/run_alphafold.py", line 341, in main
    template_featurizer = templates.HhsearchHitFeaturizer(
  File "$HOME/alphafold/alphafold/data/templates.py", line 838, in __init__
    raise ValueError(f'Could not find CIFs in {self._mmcif_dir}')
ValueError: Could not find CIFs in $HOME/alphafold_data/pdb_mmcif/mmcif_files
(alphafold-2.1.1)

Would you please advise if I have missed anything?

Regards,
Rong
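One thing worth ruling out first: templates.py raises this when its glob over mmcif_files comes back empty, which also happens when the directory path itself is wrong, e.g. a literal `$HOME` string reaching Python unexpanded (the log above does show an unexpanded `$HOME`, unless it was redacted by hand). A hedged check sketch:

```python
import os
from pathlib import Path

def count_cifs(mmcif_dir: str) -> int:
    """Count .cif files the template featurizer would find (expands ~ and $VARS)."""
    expanded = Path(os.path.expandvars(os.path.expanduser(mmcif_dir)))
    if not expanded.is_dir():
        return 0  # wrong or unexpanded path looks exactly like an empty directory
    return sum(1 for _ in expanded.glob("*.cif"))
```

If this returns 0 for the path you pass to run_alphafold.sh, either the pdb_mmcif download/extraction did not finish or the path is not what you think it is (check for stray quotes around `$HOME` in the command line).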

Incorrect variable name

Please change the "Download MGnify database" section in download_db.sh to the following:

# Download MGnify database
echo "Downloading MGnify database"
mgnify_filename="mgy_clusters_2018_12.fa.gz"
wget -P "$mgnify" "https://storage.googleapis.com/alphafold-databases/casp14_versions/${mgnify_filename}"
(cd "$mgnify" && gunzip "$mgnify/$mgnify_filename")

obsolete data

  File "/home/ngayatri/alphafold/alphafold/data/templates.py", line 137, in _parse_obsolete
    with open(obsolete_file_path) as f:
IsADirectoryError: [Errno 21] Is a directory: '/home/ngayatri/mmcif_ob1/rsync.rcsb.org/pub/pdb/data/structures/obsolete/mmCIF'
I downloaded the data with wget from rsync.rcsb.org/pub/pdb/data/structures/obsolete/mmCIF.
Can you help me with this error, i.e. what exactly is it trying to find?
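For context, --obsolete_pdbs_path is expected to point at the single obsolete.dat status file, not at the obsolete mmCIF directory tree that rsync/wget mirrors; calling `open()` on a directory is exactly the IsADirectoryError above. A hedged diagnostic sketch:

```python
from pathlib import Path

def obsolete_path_problem(p: str):
    """Explain why templates.py's open(obsolete_file_path) would fail, if it would."""
    path = Path(p)
    if path.is_dir():
        return "is a directory (point the flag at the obsolete.dat file instead)"
    if not path.exists():
        return "does not exist"
    return None  # a regular file: open() should succeed
```

In other words, the code is looking for the flat obsolete.dat list of superseded PDB IDs (the repo's download script fetches it for you into pdb_mmcif/obsolete.dat), not the obsolete structure files themselves.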

Add aria2c to conda env

As was pointed out on Twitter, aria2c is a non-standard package, especially in HPC environments. Adding it to the conda env, and modifying the manual so that the DB download already happens inside the activated conda env, might be a nice move.
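A sketch of what that could look like in the repo's conda environment spec (the file name and channel layout here are assumptions; aria2 is packaged on conda-forge):

```yaml
# Hypothetical addition to the environment spec referenced by the manual
dependencies:
  - conda-forge::aria2   # provides the aria2c binary used by download_db.sh
```

With that in place, the manual's download step could simply assume the env is already activated.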

Low GPU memory-usage and 0 GPU-Util

Hello,

I had a problem running AlphaFold. The first two hours were very smooth, and I think the MSA part finished within them. However, when it showed:

I0905 13:06:56.466166 140453353674560 model.py:175] Output shape was {'distogram': {'bin_edges': (63,), 'logits': (691, 691, 64)}, 'experimentally_resolved': {'logits': (691, 37)}, 'masked_msa': {'logits': (252, 691, 22)}, 'predicted_aligned_error': (691, 691), 'predicted_lddt': {'logits': (691, 50)}, 'structure_module': {'final_atom_mask': (691, 37), 'final_atom_positions': (691, 37, 3)}, 'plddt': (691,), 'aligned_confidence_probs': (691, 691, 64), 'max_predicted_aligned_error': (), 'ptm': (), 'iptm': (), 'ranking_confidence': ()}
I0905 13:06:56.467109 140453353674560 run_alphafold.py:202] Total JAX model model_1_multimer_v2_pred_0 on VHVL predict time (includes compilation time, see --benchmark): 246.2s

This step then takes forever. I checked the CPU usage, memory usage, and GPU usage; they are:

PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
35488 dell 20 0 69.9g 4.8g 594148 R 100.0 3.8 1591:11 python /h+

          total        used        free      shared  buff/cache   available

Mem: 128357 6557 1730 106 120069 121081

+-----------------------------------------------------------------------------+
| NVIDIA-SMI 515.43.04 Driver Version: 515.43.04 CUDA Version: 11.7 |
|-------------------------------+----------------------+----------------------+
| GPU Name Persistence-M| Bus-Id Disp.A | Volatile Uncorr. ECC |
| Fan Temp Perf Pwr:Usage/Cap| Memory-Usage | GPU-Util Compute M. |
| | | MIG M. |
|===============================+======================+======================|
| 0 NVIDIA GeForce ... Off | 00000000:3B:00.0 Off | N/A |
| 30% 33C P2 101W / 320W | 5886MiB / 10240MiB | 0% Default |
| | | N/A |
+-------------------------------+----------------------+----------------------+
| 1 NVIDIA GeForce ... Off | 00000000:5E:00.0 Off | N/A |
| 30% 25C P0 88W / 320W | 0MiB / 10240MiB | 0% Default |
| | | N/A |
+-------------------------------+----------------------+----------------------+
| 2 NVIDIA GeForce ... Off | 00000000:B1:00.0 Off | N/A |
| 30% 25C P0 89W / 320W | 0MiB / 10240MiB | 0% Default |
| | | N/A |
+-------------------------------+----------------------+----------------------+
| 3 NVIDIA GeForce ... Off | 00000000:D9:00.0 Off | N/A |
| 30% 25C P0 94W / 320W | 0MiB / 10240MiB | 0% Default |
| | | N/A |
+-------------------------------+----------------------+----------------------+

+-----------------------------------------------------------------------------+
| Processes: |
| GPU GI CI PID Type Process name GPU Memory |
| ID ID Usage |
|=============================================================================|
| 0 N/A N/A 35488 C python 1020MiB |
+-----------------------------------------------------------------------------+

The GPU memory usage is not very high, since I have seen other people's A100s with over 20000 MiB used. What's more, the GPU-Util is only 0-1%. I'm not sure whether it's because the graphics driver/CUDA/cuDNN/JAX versions are mismatched (driver version: 515.43.04, CUDA version: 11.7, cuDNN version: 8.4.1.50, jaxlib version: 0.3.15+cuda11.cudnn82, Python version: 3.8). I didn't see any error log; it just didn't move on for over 30 hours. I also used 'conda activate alphafold' and tested in python3:

>>> import torch
>>> print(torch.cuda.is_available())
True
>>> from torch.backends import cudnn
>>> print(cudnn.is_available())
True

It seems that CUDA and cuDNN work, so I'm confused. Has anyone had this problem before? Could you please kindly advise how to solve it? Thanks a lot for your kind guidance.
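One caveat about the check above: AlphaFold runs on JAX, and torch.cuda seeing the GPU says nothing about whether the installed jaxlib wheel has working CUDA support. A hedged probe of what JAX itself will compute on (falling back gracefully if JAX is absent):

```python
def jax_backend() -> str:
    """Return the platform JAX would actually compute on, or 'missing'."""
    try:
        import jax
        return jax.default_backend()  # 'gpu' on a working CUDA install, else 'cpu'
    except ImportError:
        return "missing"

print(jax_backend())
```

If this prints 'cpu' inside the alphafold env, the jaxlib build is CPU-only or mismatched with the CUDA/cuDNN versions. Also worth noting: the log above shows the model prediction itself finished in 246.2 s, so a long 100% CPU stretch afterwards may be the Amber relaxation step, which runs on the CPU by default.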

HHblits failed

Could someone do me a favor?
When I run AF2, it comes up with an HHblits error.
Script:
python ./run_alphafold.py --fasta_paths=XXX.fas

results:
/home/linlab/.local/lib/python3.8/site-packages/absl/flags/_validators.py:203: UserWarning: Flag --output_dir has a non-None default value; therefore, mark_flag_as_required will pass even if flag is not specified in the command line!
warnings.warn(
/home/linlab/.local/lib/python3.8/site-packages/absl/flags/_validators.py:203: UserWarning: Flag --model_names has a non-None default value; therefore, mark_flag_as_required will pass even if flag is not specified in the command line!
warnings.warn(
/home/linlab/.local/lib/python3.8/site-packages/absl/flags/_validators.py:203: UserWarning: Flag --data_dir has a non-None default value; therefore, mark_flag_as_required will pass even if flag is not specified in the command line!
warnings.warn(
/home/linlab/.local/lib/python3.8/site-packages/absl/flags/_validators.py:203: UserWarning: Flag --preset has a non-None default value; therefore, mark_flag_as_required will pass even if flag is not specified in the command line!
warnings.warn(
/home/linlab/.local/lib/python3.8/site-packages/absl/flags/_validators.py:203: UserWarning: Flag --uniref90_database_path has a non-None default value; therefore, mark_flag_as_required will pass even if flag is not specified in the command line!
warnings.warn(
/home/linlab/.local/lib/python3.8/site-packages/absl/flags/_validators.py:203: UserWarning: Flag --mgnify_database_path has a non-None default value; therefore, mark_flag_as_required will pass even if flag is not specified in the command line!
warnings.warn(
/home/linlab/.local/lib/python3.8/site-packages/absl/flags/_validators.py:203: UserWarning: Flag --uniclust30_database_path has a non-None default value; therefore, mark_flag_as_required will pass even if flag is not specified in the command line!
warnings.warn(
/home/linlab/.local/lib/python3.8/site-packages/absl/flags/_validators.py:203: UserWarning: Flag --bfd_database_path has a non-None default value; therefore, mark_flag_as_required will pass even if flag is not specified in the command line!
warnings.warn(
/home/linlab/.local/lib/python3.8/site-packages/absl/flags/_validators.py:203: UserWarning: Flag --pdb70_database_path has a non-None default value; therefore, mark_flag_as_required will pass even if flag is not specified in the command line!
warnings.warn(
/home/linlab/.local/lib/python3.8/site-packages/absl/flags/_validators.py:203: UserWarning: Flag --template_mmcif_dir has a non-None default value; therefore, mark_flag_as_required will pass even if flag is not specified in the command line!
warnings.warn(
/home/linlab/.local/lib/python3.8/site-packages/absl/flags/_validators.py:203: UserWarning: Flag --max_template_date has a non-None default value; therefore, mark_flag_as_required will pass even if flag is not specified in the command line!
warnings.warn(
/home/linlab/.local/lib/python3.8/site-packages/absl/flags/_validators.py:203: UserWarning: Flag --obsolete_pdbs_path has a non-None default value; therefore, mark_flag_as_required will pass even if flag is not specified in the command line!
warnings.warn(
I0827 14:21:24.269093 140066848117952 templates.py:880] Using precomputed obsolete pdbs /data1/AF2_Database/pdb_mmcif/obsolete.dat.
I0827 14:21:26.650146 140066848117952 xla_bridge.py:236] Unable to initialize backend 'tpu_driver': Not found: Unable to find driver in registry given worker:
I0827 14:21:27.050851 140066848117952 xla_bridge.py:236] Unable to initialize backend 'tpu': Invalid argument: TpuPlatform is not available.
I0827 14:21:38.779491 140066848117952 run_alphafold.py:293] Have 5 models: ['model_1', 'model_2', 'model_3', 'model_4', 'model_5']
I0827 14:21:38.779775 140066848117952 run_alphafold.py:306] Using random seed 8466485706823161682 for the data pipeline
I0827 14:21:38.780414 140066848117952 pipeline.py:130] query uniref90
I0827 14:21:38.780694 140066848117952 jackhmmer.py:119] Launching subprocess "jackhmmer -o /dev/null -A /tmp/tmpng40rd11/output.sto --noali --F1 0.0005 --F2 5e-05 --F3 5e-07 --incE 0.0001 -E 0.0001 --cpu 8 -N 1 GARS_insertion.fas /data1/AF2_Database/uniref90/uniref90.fasta"
I0827 14:21:38.840389 140066848117952 utils.py:36] Started Jackhmmer (uniref90.fasta) query
I0827 14:28:35.052657 140066848117952 utils.py:40] Finished Jackhmmer (uniref90.fasta) query in 416.212 seconds
I0827 14:28:35.053834 140066848117952 pipeline.py:141] query mgnify
I0827 14:28:35.054065 140066848117952 jackhmmer.py:119] Launching subprocess "jackhmmer -o /dev/null -A /tmp/tmp48u78l3n/output.sto --noali --F1 0.0005 --F2 5e-05 --F3 5e-07 --incE 0.0001 -E 0.0001 --cpu 8 -N 1 GARS_insertion.fas /data1/AF2_Database/mgnify/mgy_clusters.fa"
I0827 14:28:35.103244 140066848117952 utils.py:36] Started Jackhmmer (mgy_clusters.fa) query
I0827 14:36:06.724052 140066848117952 utils.py:40] Finished Jackhmmer (mgy_clusters.fa) query in 451.621 seconds
I0827 14:36:06.727063 140066848117952 pipeline.py:153] query mgnify
I0827 14:36:06.727788 140066848117952 hhblits.py:128] Launching subprocess "hhblits -i GARS_insertion.fas -cpu 4 -oa3m /tmp/tmptyl5pi44/output.a3m -o /dev/null -n 3 -e 0.001 -maxseq 1000000 -realign_max 100000 -maxfilt 100000 -min_prefilter_hits 1000 -d /data1/AF2_Database/bfd/bfd_metaclust_clu_complete_id30_c90_final_seq.sorted_opt -d /data1/AF2_Database/uniclust30/uniclust30_2018_08/uniclust30_2018_08"
I0827 14:36:06.858779 140066848117952 utils.py:36] Started HHblits query
I0827 14:38:43.267431 140066848117952 utils.py:40] Finished HHblits query in 156.408 seconds
E0827 14:38:43.267613 140066848117952 hhblits.py:138] HHblits failed. HHblits stderr begin:
E0827 14:38:43.267664 140066848117952 hhblits.py:141] - 14:36:31.573 INFO: Searching 65983866 column state sequences.
E0827 14:38:43.267698 140066848117952 hhblits.py:141] - 14:36:32.498 INFO: Searching 15161831 column state sequences.
E0827 14:38:43.267728 140066848117952 hhblits.py:141] - 14:36:32.569 INFO: GARS_insertion.fas is in A2M, A3M or FASTA format
E0827 14:38:43.267756 140066848117952 hhblits.py:141] - 14:36:32.569 INFO: Iteration 1
E0827 14:38:43.267784 140066848117952 hhblits.py:141] - 14:36:32.607 INFO: Prefiltering database
E0827 14:38:43.267811 140066848117952 hhblits.py:141] - 14:37:27.399 INFO: HMMs passed 1st prefilter (gapless profile-profile alignment) : 735240
E0827 14:38:43.267838 140066848117952 hhblits.py:141] - 14:38:40.391 INFO: HMMs passed 1st prefilter (gapless profile-profile alignment) : 198439
E0827 14:38:43.267866 140066848117952 hhblits.py:141] - 14:38:41.184 INFO: HMMs passed 2nd prefilter (gapped profile-profile alignment) : 2000
E0827 14:38:43.267893 140066848117952 hhblits.py:141] - 14:38:41.184 INFO: HMMs passed 2nd prefilter and not found in previous iterations : 2000
E0827 14:38:43.267920 140066848117952 hhblits.py:141] - 14:38:41.184 INFO: Scoring 2000 HMMs using HMM-HMM Viterbi alignment
E0827 14:38:43.267946 140066848117952 hhblits.py:141] - 14:38:41.286 INFO: Alternative alignment: 0
E0827 14:38:43.267974 140066848117952 hhblits.py:142] HHblits stderr end
Traceback (most recent call last):
  File "../run_alphafold.py", line 338, in <module>
    app.run(main)
  File "/home/linlab/.local/lib/python3.8/site-packages/absl/app.py", line 312, in run
    _run_main(main, args)
  File "/home/linlab/.local/lib/python3.8/site-packages/absl/app.py", line 258, in _run_main
    sys.exit(main(argv))
  File "../run_alphafold.py", line 310, in main
    predict_structure(
  File "../run_alphafold.py", line 170, in predict_structure
    feature_dict = data_pipeline.process(
  File "/home/linlab/Applications/alphafold/alphafold/data/pipeline.py", line 154, in process
    hhblits_bfd_uniclust_result = self.hhblits_bfd_uniclust_runner.query(
  File "/home/linlab/Applications/alphafold/alphafold/data/tools/hhblits.py", line 143, in query
    raise RuntimeError('HHblits failed\nstdout:\n%s\n\nstderr:\n%s\n' % (
RuntimeError: HHblits failed
stdout:

stderr:

  • 14:36:31.573 INFO: Searching 65983866 column state sequences.

  • 14:36:32.498 INFO: Searching 15161831 column state sequences.

  • 14:36:32.569 INFO: GARS_insertion.fas is in A2M, A3M or FASTA format

  • 14:36:32.569 INFO: Iteration 1

  • 14:36:32.607 INFO: Prefiltering database

  • 14:37:27.399 INFO: HMMs passed 1st prefilter (gapless profile-profile alignment) : 735240

  • 14:38:40.391 INFO: HMMs passed 1st prefilter (gapless profile-profile alignment) : 198439

  • 14:38:41.184 INFO: HMMs passed 2nd prefilter (gapped profile-profile alignment) : 2000

  • 14:38:41.184 INFO: HMMs passed 2nd prefilter and not found in previous iterations : 2000

  • 14:38:41.184 INFO: Scoring 2000 HMMs using HMM-HMM Viterbi alignment

  • 14:38:41.286 INFO: Alternative alignment: 0

The B-factor field (pLDDT confidence measure) of the output PDB file is always 0

Hi! Thanks for your work. I have run AlphaFold without Docker successfully following your tutorial, but something seems to be wrong with the output PDB file:

The B-factor field of the output PDB file, which stores the pLDDT confidence measure, is always 0 in my test based on the example query.fasta.


And I saw the same result in your dummy_test/query/ranked_0.pdb.

Then I tried the AlphaFold Colab notebook demo for validation; its result seems fine.


So is there any way to fix it?
Thanks a lot.
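If upgrading doesn't resolve it, the per-residue pLDDT is also saved in the pickled prediction output (result_model_*.pkl, under the 'plddt' key), so it can be patched into the B-factor column after the fact. A hedged workaround sketch operating on raw PDB text (column positions follow the PDB format; the `plddt_by_resseq` mapping from residue number to confidence is something you would build from your own pickle):

```python
def write_plddt_to_bfactor(pdb_lines, plddt_by_resseq):
    """Overwrite the B-factor column (chars 61-66) of ATOM/HETATM records."""
    out = []
    for line in pdb_lines:
        if line.startswith(("ATOM  ", "HETATM")):
            resseq = int(line[22:26])                  # residue sequence number
            b = plddt_by_resseq.get(resseq, 0.0)
            line = line[:60] + f"{b:6.2f}" + line[66:]
        out.append(line)
    return out
```

This only rewrites the confidence column; coordinates and everything else pass through untouched.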

"The preceding stack trace is the source of the JAX operation" error

Hi,

I am trying alphafold_non_docker on a small-ish GPU (2 GB) with a small test protein (the same as in ColabFold). I am getting this indecipherable error; hopefully someone can illuminate what's happening:

I1116 11:22:05.455623 139818632742720 model.py:131] Running predict with shape(feat) = {'aatype': (4, 59), 'residue_index': (4, 59), 'seq_length': (4,), 'template_aatype': (4, 4, 59), 'template_all_atom_masks': (4, 4, 59, 37), 'template_all_atom_positions': (4, 4, 59, 37, 3), 'template_sum_probs': (4, 4, 1), 'is_distillation': (4,), 'seq_mask': (4, 59), 'msa_mask': (4, 508, 59), 'msa_row_mask': (4, 508), 'random_crop_to_size_seed': (4, 2), 'template_mask': (4, 4), 'template_pseudo_beta': (4, 4, 59, 3), 'template_pseudo_beta_mask': (4, 4, 59), 'atom14_atom_exists': (4, 59, 14), 'residx_atom14_to_atom37': (4, 59, 14), 'residx_atom37_to_atom14': (4, 59, 37), 'atom37_atom_exists': (4, 59, 37), 'extra_msa': (4, 5120, 59), 'extra_msa_mask': (4, 5120, 59), 'extra_msa_row_mask': (4, 5120), 'bert_mask': (4, 508, 59), 'true_msa': (4, 508, 59), 'extra_has_deletion': (4, 5120, 59), 'extra_deletion_value': (4, 5120, 59), 'msa_feat': (4, 508, 59, 49), 'target_feat': (4, 59, 22)}
Traceback (most recent call last):
  File "/home/user/alphafold/run_alphafold.py", line 310, in <module>
    app.run(main)
  File "/home/user/miniconda3/envs/alphafold/lib/python3.8/site-packages/absl/app.py", line 312, in run
    _run_main(main, args)
  File "/home/user/miniconda3/envs/alphafold/lib/python3.8/site-packages/absl/app.py", line 258, in _run_main
    sys.exit(main(argv))
  File "/home/user/alphafold/run_alphafold.py", line 284, in main
    predict_structure(
  File "/home/user/alphafold/run_alphafold.py", line 149, in predict_structure
    prediction_result = model_runner.predict(processed_feature_dict)
  File "/home/user/alphafold/alphafold/model/model.py", line 133, in predict
    result = self.apply(self.params, jax.random.PRNGKey(0), feat)
  File "/home/user/miniconda3/envs/alphafold/lib/python3.8/site-packages/haiku/_src/transform.py", line 125, in apply_fn
    out, state = f.apply(params, {}, *args, **kwargs)
  File "/home/user/miniconda3/envs/alphafold/lib/python3.8/site-packages/haiku/_src/transform.py", line 313, in apply_fn
    out = f(*args, **kwargs)
  File "/home/user/alphafold/alphafold/model/model.py", line 59, in _forward_fn
    return model(
  File "/home/user/miniconda3/envs/alphafold/lib/python3.8/site-packages/haiku/_src/module.py", line 428, in wrapped
    out = f(*args, **kwargs)                                                                                                                                                                                                                                                                                                                                                                                    
  File "/home/user/miniconda3/envs/alphafold/lib/python3.8/site-packages/haiku/_src/module.py", line 279, in run_interceptors                                                                                                                                                                                                                                                                               
    return bound_method(*args, **kwargs)                                                                                                                                                                                                                                                                                                                                                                        
  File "/home/user/alphafold/alphafold/model/modules.py", line 376, in __call__                                                                                                                                                                                                                                                                                                                             
    _, prev = hk.while_loop(                                                                                                                                                                                                                                                                                                                                                                                    
  File "/home/user/miniconda3/envs/alphafold/lib/python3.8/site-packages/haiku/_src/stateful.py", line 610, in while_loop                                                                                                                                                                                                                                                                                   
    val, state = jax.lax.while_loop(pure_cond_fun, pure_body_fun, init_val)                                                                                                                                                                                                                                                                                                                                     
  File "/home/user/miniconda3/envs/alphafold/lib/python3.8/site-packages/haiku/_src/stateful.py", line 605, in pure_body_fun                                                                                                                                                                                                                                                                                
    val = body_fun(val)                                                                                                                                                                                                                                                                                                                                                                                         
  File "/home/user/alphafold/alphafold/model/modules.py", line 369, in <lambda>                                                                                                                                                                                                                                                                                                                             
    get_prev(do_call(x[1], recycle_idx=x[0],                                                                                                                                                                                                                                                                                                                                                                    
  File "/home/user/alphafold/alphafold/model/modules.py", line 337, in do_call                                                                                                                      
    return impl(                                                                                                                                                                                                                                                                                                                                                                                                
  File "/home/user/miniconda3/envs/alphafold/lib/python3.8/site-packages/haiku/_src/module.py", line 428, in wrapped                                                                                                                                                                                                                                                                                        
    out = f(*args, **kwargs)                                                                                                                                                                                                                                                                                                                                                                                    
  File "/home/user/miniconda3/envs/alphafold/lib/python3.8/site-packages/haiku/_src/module.py", line 279, in run_interceptors                                                                                                                                                                                                                                                                               
    return bound_method(*args, **kwargs)                                                                                                                                                                                                                                                                                                                                                                        
  File "/home/user/alphafold/alphafold/model/modules.py", line 161, in __call__                                                                                                                     
    representations = evoformer_module(batch0, is_training)                                                                                                                                                                                                                                                                                                                                                     
  File "/home/user/miniconda3/envs/alphafold/lib/python3.8/site-packages/haiku/_src/module.py", line 428, in wrapped                                                                                                                                                                                                                                                                                        
    out = f(*args, **kwargs)                                                                                                                                                                                                                                                                                                                                                                                    
  File "/home/user/miniconda3/envs/alphafold/lib/python3.8/site-packages/haiku/_src/module.py", line 279, in run_interceptors                                                                                                                                                                                                                                                                               
    return bound_method(*args, **kwargs)                                                                                                                                                                                                                                                                                                                                                                        
  File "/home/user/alphafold/alphafold/model/modules.py", line 1764, in __call__                                                                                                                    
    template_pair_representation = TemplateEmbedding(c.template, gc)(                                                                                                                                                                                                                                                                                                                                           
  File "/home/user/miniconda3/envs/alphafold/lib/python3.8/site-packages/haiku/_src/module.py", line 428, in wrapped                                                                                                                                                                                                                                                                                        
    out = f(*args, **kwargs)                                                                                                                                                                                                                                                                                                                                                                                    
  File "/home/user/miniconda3/envs/alphafold/lib/python3.8/site-packages/haiku/_src/module.py", line 279, in run_interceptors                                                                                                                                                                                                                                                                               
    return bound_method(*args, **kwargs)                                                                                                                                                                                                                                                                                                                                                                        
  File "/home/user/alphafold/alphafold/model/modules.py", line 2059, in __call__                                                                                                                    
    template_pair_representation = mapping.sharded_map(map_fn, in_axes=0)(                                                                                                                                                                                                                                                                                                                                      
  File "/home/user/alphafold/alphafold/model/mapping.py", line 182, in mapped_fn                                                                                                                    
    outputs, _ = hk.scan(scan_iteration, outputs, slice_starts)                                                                                                                                                                                                                                                                                                                                                 
  File "/home/user/miniconda3/envs/alphafold/lib/python3.8/site-packages/haiku/_src/stateful.py", line 504, in scan                                                                                                                                                                                                                                                                                         
    (carry, state), ys = jax.lax.scan(                                                                                                                                                                  
  File "/home/user/miniconda3/envs/alphafold/lib/python3.8/site-packages/haiku/_src/stateful.py", line 487, in stateful_fun                                                                                                                                                                                                                                                                                 
    carry, out = f(carry, x)                                                                        
  File "/home/user/alphafold/alphafold/model/mapping.py", line 171, in scan_iteration                                                                                                               
    new_outputs = compute_shard(outputs, i, shard_size)                                             
  File "/home/user/alphafold/alphafold/model/mapping.py", line 165, in compute_shard                                                                                                                
    slice_out = apply_fun_to_slice(slice_start, slice_size)                                                                                                                                             
  File "/home/user/alphafold/alphafold/model/mapping.py", line 138, in apply_fun_to_slice                                                                                                           
    return fun(*input_slice)                                                                        
  File "/home/user/miniconda3/envs/alphafold/lib/python3.8/site-packages/haiku/_src/stateful.py", line 567, in mapped_fun                                                                                                                                                                                                                                                                                   
    out, state = mapped_pure_fun(args, state)                                                                                                                                                           
  File "/home/user/miniconda3/envs/alphafold/lib/python3.8/site-packages/haiku/_src/stateful.py", line 558, in pure_fun                                                                                                                                                                                                                                                                                     
    out = fun(*args)                                                                                                                                                                                    
  File "/home/user/alphafold/alphafold/model/modules.py", line 2057, in map_fn                                                                                                                      
    return template_embedder(query_embedding, batch, mask_2d, is_training)                                                                                                                              
  File "/home/user/miniconda3/envs/alphafold/lib/python3.8/site-packages/haiku/_src/module.py", line 428, in wrapped                                                                                                                                                                                                                                                                                        
    out = f(*args, **kwargs)                                                                                                                                                                            
  File "/home/user/miniconda3/envs/alphafold/lib/python3.8/site-packages/haiku/_src/module.py", line 279, in run_interceptors                                                                                                                                                                                                                                                                               
    return bound_method(*args, **kwargs)                                                                                                                                                                
  File "/home/user/alphafold/alphafold/model/modules.py", line 1963, in __call__                                                                                                                    
    quaternion=quat_affine.rot_to_quat(rot, unstack_inputs=True),                                                                                                                                       
  File "/home/user/alphafold/alphafold/model/quat_affine.py", line 113, in rot_to_quat                                                                                                              
    _, qs = jnp.linalg.eigh(k)                                                                                                                                                                                                                                                                                                                                                                                  
  File "/home/user/miniconda3/envs/alphafold/lib/python3.8/site-packages/jax/_src/numpy/linalg.py", line 313, in eigh                                                                                                                                                                                                                                                                                       
    v, w = lax_linalg.eigh(a, lower=lower, symmetrize_input=symmetrize_input)                                                                                                                           
jax._src.source_info_util.JaxStackTraceBeforeTransformation: RuntimeError: cuSolver internal error                                                                                                      
                                                                                                    
The preceding stack trace is the source of the JAX operation that, once transformed by JAX, triggered the following exception.                                                                                                                                                                                                                                                                                  
                                                                                                    

model

Hello, it runs with no problem, but I have a question: how can I run it with all 5 models?
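For reference, the 2021-era run_alphafold.sh in this repo takes a comma-separated list of model names via its -m flag (this is an assumption about the script version in use — check the script's own usage output). A minimal sketch that builds the five-model list and prints the resulting command as a dry run:

```shell
# Build the comma-separated model list expected by -m
# (assumption: the script version where -m takes model names, not a preset).
models=$(printf 'model_%d,' 1 2 3 4 5)
models=${models%,}   # strip the trailing comma -> model_1,...,model_5

# Dry run: print the command instead of executing it, so it can be inspected.
echo bash run_alphafold.sh -d ./alphafold_data -o ./out \
     -f ./query.fasta -t 2020-05-14 -m "$models"
```

Remove the leading `echo` to actually launch the run.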

run run_alphafold.sh error message

Dear author:

I followed the README file and ran the following command (a CPU version):

$ conda activate alphafold
(alphafold) [ryao@cdragon267 ryao]$ cd alphafold
(alphafold) [ryao@cdragon267 alphafold]$ bash run_alphafold.sh -d ./alphafold_data -o ./dummy_test/ -m model_1 -f ./alphafold_non_docker/example/query.fasta -t 2020-05-14 -g False
/risapps/rhel7/python/3.7.3/envs/alphafold/lib/python3.8/site-packages/absl/flags/_validators.py:203: UserWarning: Flag --preset has a non-None default value; therefore, mark_flag_as_required will pass even if flag is not specified in the command line!
warnings.warn(
I0810 15:31:03.155832 46912496434880 templates.py:836] Using precomputed obsolete pdbs ./alphafold_data/pdb_mmcif/obsolete.dat.
I0810 15:31:03.363498 46912496434880 tpu_client.py:54] Starting the local TPU driver.
I0810 15:31:03.373189 46912496434880 xla_bridge.py:231] Unable to initialize backend 'tpu_driver': Not found: Unable to find driver in registry given worker: local://
2021-08-10 15:31:03.374934: W external/org_tensorflow/tensorflow/stream_executor/platform/default/dso_loader.cc:64] Could not load dynamic library 'libcuda.so.1'; dlerror: libcuda.so.1: cannot open shared object file: No such file or directory; LD_LIBRARY_PATH: /cm/local/apps/gcc/7.2.0/lib:/cm/local/apps/gcc/7.2.0/lib64:/rissched/lsf/10.1/linux3.10-glibc2.17-x86_64/lib
2021-08-10 15:31:03.374958: W external/org_tensorflow/tensorflow/stream_executor/cuda/cuda_driver.cc:269] failed call to cuInit: UNKNOWN ERROR (303)
I0810 15:31:03.375049 46912496434880 xla_bridge.py:231] Unable to initialize backend 'gpu': Failed precondition: No visible GPU devices.
I0810 15:31:03.375171 46912496434880 xla_bridge.py:231] Unable to initialize backend 'tpu': Invalid argument: TpuPlatform is not available.
W0810 15:31:03.375225 46912496434880 xla_bridge.py:234] No GPU/TPU found, falling back to CPU. (Set TF_CPP_MIN_LOG_LEVEL=0 and rerun for more info.)
I0810 15:31:03.970467 46912496434880 run_alphafold.py:259] Have 1 models: ['model_1']
I0810 15:31:03.970602 46912496434880 run_alphafold.py:272] Using random seed 2888980253009115914 for the data pipeline
I0810 15:31:03.976739 46912496434880 jackhmmer.py:130] Launching subprocess "/risapps/rhel7/python/3.7.3/envs/alphafold/bin/jackhmmer -o /dev/null -A /tmp/tmpg1fput7i/output.sto --noali --F1 0.0005 --F2 5e-05 --F3 5e-07 --incE 0.0001 -E 0.0001 --cpu 8 -N 1 ./alphafold_non_docker/example/query.fasta ./alphafold_data/uniref90/uniref90.fasta"
I0810 15:31:03.989789 46912496434880 utils.py:36] Started Jackhmmer (uniref90.fasta) query
I0810 15:38:11.871857 46912496434880 utils.py:40] Finished Jackhmmer (uniref90.fasta) query in 427.882 seconds
I0810 15:38:11.872416 46912496434880 jackhmmer.py:130] Launching subprocess "/risapps/rhel7/python/3.7.3/envs/alphafold/bin/jackhmmer -o /dev/null -A /tmp/tmpslj920ny/output.sto --noali --F1 0.0005 --F2 5e-05 --F3 5e-07 --incE 0.0001 -E 0.0001 --cpu 8 -N 1 ./alphafold_non_docker/example/query.fasta ./alphafold_data/mgnify/mgy_clusters.fa"
I0810 15:38:11.894569 46912496434880 utils.py:36] Started Jackhmmer (mgy_clusters.fa) query
I0810 15:47:25.491852 46912496434880 utils.py:40] Finished Jackhmmer (mgy_clusters.fa) query in 553.597 seconds
I0810 15:47:25.492514 46912496434880 hhsearch.py:76] Launching subprocess "/risapps/rhel7/python/3.7.3/envs/alphafold/bin/hhsearch -i /tmp/tmplmbbdtny/query.a3m -o /tmp/tmplmbbdtny/output.hhr -maxseq 1000000 -d ./alphafold_data/pdb70/pdb70"
I0810 15:47:25.510776 46912496434880 utils.py:36] Started HHsearch query
I0810 15:48:42.909016 46912496434880 utils.py:40] Finished HHsearch query in 77.398 seconds
I0810 15:48:42.939602 46912496434880 hhblits.py:128] Launching subprocess "/risapps/rhel7/python/3.7.3/envs/alphafold/bin/hhblits -i ./alphafold_non_docker/example/query.fasta -cpu 4 -oa3m /tmp/tmp5sk1ch3o/output.a3m -o /dev/null -n 3 -e 0.001 -maxseq 1000000 -realign_max 100000 -maxfilt 100000 -min_prefilter_hits 1000 -d ./alphafold_data/bfd/bfd_metaclust_clu_complete_id30_c90_final_seq.sorted_opt -d ./alphafold_data/uniclust30/uniclust30_2018_08/uniclust30_2018_08"
I0810 15:48:42.958906 46912496434880 utils.py:36] Started HHblits query

(alphafold) [ryao@cdragon267 alphafold]$ Traceback (most recent call last):
File "/rsrch3/home/itops/ryao/alphafold/run_alphafold.py", line 302, in
app.run(main)
File "/risapps/rhel7/python/3.7.3/envs/alphafold/lib/python3.8/site-packages/absl/app.py", line 312, in run
_run_main(main, args)
File "/risapps/rhel7/python/3.7.3/envs/alphafold/lib/python3.8/site-packages/absl/app.py", line 258, in _run_main
sys.exit(main(argv))
File "/rsrch3/home/itops/ryao/alphafold/run_alphafold.py", line 276, in main
predict_structure(
File "/rsrch3/home/itops/ryao/alphafold/run_alphafold.py", line 126, in predict_structure
feature_dict = data_pipeline.process(
File "/rsrch3/home/itops/ryao/alphafold/alphafold/data/pipeline.py", line 173, in process
hhblits_bfd_uniclust_result = self.hhblits_bfd_uniclust_runner.query(
File "/rsrch3/home/itops/ryao/alphafold/alphafold/data/tools/hhblits.py", line 133, in query
stdout, stderr = process.communicate()
File "/risapps/rhel7/python/3.7.3/envs/alphafold/lib/python3.8/subprocess.py", line 1024, in communicate
stdout, stderr = self._communicate(input, endtime, timeout)
File "/risapps/rhel7/python/3.7.3/envs/alphafold/lib/python3.8/subprocess.py", line 1866, in _communicate
ready = selector.select(timeout)
File "/risapps/rhel7/python/3.7.3/envs/alphafold/lib/python3.8/selectors.py", line 415, in select
fd_event_list = self._selector.poll(timeout)
KeyboardInterrupt

It exited. I ran this command in an HPC environment on a compute node. Could you suggest a possible cause for this situation?

Thanks!
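A bare KeyboardInterrupt at this point usually means the parent Python process received SIGINT or SIGHUP mid-search — a dropped SSH session, a Ctrl-C in the terminal, or the batch scheduler hitting a walltime limit are all common causes on HPC nodes. A minimal sketch of detaching the run with nohup so a vanishing terminal cannot kill the HHblits subprocess (the inner echo is a stand-in for the real run_alphafold.sh invocation):

```shell
# Stand-in for the real multi-hour command; replace the inner echo with
# the actual `bash run_alphafold.sh ...` invocation for your setup.
nohup bash -c 'echo "alphafold run placeholder"' > alphafold_run.log 2>&1 &
wait $!                # in practice you would log out instead of waiting
cat alphafold_run.log  # check progress later with: tail -f alphafold_run.log
```

On a scheduled cluster, submitting through the scheduler (sbatch, bsub, qsub) with a sufficiently long walltime is the more robust option.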

Jax

Is there a JAX version compatible with CUDA 10.2?

HHblits failed

Dear author,
I have been getting an error from HHblits and I'm wondering if you might know what is wrong. I tried to run the script with use_gpu=False (although I couldn't work out how this setting is passed through in the shell script).
Here is the log.

I0816 11:44:49.701374 47500092145344 hhblits.py:128] Launching subprocess "/exports/cmvm/eddie/eb/groups/EEID_Mareks_IBV/members/roslin_bioinformatics/2021-07-23-_9707_EEID_AlphaFold_setup/
conda/alphafold/bin/hhblits -i /exports/cmvm/eddie/eb/groups/EEID_Mareks_IBV/members/roslin_bioinformatics/2021-07-23-_9707_EEID_AlphaFold_setup/data/example_PB1F2/AFH41240.1.fasta -cpu 16
-oa3m /tmp/tmps6dyg_bz/output.a3m -o /dev/null -n 3 -e 0.001 -maxseq 1000000 -realign_max 100000 -maxfilt 100000 -min_prefilter_hits 1000 -d /exports/cmvm/eddie/eb/groups/EEID_Mareks_IBV/me
mbers/roslin_bioinformatics/2021-07-23-_9707_EEID_AlphaFold_setup/databases/bfd/bfd_metaclust_clu_complete_id30_c90_final_seq.sorted_opt -d /exports/cmvm/eddie/eb/groups/EEID_Mareks_IBV/mem
bers/roslin_bioinformatics/2021-07-23-_9707_EEID_AlphaFold_setup/databases/uniclust30/uniclust30_2018_08/uniclust30_2018_08"
I0816 11:44:49.762680 47500092145344 utils.py:36] Started HHblits query
I0816 20:54:02.312248 47500092145344 utils.py:40] Finished HHblits query in 32952.549 seconds
E0816 20:54:02.322368 47500092145344 hhblits.py:138] HHblits failed. HHblits stderr begin:
E0816 20:54:02.322453 47500092145344 hhblits.py:141] - 11:45:38.035 INFO: Searching 65983866 column state sequences.
E0816 20:54:02.322493 47500092145344 hhblits.py:141] - 11:45:38.954 INFO: Searching 15161831 column state sequences.
E0816 20:54:02.322529 47500092145344 hhblits.py:141] - 11:45:39.035 INFO: /exports/cmvm/eddie/eb/groups/EEID_Mareks_IBV/members/roslin_bioinformatics/2021-07-23-_9707_EEID_AlphaFold_setup/d
ata/example_PB1F2/AFH41240.1.fasta is in A2M, A3M or FASTA format
E0816 20:54:02.322567 47500092145344 hhblits.py:141] - 11:45:39.041 INFO: Iteration 1
E0816 20:54:02.322600 47500092145344 hhblits.py:141] - 11:45:39.072 INFO: Prefiltering database
E0816 20:54:02.322632 47500092145344 hhblits.py:141] - 19:15:38.345 INFO: HMMs passed 1st prefilter (gapless profile-profile alignment)  : 378332
E0816 20:54:02.322664 47500092145344 hhblits.py:141] - 20:54:00.555 INFO: HMMs passed 1st prefilter (gapless profile-profile alignment)  : 141755
E0816 20:54:02.322696 47500092145344 hhblits.py:141] - 20:54:00.751 INFO: HMMs passed 2nd prefilter (gapped profile-profile alignment)   : 2000
E0816 20:54:02.322729 47500092145344 hhblits.py:141] - 20:54:00.751 INFO: HMMs passed 2nd prefilter and not found in previous iterations : 2000
E0816 20:54:02.322766 47500092145344 hhblits.py:141] - 20:54:00.751 INFO: Scoring 2000 HMMs using HMM-HMM Viterbi alignment
E0816 20:54:02.322798 47500092145344 hhblits.py:141] - 20:54:01.122 INFO: Alternative alignment: 0
E0816 20:54:02.322830 47500092145344 hhblits.py:142] HHblits stderr end
Traceback (most recent call last):
  File "/exports/cmvm/eddie/eb/groups/EEID_Mareks_IBV/members/roslin_bioinformatics/2021-07-23-_9707_EEID_AlphaFold_setup/alphafold/run_alphafold.py", line 302, in <module>
    app.run(main)
  File "/exports/cmvm/eddie/eb/groups/EEID_Mareks_IBV/members/roslin_bioinformatics/2021-07-23-_9707_EEID_AlphaFold_setup/conda/alphafold/lib/python3.8/site-packages/absl/app.py", line 312, in run
    _run_main(main, args)
  File "/exports/cmvm/eddie/eb/groups/EEID_Mareks_IBV/members/roslin_bioinformatics/2021-07-23-_9707_EEID_AlphaFold_setup/conda/alphafold/lib/python3.8/site-packages/absl/app.py", line 258, in _run_main
    sys.exit(main(argv))
  File "/exports/cmvm/eddie/eb/groups/EEID_Mareks_IBV/members/roslin_bioinformatics/2021-07-23-_9707_EEID_AlphaFold_setup/alphafold/run_alphafold.py", line 276, in main
    predict_structure(
  File "/exports/cmvm/eddie/eb/groups/EEID_Mareks_IBV/members/roslin_bioinformatics/2021-07-23-_9707_EEID_AlphaFold_setup/alphafold/run_alphafold.py", line 126, in predict_structure
    feature_dict = data_pipeline.process(
  File "/exports/cmvm/eddie/eb/groups/EEID_Mareks_IBV/members/roslin_bioinformatics/2021-07-23-_9707_EEID_AlphaFold_setup/alphafold/alphafold/data/pipeline.py", line 178, in process
    hhblits_bfd_uniclust_result = self.hhblits_bfd_uniclust_runner.query(
  File "/exports/cmvm/eddie/eb/groups/EEID_Mareks_IBV/members/roslin_bioinformatics/2021-07-23-_9707_EEID_AlphaFold_setup/alphafold/alphafold/data/tools/hhblits.py", line 143, in query
    raise RuntimeError('HHblits failed\nstdout:\n%s\n\nstderr:\n%s\n' % (
RuntimeError: HHblits failed
stdout:


stderr:
- 11:45:38.035 INFO: Searching 65983866 column state sequences.

- 11:45:38.954 INFO: Searching 15161831 column state sequences.

- 11:45:39.035 INFO: /exports/cmvm/eddie/eb/groups/EEID_Mareks_IBV/members/roslin_bioinformatics/2021-07-23-_9707_EEID_AlphaFold_setup/data/example_PB1F2/AFH41240.1.fasta is in A2M, A3M or FASTA format

- 11:45:39.041 INFO: Iteration 1

- 11:45:39.072 INFO: Prefiltering database

- 19:15:38.345 INFO: HMMs passed 1st prefilter (gapless profile-profile alignment)  : 378332

- 20:54:00.555 INFO: HMMs passed 1st prefilter (gapless profile-profile alignment)  : 141755

- 20:54:00.751 INFO: HMMs passed 2nd prefilter (gapped profile-profile alignment)   : 2000

- 20:54:00.751 INFO: HMMs passed 2nd prefilter and not found in previous iterations : 2000

- 20:54:00.751 INFO: Scoring 2000 HMMs using HMM-HMM Viterbi alignment

- 20:54:01.122 INFO: Alternative alignment: 0


Thank you for your help and for making your non-docker alphafold solution!

Control the number of OpenMM threads

Since a non-docker setup doesn't have the cgroups limiting capabilities, it would be nice to have finer control over the relaxation step performed with OpenMM.

The default AF script uses OpenMM with a CPU platform, which by default consumes all CPUs. The behavior can be tuned with the OPENMM_CPU_THREADS environment variable, which can be set up similarly to the GPU selection process.

For our local version I added the following lines:

        c)
                openmm_threads=$OPTARG
        ;;
        <...>

if [[ "$openmm_threads" ]] ; then
    export OPENMM_CPU_THREADS=${openmm_threads}
fi
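A minimal sketch of the environment-variable route, which needs no script patching (the 4-thread value here is arbitrary):

```shell
# OpenMM's CPU platform reads OPENMM_CPU_THREADS at runtime, so exporting it
# before launching run_alphafold.sh caps the relaxation step's thread count.
export OPENMM_CPU_THREADS=4
echo "OpenMM will use ${OPENMM_CPU_THREADS} threads"   # prints: OpenMM will use 4 threads
```

Wiring it into a getopts flag, as above, just makes the same export configurable per run.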

Error while running Alphafold without docker

Hi all,
I am facing the below error while running alphafold.
File "/mnt/Alphafold/No_doc/alphafold/run_alphafold.py", line 427, in
app.run(main)
File "/home/skhatri/anaconda3/envs/alphafold/lib/python3.8/site-packages/absl/app.py", line 312, in run
_run_main(main, args)
File "/home/skhatri/anaconda3/envs/alphafold/lib/python3.8/site-packages/absl/app.py", line 258, in _run_main
sys.exit(main(argv))
File "/mnt/Alphafold/No_doc/alphafold/run_alphafold.py", line 403, in main
predict_structure(
File "/mnt/Alphafold/No_doc/alphafold/run_alphafold.py", line 166, in predict_structure
feature_dict = data_pipeline.process(
File "/mnt/Alphafold/No_doc/alphafold/alphafold/data/pipeline_multimer.py", line 266, in process
chain_features = self._process_single_chain(
File "/mnt/Alphafold/No_doc/alphafold/alphafold/data/pipeline_multimer.py", line 212, in _process_single_chain
chain_features = self._monomer_data_pipeline.process(
File "/mnt/Alphafold/No_doc/alphafold/alphafold/data/pipeline.py", line 170, in process
msa_for_templates = parsers.deduplicate_stockholm_msa(
File "/mnt/Alphafold/No_doc/alphafold/alphafold/data/parsers.py", line 350, in deduplicate_stockholm_msa
query_align = next(iter(sequence_dict.values()))
StopIteration
I don't understand where the issue is, as it runs fine with other sequences in AlphaFold-Multimer. I use the default input method for multimer.
Please help!
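A minimal reproduction of this failure mode (an assumption about the cause, not confirmed from the log above): `deduplicate_stockholm_msa` calls `next(iter(sequence_dict.values()))`, which raises `StopIteration` when the parsed Stockholm MSA contains no sequences — e.g. when a search step produced an empty or unparseable `.sto` file for this particular input.

```python
# An empty or unparseable Stockholm MSA would parse to an empty dict,
# and next() on an empty iterator raises StopIteration, matching the traceback.
sequence_dict = {}
try:
    query_align = next(iter(sequence_dict.values()))
except StopIteration:
    print("empty MSA -> StopIteration")
```

If that assumption holds, inspecting the intermediate MSA files under the output `msas/` directory for this sequence should show which search step came back empty.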

Running AlphaFold-Multimer error

Hello, I can run monomer predictions without any problems, but multimer prediction fails. I have checked that my directory structure is consistent with the official one. However, download_all_data.sh threw an error while downloading the uniprot file, so I later downloaded uniprot and pdb_seqres separately. Could that be the cause?

The following is the error content.

(alphafold) bash run_alphafold.sh -d /home/fsd/afdata/ -o /home/fsd/afoutput/ -f  /h
I1023 16:33:13.358484 47290920877760 templates.py:857] Using precomputed obsolete pd
I1023 16:33:14.708593 47290920877760 tpu_client.py:54] Starting the local TPU driver
I1023 16:33:14.709099 47290920877760 xla_bridge.py:212] Unable to initialize backend
I1023 16:33:15.603161 47290920877760 xla_bridge.py:212] Unable to initialize backend
I1023 16:33:23.785693 47290920877760 run_alphafold.py:376] Have 25 models: ['model_1pred_3', 'model_1_multimer_v2_pred_4', 'model_2_multimer_v2_pred_0', 'model_2_multim, 'model_3_multimer_v2_pred_0', 'model_3_multimer_v2_pred_1', 'model_3_multimer_v2_pl_4_multimer_v2_pred_1', 'model_4_multimer_v2_pred_2', 'model_4_multimer_v2_pred_3',timer_v2_pred_2', 'model_5_multimer_v2_pred_3', 'model_5_multimer_v2_pred_4']
I1023 16:33:23.785909 47290920877760 run_alphafold.py:393] Using random seed 2717477
I1023 16:33:23.786309 47290920877760 run_alphafold.py:161] Predicting multimer
I1023 16:33:23.959514 47290920877760 pipeline_multimer.py:210] Running monomer pipel
I1023 16:33:23.960077 47290920877760 jackhmmer.py:133] Launching subprocess "/home/f05 --F2 5e-05 --F3 5e-07 --incE 0.0001 -E 0.0001 --cpu 8 -N 1 /tmp/tmpmu_z1046.fasta
I1023 16:33:23.991682 47290920877760 utils.py:36] Started Jackhmmer (uniref90.fasta)
I1023 16:39:41.744481 47290920877760 utils.py:40] Finished Jackhmmer (uniref90.fasta
I1023 16:39:42.184468 47290920877760 jackhmmer.py:133] Launching subprocess "/home/f05 --F2 5e-05 --F3 5e-07 --incE 0.0001 -E 0.0001 --cpu 8 -N 1 /tmp/tmpmu_z1046.fasta
I1023 16:39:42.207394 47290920877760 utils.py:36] Started Jackhmmer (mgy_clusters_20
I1023 16:46:42.247396 47290920877760 utils.py:40] Finished Jackhmmer (mgy_clusters_2
I1023 16:46:43.379640 47290920877760 hmmbuild.py:121] Launching subprocess ['/home/fpjatb3j5u/query.msa']
I1023 16:46:43.421558 47290920877760 utils.py:36] Started hmmbuild query
I1023 16:46:44.075075 47290920877760 hmmbuild.py:128] hmmbuild stdout:
# hmmbuild :: profile HMM construction from multiple sequence alignments
# HMMER 3.3.2 (Nov 2020); http://hmmer.org/
# Copyright (C) 2020 Howard Hughes Medical Institute.
# Freely distributed under the BSD open source license.
# - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
# input alignment file:             /tmp/tmpjatb3j5u/query.msa
# output HMM file:                  /tmp/tmpjatb3j5u/output.hmm
# input alignment is asserted as:   protein
# model architecture construction:  hand-specified by RF annotation
# - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -

# idx name                  nseq  alen  mlen eff_nseq re/pos description
#---- -------------------- ----- ----- ----- -------- ------ -----------
1     query                 9218   799   191     7.65  0.590 

# CPU time: 0.58u 0.07s 00:00:00.64 Elapsed: 00:00:00.64


stderr:


I1023 16:46:44.075536 47290920877760 utils.py:40] Finished hmmbuild query in 0.654 s
I1023 16:46:44.081163 47290920877760 hmmsearch.py:103] Launching sub-process ['/home-F3', '0.1', '--incE', '100', '-E', '100', '--domE', '100', '--incdomE', '100', '-A'.txt']
I1023 16:46:44.144675 47290920877760 utils.py:36] Started hmmsearch (pdb_seqres.txt)
I1023 16:46:49.798766 47290920877760 utils.py:40] Finished hmmsearch (pdb_seqres.txt
Traceback (most recent call last):
  File "/home/fsd/alphafold-2.2.0/run_alphafold.py", line 422, in <module>
    app.run(main)
  File "/home/fsd/anaconda3/envs/alphafold/lib/python3.8/site-packages/absl/app.py",
    _run_main(main, args)
  File "/home/fsd/anaconda3/envs/alphafold/lib/python3.8/site-packages/absl/app.py",
    sys.exit(main(argv))
  File "/home/fsd/alphafold-2.2.0/run_alphafold.py", line 398, in main
    predict_structure(
  File "/home/fsd/alphafold-2.2.0/run_alphafold.py", line 172, in predict_structure
    feature_dict = data_pipeline.process(
  File "/home/fsd/alphafold-2.2.0/alphafold/data/pipeline_multimer.py", line 264, in
    chain_features = self._process_single_chain(
  File "/home/fsd/alphafold-2.2.0/alphafold/data/pipeline_multimer.py", line 212, in
    chain_features = self._monomer_data_pipeline.process(
  File "/home/fsd/alphafold-2.2.0/alphafold/data/pipeline.py", line 185, in process
    pdb_templates_result = self.template_searcher.query(msa_for_templates)
  File "/home/fsd/alphafold-2.2.0/alphafold/data/tools/hmmsearch.py", line 79, in qu
    return self.query_with_hmm(hmm)
  File "/home/fsd/alphafold-2.2.0/alphafold/data/tools/hmmsearch.py", line 112, in q
    raise RuntimeError(
RuntimeError: hmmsearch failed:
stdout:
# hmmsearch :: search profile(s) against a sequence database
# HMMER 3.3.2 (Nov 2020); http://hmmer.org/
# Copyright (C) 2020 Howard Hughes Medical Institute.
# Freely distributed under the BSD open source license.
# - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
# query HMM file:                  /tmp/tmp2_sps4u7/query.hmm
# target sequence database:        /home/fsd/afdata//pdb_seqres/pdb_seqres.txt
# MSA of all hits saved to file:   /tmp/tmp2_sps4u7/output.sto
# show alignments in output:       no
# sequence reporting threshold:    E-value <= 100
# domain reporting threshold:      E-value <= 100
# sequence inclusion threshold:    E-value <= 100
# domain inclusion threshold:      E-value <= 100
# MSV filter P threshold:       <= 0.1
# Vit filter P threshold:       <= 0.1
# Fwd filter P threshold:       <= 0.1
# number of worker threads:        8
# - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -

Query:       query  [M=191]


stderr:
Parse failed (sequence file /home/fsd/afdata//pdb_seqres/pdb_seqres.txt):
Line 1364572: illegal character 0
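A sketch of a check for the "illegal character 0" parse error: HMMER chokes on NUL bytes, which can end up inside pdb_seqres.txt after a corrupted or interrupted download (an assumption consistent with the separate-download history above). Demonstrated here on a throwaway file; point the path at the real database file instead.

```python
# Simulate a corrupted FASTA-style file with an embedded NUL byte, then count
# NULs. A nonzero count on the real pdb_seqres.txt means the download is
# corrupted and should be re-fetched rather than patched.
path = "/tmp/demo_seqres.txt"   # stand-in for <DOWNLOAD_DIR>/pdb_seqres/pdb_seqres.txt
with open(path, "wb") as fh:
    fh.write(b">demo\nAC\x00GT\n")
data = open(path, "rb").read()
print("NUL bytes found:", data.count(b"\x00"))   # → NUL bytes found: 1
```

Re-downloading pdb_seqres.txt is the safer fix, since stripping bytes in place can silently corrupt sequences.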

Jax Version and pocketfft

Facing an issue while running AlphaFold v2.2 with jax==0.2.14, jaxlib==0.3.10 and dm-haiku==0.0.4

Traceback (most recent call last):
File "/home/datafiles/alphafold_data/alphafold/run_alphafold.py", line 33, in
from alphafold.model import data
File "/home/datafiles/alphafold_data/alphafold/alphafold/model/data.py", line 21, in
import haiku as hk
File "/home/conda1/anaconda3/envs/alphafold/lib/python3.8/site-packages/haiku/init.py", line 17, in
from haiku import data_structures
File "/home/conda1/anaconda3/envs/alphafold/lib/python3.8/site-packages/haiku/data_structures.py", line 17, in
from haiku._src.data_structures import to_immutable_dict
File "/home/conda1/anaconda3/envs/alphafold/lib/python3.8/site-packages/haiku/_src/data_structures.py", line 30, in
from haiku._src import utils
File "/home/conda1/anaconda3/envs/alphafold/lib/python3.8/site-packages/haiku/_src/utils.py", line 24, in
import jax
File "/home/conda1/anaconda3/envs/alphafold/lib/python3.8/site-packages/jax/init.py", line 108, in
from .experimental.maps import soft_pmap
File "/home/conda1/anaconda3/envs/alphafold/lib/python3.8/site-packages/jax/experimental/maps.py", line 25, in
from .. import numpy as jnp
File "/home/conda1/anaconda3/envs/alphafold/lib/python3.8/site-packages/jax/numpy/init.py", line 16, in
from . import fft
File "/home/conda1/anaconda3/envs/alphafold/lib/python3.8/site-packages/jax/numpy/fft.py", line 17, in
from jax._src.numpy.fft import (
File "/home/conda1/anaconda3/envs/alphafold/lib/python3.8/site-packages/jax/_src/numpy/fft.py", line 19, in
from jax import lax
File "/home/conda1/anaconda3/envs/alphafold/lib/python3.8/site-packages/jax/lax/init.py", line 330, in
from jax._src.lax.fft import (
File "/home/conda1/anaconda3/envs/alphafold/lib/python3.8/site-packages/jax/_src/lax/fft.py", line 144, in
xla.backend_specific_translations['cpu'][fft_p] = pocketfft.pocketfft
AttributeError: module 'jaxlib.pocketfft' has no attribute 'pocketfft'

Please help

run only 1 of the 5 models for AF v2.2.0 monomer via bash command-line?

Hi,

We are running the bash run_alphafold.sh script in our non-docker installation via this repo (thanks for the efforts, if you leave us a name, address and t-shirt size, we will send you a t-shirt!).

Question: in the older version of the bash script, I was able to specify "model_1" so that AF would run only 1 of the 5 models. Can I do the same in v2.2.0? What would the name for model_4 be? Thanks in advance.

how to run pTM flavour of AlphaFold2? (i.e to get PAE and TM)

Hi guys,

This is not per se an issue, but I've just used your framework to run non-docker AlphaFold2 with the "full_dbs" preset, for which I expected the per-model .pkl output files to contain the PAE matrix and predicted TM-score, which I did not get. My understanding was that running the "full_dbs" preset runs the pTM network instead of the CASP14 one?!

Can anyone check this or confirm how to run the pTM version of AlphaFold2 with your main python script, please?!

Thanks,
David
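A sketch relevant to the question above (based on upstream AlphaFold's alphafold/model/config.py; the exact wiring in this repo's bash wrapper may differ): the pTM flavour is a separate set of model names, not something the full_dbs database preset controls.

```python
# The database preset (full_dbs/reduced_dbs/casp14) only changes which sequence
# databases are searched; PAE and predicted TM-score come from the differently
# named "_ptm" models, which must be selected explicitly.
CASP14_MODELS = [f"model_{i}" for i in range(1, 6)]
PTM_MODELS = [f"model_{i}_ptm" for i in range(1, 6)]
print(PTM_MODELS)
# → ['model_1_ptm', 'model_2_ptm', 'model_3_ptm', 'model_4_ptm', 'model_5_ptm']
```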

CUDA runtime implicit initialization on GPU:0 failed. Status: unrecognized error code

Everything followed the guideline except for the jax installation, because it threw the error ValueError: jaxlib is version 0.1.69, but this version of jax requires version 0.1.74. We therefore used pip3 install --upgrade jax jaxlib>=0.1.69+cuda111 -f https://storage.googleapis.com/jax-releases/jax_releases.html to update it.

Then python run_alphafold_test.py ran without problems; however, bash run_alphafold.sh -d ../database-dir/ -o ../work-dir/ -f ../work-dir/T1050.fasta -t 2020-05-14 threw the error shown below:

$ bash run_alphafold.sh -d ../database-dir/ -o ../work-dir/ -f ../work-dir/T1050.fasta -t 2020-05-14
2022-01-24 11:28:29.659453: I tensorflow/stream_executor/platform/default/dso_loader.cc:53] Successfully opened dynamic library libcudart.so.11.0
I0124 11:28:31.304372 140384586483520 templates.py:857] Using precomputed obsolete pdbs ../database-dir//pdb_mmcif/obsolete.dat.
I0124 11:28:31.476207 140384586483520 xla_bridge.py:244] Unable to initialize backend 'tpu_driver': NOT_FOUND: Unable to find driver in registry given worker: 
I0124 11:28:31.641233 140384586483520 xla_bridge.py:244] Unable to initialize backend 'tpu': INVALID_ARGUMENT: TpuPlatform is not available.
I0124 11:28:37.127628 140384586483520 run_alphafold.py:384] Have 5 models: ['model_1', 'model_2', 'model_3', 'model_4', 'model_5']
I0124 11:28:37.127787 140384586483520 run_alphafold.py:397] Using random seed 324245886155445948 for the data pipeline
I0124 11:28:37.127987 140384586483520 run_alphafold.py:150] Predicting T1050
I0124 11:28:37.128321 140384586483520 jackhmmer.py:130] Launching subprocess "/home/aaron/bin/miniconda3/envs/alphafold_conda/bin/jackhmmer -o /dev/null -A /tmp/tmpomb2yx3m/output.sto --noali --F1 0.0005 --F2 5e-05 --F3 5e-07 --incE 0.0001 -E 0.0001 --cpu 8 -N 1 ../work-dir/T1050.fasta ../database-dir//uniref90/uniref90.fasta"
I0124 11:28:37.304794 140384586483520 utils.py:36] Started Jackhmmer (uniref90.fasta) query
I0124 11:33:27.010351 140384586483520 utils.py:40] Finished Jackhmmer (uniref90.fasta) query in 289.705 seconds
I0124 11:33:35.188940 140384586483520 jackhmmer.py:130] Launching subprocess "/home/aaron/bin/miniconda3/envs/alphafold_conda/bin/jackhmmer -o /dev/null -A /tmp/tmp_s62488h/output.sto --noali --F1 0.0005 --F2 5e-05 --F3 5e-07 --incE 0.0001 -E 0.0001 --cpu 8 -N 1 ../work-dir/T1050.fasta ../database-dir//mgnify/mgy_clusters_2018_12.fa"
I0124 11:33:35.408562 140384586483520 utils.py:36] Started Jackhmmer (mgy_clusters_2018_12.fa) query
I0124 11:38:51.483255 140384586483520 utils.py:40] Finished Jackhmmer (mgy_clusters_2018_12.fa) query in 316.074 seconds
I0124 11:39:20.663236 140384586483520 hhsearch.py:85] Launching subprocess "/home/aaron/bin/miniconda3/envs/alphafold_conda/bin/hhsearch -i /tmp/tmp_t0wr8md/query.a3m -o /tmp/tmp_t0wr8md/output.hhr -maxseq 1000000 -d ../database-dir//pdb70/pdb70"
I0124 11:39:20.892679 140384586483520 utils.py:36] Started HHsearch query
I0124 11:40:41.805876 140384586483520 utils.py:40] Finished HHsearch query in 80.913 seconds
I0124 11:42:19.997202 140384586483520 hhblits.py:128] Launching subprocess "/home/aaron/bin/miniconda3/envs/alphafold_conda/bin/hhblits -i ../work-dir/T1050.fasta -cpu 4 -oa3m /tmp/tmpdlpm19oi/output.a3m -o /dev/null -n 3 -e 0.001 -maxseq 1000000 -realign_max 100000 -maxfilt 100000 -min_prefilter_hits 1000 -d ../database-dir//bfd/bfd_metaclust_clu_complete_id30_c90_final_seq.sorted_opt -d ../database-dir//uniclust30/uniclust30_2018_08/uniclust30_2018_08"
I0124 11:42:20.263012 140384586483520 utils.py:36] Started HHblits query
I0124 12:37:39.983841 140384586483520 utils.py:40] Finished HHblits query in 3319.720 seconds
I0124 12:37:40.498109 140384586483520 templates.py:878] Searching for template for: MASQSYLFKHLEVSDGLSNNSVNTIYKDRDGFMWFGTTTGLNRYDGYTFKIYQHAENEPGSLPDNYITDIVEMPDGRFWINTARGYVLFDKERDYFITDVTGFMKNLESWGVPEQVFVDREGNTWLSVAGEGCYRYKEGGKRLFFSYTEHSLPEYGVTQMAECSDGILLIYNTGLLVCLDRATLAIKWQSDEIKKYIPGGKTIELSLFVDRDNCIWAYSLMGIWAYDCGTKSWRTDLTGIWSSRPDVIIHAVAQDIEGRIWVGKDYDGIDVLEKETGKVTSLVAHDDNGRSLPHNTIYDLYADRDGVMWVGTYKKGVSYYSESIFKFNMYEWGDITCIEQADEDRLWLGTNDHGILLWNRSTGKAEPFWRDAEGQLPNPVVSMLKSKDGKLWVGTFNGGLYCMNGSQVRSYKEGTGNALASNNVWALVEDDKGRIWIASLGGGLQCLEPLSGTFETYTSNNSALLENNVTSLCWVDDNTLFFGTASQGVGTMDMRTREIKKIQGQSDSMKLSNDAVNHVYKDSRGLVWIATREGLNVYDTRRHMFLDLFPVVEAKGNFIAAITEDQERNMWVSTSRKVIRVTVASDGKGSYLFDSRAYNSEDGLQNCDFNQRSIKTLHNGIIAIGGLYGVNIFAPDHIRYNKMLPNVMFTGLSLFDEAVKVGQSYGGRVLIEKELNDVENVEFDYKQNIFSVSFASDNYNLPEKTQYMYKLEGFNNDWLTLPVGVHNVTFTNLAPGKYVLRVKAINSDGYVGIKEATLGIVVNPPFKLAAALQHHHHHH
I0124 12:37:42.310647 140384586483520 templates.py:267] Found an exact template match 4a2m_B.
I0124 12:37:44.813723 140384586483520 templates.py:267] Found an exact template match 4a2l_F.
I0124 12:37:46.607977 140384586483520 templates.py:267] Found an exact template match 3v9f_B.
I0124 12:37:47.402505 140384586483520 templates.py:267] Found an exact template match 3va6_A.
I0124 12:37:48.522100 140384586483520 templates.py:267] Found an exact template match 3ott_B.
I0124 12:37:48.918042 140384586483520 templates.py:267] Found an exact template match 5m11_A.
I0124 12:37:48.945314 140384586483520 templates.py:267] Found an exact template match 4a2m_B.
I0124 12:37:48.974633 140384586483520 templates.py:267] Found an exact template match 4a2l_F.
I0124 12:37:49.003993 140384586483520 templates.py:267] Found an exact template match 4a2m_B.
I0124 12:37:49.033177 140384586483520 templates.py:267] Found an exact template match 4a2l_F.
I0124 12:37:49.062488 140384586483520 templates.py:267] Found an exact template match 5m11_A.
I0124 12:37:49.089267 140384586483520 templates.py:267] Found an exact template match 3v9f_B.
I0124 12:37:49.118885 140384586483520 templates.py:267] Found an exact template match 3ott_B.
I0124 12:37:49.148563 140384586483520 templates.py:267] Found an exact template match 3va6_A.
I0124 12:37:49.178556 140384586483520 templates.py:267] Found an exact template match 3ott_B.
I0124 12:37:49.207692 140384586483520 templates.py:267] Found an exact template match 3va6_A.
I0124 12:37:49.237412 140384586483520 templates.py:267] Found an exact template match 5m11_A.
I0124 12:37:49.264744 140384586483520 templates.py:267] Found an exact template match 4a2m_B.
I0124 12:37:49.293939 140384586483520 templates.py:267] Found an exact template match 4a2l_F.
I0124 12:37:49.322692 140384586483520 templates.py:267] Found an exact template match 3v9f_B.
I0124 12:37:51.470793 140384586483520 pipeline.py:221] Uniref90 MSA size: 10000 sequences.
I0124 12:37:51.470931 140384586483520 pipeline.py:222] BFD MSA size: 4966 sequences.
I0124 12:37:51.470967 140384586483520 pipeline.py:223] MGnify MSA size: 501 sequences.
I0124 12:37:51.471006 140384586483520 pipeline.py:224] Final (deduplicated) MSA size: 15406 sequences.
I0124 12:37:51.471178 140384586483520 pipeline.py:226] Total number of templates (NB: this can include bad templates and is later filtered to top 4): 20.
I0124 12:37:52.176468 140384586483520 run_alphafold.py:185] Running model model_1 on T1050
2022-01-24 12:37:54.502811: I tensorflow/stream_executor/platform/default/dso_loader.cc:53] Successfully opened dynamic library libcuda.so.1
2022-01-24 12:37:54.504019: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1733] Found device 0 with properties: 
pciBusID: 0000:02:00.0 name: RTX A6000 computeCapability: 8.6
coreClock: 1.8GHz coreCount: 84 deviceMemorySize: 47.54GiB deviceMemoryBandwidth: 715.34GiB/s
2022-01-24 12:37:54.504059: I tensorflow/stream_executor/platform/default/dso_loader.cc:53] Successfully opened dynamic library libcudart.so.11.0
2022-01-24 12:37:54.505758: I tensorflow/stream_executor/platform/default/dso_loader.cc:53] Successfully opened dynamic library libcublas.so.11
2022-01-24 12:37:54.505845: I tensorflow/stream_executor/platform/default/dso_loader.cc:53] Successfully opened dynamic library libcublasLt.so.11
2022-01-24 12:37:54.505870: I tensorflow/stream_executor/platform/default/dso_loader.cc:53] Successfully opened dynamic library libcufft.so.10
2022-01-24 12:37:54.506045: I tensorflow/stream_executor/platform/default/dso_loader.cc:53] Successfully opened dynamic library libcurand.so.10
2022-01-24 12:37:54.507774: I tensorflow/stream_executor/platform/default/dso_loader.cc:53] Successfully opened dynamic library libcusolver.so.11
2022-01-24 12:37:54.508212: I tensorflow/stream_executor/platform/default/dso_loader.cc:53] Successfully opened dynamic library libcusparse.so.11
2022-01-24 12:37:54.508344: I tensorflow/stream_executor/platform/default/dso_loader.cc:53] Successfully opened dynamic library libcudnn.so.8
2022-01-24 12:37:54.510542: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1871] Adding visible gpu devices: 0
2022-01-24 12:37:54.559168: I tensorflow/core/platform/cpu_feature_guard.cc:142] This TensorFlow binary is optimized with oneAPI Deep Neural Network Library (oneDNN) to use the following CPU instructions in performance-critical operations:  AVX2 AVX512F FMA
To enable them in other operations, rebuild TensorFlow with the appropriate compiler flags.
2022-01-24 12:37:54.563017: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1733] Found device 0 with properties: 
pciBusID: 0000:02:00.0 name: RTX A6000 computeCapability: 8.6
coreClock: 1.8GHz coreCount: 84 deviceMemorySize: 47.54GiB deviceMemoryBandwidth: 715.34GiB/s
2022-01-24 12:37:54.565146: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1871] Adding visible gpu devices: 0
2022-01-24 12:37:54.565185: I tensorflow/stream_executor/platform/default/dso_loader.cc:53] Successfully opened dynamic library libcudart.so.11.0
2022-01-24 12:37:54.626365: E tensorflow/core/common_runtime/session.cc:91] Failed to create session: Internal: CUDA runtime implicit initialization on GPU:0 failed. Status: unrecognized error code
2022-01-24 12:37:54.626381: E tensorflow/c/c_api.cc:2193] Internal: CUDA runtime implicit initialization on GPU:0 failed. Status: unrecognized error code
Traceback (most recent call last):
  File "/mnt/disk4T/alphafold-project/alphafold_conda/run_alphafold.py", line 427, in <module>
    app.run(main)
  File "/home/aaron/bin/miniconda3/envs/alphafold_conda/lib/python3.8/site-packages/absl/app.py", line 312, in run
    _run_main(main, args)
  File "/home/aaron/bin/miniconda3/envs/alphafold_conda/lib/python3.8/site-packages/absl/app.py", line 258, in _run_main
    sys.exit(main(argv))
  File "/mnt/disk4T/alphafold-project/alphafold_conda/run_alphafold.py", line 403, in main
    predict_structure(
  File "/mnt/disk4T/alphafold-project/alphafold_conda/run_alphafold.py", line 188, in predict_structure
    processed_feature_dict = model_runner.process_features(
  File "/mnt/disk4T/alphafold-project/alphafold_conda/alphafold/model/model.py", line 131, in process_features
    return features.np_example_to_features(
  File "/mnt/disk4T/alphafold-project/alphafold_conda/alphafold/model/features.py", line 101, in np_example_to_features
    with tf.Session(graph=tf_graph) as sess:
  File "/home/aaron/bin/miniconda3/envs/alphafold_conda/lib/python3.8/site-packages/tensorflow/python/client/session.py", line 1596, in __init__
    super(Session, self).__init__(target, graph, config=config)
  File "/home/aaron/bin/miniconda3/envs/alphafold_conda/lib/python3.8/site-packages/tensorflow/python/client/session.py", line 711, in __init__
    self._session = tf_session.TF_NewSessionRef(self._graph._c_graph, opts)
tensorflow.python.framework.errors_impl.InternalError: CUDA runtime implicit initialization on GPU:0 failed. Status: unrecognized error code

The code is run on ubuntu 20.04, and nvidia-smi information is :

Mon Jan 24 15:10:12 2022       
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 460.106.00   Driver Version: 460.106.00   CUDA Version: 11.2     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|                               |                      |               MIG M. |
|===============================+======================+======================|
|   0  RTX A6000           On   | 00000000:02:00.0  On |                  Off |
| 30%   30C    P8    23W / 300W |    206MiB / 48676MiB |      1%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+
                                                                               
+-----------------------------------------------------------------------------+
| Processes:                                                                  |
|  GPU   GI   CI        PID   Type   Process name                  GPU Memory |
|        ID   ID                                                   Usage      |
|=============================================================================|
|    0   N/A  N/A      1778      G   /usr/lib/xorg/Xorg                147MiB |
|    0   N/A  N/A     13768      G   /usr/bin/gnome-shell               32MiB |
|    0   N/A  N/A    969147      G   ...nlogin/bin/sunloginclient        6MiB |
|    0   N/A  N/A   1048268      G   ...AAAAAAAAA= --shared-files       17MiB |
+-----------------------------------------------------------------------------+

Execution of replica 0 failed: Internal: CUBLAS_STATUS_EXECUTION_FAILED

This could be unrelated to this repo and instead just be some sort of driver issue, but I'll post the error just in case someone can help.

We've installed this repo in an Ubuntu 21.04 Laptop with Thunderbolt and an eGPU with 2 Nvidia Quadro P1000 cards.

We kick off two parallel jobs, one on node 0 and another one on node 1, and they mostly go well, but after a few minutes/hours, sometimes one of the jobs gets stuck with the error below:

Any ideas welcome, thanks

I0930 07:04:52.753646 140368369108800 model.py:131] Running predict with shape(feat) = {'aatype': (4, 245), 'residue_index': (4, 245), 'seq_length': (4,), 'template_aatype': (4, 4, 245), 'template_all_atom_masks': (4, 4, 245, 37), 'template_all_atom_positions': (4, 4, 245, 37, 3), 'template_sum_probs': (4, 4, 1), 'is_distillation': (4,), 'seq_mask': (4, 245), 'msa_mask': (4, 508, 245), 'msa_row_mask': (4, 508), 'random_crop_to_size_seed': (4, 2), 'template_mask': (4, 4), 'template_pseudo_beta': (4, 4, 245, 3), 'template_pseudo_beta_mask': (4, 4, 245), 'atom14_atom_exists': (4, 245, 14), 'residx_atom14_to_atom37': (4, 245, 14), 'residx_atom37_to_atom14': (4, 245, 37), 'atom37_atom_exists': (4, 245, 37), 'extra_msa': (4, 5120, 245), 'extra_msa_mask': (4, 5120, 245), 'extra_msa_row_mask': (4, 5120), 'bert_mask': (4, 508, 245), 'true_msa': (4, 508, 245), 'extra_has_deletion': (4, 5120, 245), 'extra_deletion_value': (4, 5120, 245), 'msa_feat': (4, 508, 245, 49), 'target_feat': (4, 245, 22)}
2021-09-30 07:06:51.976947: E external/org_tensorflow/tensorflow/compiler/xla/pjrt/pjrt_stream_executor_client.cc:2040] Execution of replica 0 failed: Internal: CUBLAS_STATUS_EXECUTION_FAILED
Traceback (most recent call last):
  File "/home/user/alphafold/run_alphafold.py", line 310, in <module>
    app.run(main)
  File "/home/user/miniconda3/envs/alphafold/lib/python3.8/site-packages/absl/app.py", line 312, in run
    _run_main(main, args)
  File "/home/user/miniconda3/envs/alphafold/lib/python3.8/site-packages/absl/app.py", line 258, in _run_main
    sys.exit(main(argv))
  File "/home/user/alphafold/run_alphafold.py", line 284, in main
    predict_structure(
  File "/home/user/alphafold/run_alphafold.py", line 149, in predict_structure
    prediction_result = model_runner.predict(processed_feature_dict)
  File "/home/user/alphafold/alphafold/model/model.py", line 133, in predict
    result = self.apply(self.params, jax.random.PRNGKey(0), feat)
  File "/home/user/miniconda3/envs/alphafold/lib/python3.8/site-packages/jax/_src/traceback_util.py", line 162, in reraise_with_filtered_traceback
    return fun(*args, **kwargs)
  File "/home/user/miniconda3/envs/alphafold/lib/python3.8/site-packages/jax/_src/api.py", line 411, in cache_miss
    out_flat = xla.xla_call(
  File "/home/user/miniconda3/envs/alphafold/lib/python3.8/site-packages/jax/core.py", line 1618, in bind
    return call_bind(self, fun, *args, **params)
  File "/home/user/miniconda3/envs/alphafold/lib/python3.8/site-packages/jax/core.py", line 1609, in call_bind
    outs = primitive.process(top_trace, fun, tracers, params)
  File "/home/user/miniconda3/envs/alphafold/lib/python3.8/site-packages/jax/core.py", line 1621, in process
    return trace.process_call(self, fun, tracers, params)
  File "/home/user/miniconda3/envs/alphafold/lib/python3.8/site-packages/jax/core.py", line 615, in process_call
    return primitive.impl(f, *tracers, **params)
  File "/home/user/miniconda3/envs/alphafold/lib/python3.8/site-packages/jax/interpreters/xla.py", line 625, in _xla_call_impl
    out = compiled_fun(*args)
  File "/home/user/miniconda3/envs/alphafold/lib/python3.8/site-packages/jax/interpreters/xla.py", line 960, in _execute_compiled
    out_bufs = compiled.execute(input_bufs)
jax._src.traceback_util.UnfilteredStackTrace: RuntimeError: Internal: CUBLAS_STATUS_EXECUTION_FAILED

The stack trace below excludes JAX-internal frames.
The preceding is the original exception that occurred, unmodified.

--------------------

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/home/user/alphafold/run_alphafold.py", line 310, in <module>
    app.run(main)
  File "/home/user/miniconda3/envs/alphafold/lib/python3.8/site-packages/absl/app.py", line 312, in run
    _run_main(main, args)
  File "/home/user/miniconda3/envs/alphafold/lib/python3.8/site-packages/absl/app.py", line 258, in _run_main
    sys.exit(main(argv))
  File "/home/user/alphafold/run_alphafold.py", line 284, in main
    predict_structure(
  File "/home/user/alphafold/run_alphafold.py", line 149, in predict_structure
    prediction_result = model_runner.predict(processed_feature_dict)
  File "/home/user/alphafold/alphafold/model/model.py", line 133, in predict
    result = self.apply(self.params, jax.random.PRNGKey(0), feat)
  File "/home/user/miniconda3/envs/alphafold/lib/python3.8/site-packages/jax/interpreters/xla.py", line 960, in _execute_compiled
    out_bufs = compiled.execute(input_bufs)
RuntimeError: Internal: CUBLAS_STATUS_EXECUTION_FAILED
2021-09-30 07:06:53.025348: E external/org_tensorflow/tensorflow/stream_executor/cuda/cuda_driver.cc:1039] could not synchronize on CUDA context: CUDA_ERROR_ILLEGAL_ADDRESS: an illegal memory access was encountered :: *** Begin stack trace ***
        PyDict_SetItem
        _PyModule_ClearDict
        PyImport_Cleanup
        Py_FinalizeEx
        Py_RunMain
        Py_BytesMain
        __libc_start_main

*** End stack trace ***

2021-09-30 07:06:53.025585: F external/org_tensorflow/tensorflow/compiler/xla/service/gpu/gpu_executable.cc:99] Check failed: pair.first->SynchronizeAllActivity() 
Fatal Python error: Aborted

Taking a long time for predictions + more output files than expected

I've set up alphafold_non_docker and it appears to be running properly, but my tests have taken over 5 days without finishing yet, so I think something's probably going wrong - hopefully someone will have an idea! One is a multimer run of a relatively small antibody variable region (129 H residues, 108 L residues), and one is a test of a single chain antigen test with ~500 residues. I'm running on nodes with K80 GPUs and 125G memory. As an example, I'll attach the output of the antibody run to this post rather than copying it over (since it's quite long).

The corresponding command for the antibody run is: ./run_alphafold.sh -d /dartfs/rc/lab/G/Grigoryanlab/library/AlphaFoldEtc/alphafold_DBs/ -o /dartfs/rc/lab/G/Grigoryanlab/home/coy/Dartmouth_PhD_Repo/antibodyTestMC4/ -f /dartfs/rc/lab/G/Grigoryanlab/library/AlphaFoldEtc/antibodyTestMC.fasta -t 2021-10-04 -m multimer

My best guess as to why it's taking so long is that it's calling model.py more often than it's supposed to. As far as I can tell, the alphafold README indicates there should be 5 models (one from each seed) in the output, but my output suggests model.py has been called 23 times already, and there will probably be 25 models total when it finishes, judging by the pattern of models being produced. Here's an ls of my output directory for the antibody test:

features.pkl relaxed_model_5_multimer_v2_pred_4.pdb
msas result_model_1_multimer_v2_pred_0.pkl
ranked_0.pdb result_model_1_multimer_v2_pred_1.pkl
ranked_10.pdb result_model_1_multimer_v2_pred_2.pkl
ranked_11.pdb result_model_1_multimer_v2_pred_3.pkl
ranked_12.pdb result_model_1_multimer_v2_pred_4.pkl
ranked_13.pdb result_model_2_multimer_v2_pred_0.pkl
ranked_14.pdb result_model_2_multimer_v2_pred_1.pkl
ranked_15.pdb result_model_2_multimer_v2_pred_2.pkl
ranked_16.pdb result_model_2_multimer_v2_pred_3.pkl
ranked_17.pdb result_model_2_multimer_v2_pred_4.pkl
ranked_18.pdb result_model_3_multimer_v2_pred_0.pkl
ranked_19.pdb result_model_3_multimer_v2_pred_1.pkl
ranked_1.pdb result_model_3_multimer_v2_pred_2.pkl
ranked_20.pdb result_model_3_multimer_v2_pred_3.pkl
ranked_21.pdb result_model_3_multimer_v2_pred_4.pkl
ranked_22.pdb result_model_4_multimer_v2_pred_0.pkl
ranked_23.pdb result_model_4_multimer_v2_pred_1.pkl
ranked_24.pdb result_model_4_multimer_v2_pred_2.pkl
ranked_2.pdb result_model_4_multimer_v2_pred_3.pkl
ranked_3.pdb result_model_4_multimer_v2_pred_4.pkl
ranked_4.pdb result_model_5_multimer_v2_pred_0.pkl
ranked_5.pdb result_model_5_multimer_v2_pred_1.pkl
ranked_6.pdb result_model_5_multimer_v2_pred_2.pkl
ranked_7.pdb result_model_5_multimer_v2_pred_3.pkl
ranked_8.pdb result_model_5_multimer_v2_pred_4.pkl
ranked_9.pdb timings.json
ranking_debug.json unrelaxed_model_1_multimer_v2_pred_0.pdb
relaxed_model_1_multimer_v2_pred_0.pdb unrelaxed_model_1_multimer_v2_pred_1.pdb
relaxed_model_1_multimer_v2_pred_1.pdb unrelaxed_model_1_multimer_v2_pred_2.pdb
relaxed_model_1_multimer_v2_pred_2.pdb unrelaxed_model_1_multimer_v2_pred_3.pdb
relaxed_model_1_multimer_v2_pred_3.pdb unrelaxed_model_1_multimer_v2_pred_4.pdb
relaxed_model_1_multimer_v2_pred_4.pdb unrelaxed_model_2_multimer_v2_pred_0.pdb
relaxed_model_2_multimer_v2_pred_0.pdb unrelaxed_model_2_multimer_v2_pred_1.pdb
relaxed_model_2_multimer_v2_pred_1.pdb unrelaxed_model_2_multimer_v2_pred_2.pdb
relaxed_model_2_multimer_v2_pred_2.pdb unrelaxed_model_2_multimer_v2_pred_3.pdb
relaxed_model_2_multimer_v2_pred_3.pdb unrelaxed_model_2_multimer_v2_pred_4.pdb
relaxed_model_2_multimer_v2_pred_4.pdb unrelaxed_model_3_multimer_v2_pred_0.pdb
relaxed_model_3_multimer_v2_pred_0.pdb unrelaxed_model_3_multimer_v2_pred_1.pdb
relaxed_model_3_multimer_v2_pred_1.pdb unrelaxed_model_3_multimer_v2_pred_2.pdb
relaxed_model_3_multimer_v2_pred_2.pdb unrelaxed_model_3_multimer_v2_pred_3.pdb
relaxed_model_3_multimer_v2_pred_3.pdb unrelaxed_model_3_multimer_v2_pred_4.pdb
relaxed_model_3_multimer_v2_pred_4.pdb unrelaxed_model_4_multimer_v2_pred_0.pdb
relaxed_model_4_multimer_v2_pred_0.pdb unrelaxed_model_4_multimer_v2_pred_1.pdb
relaxed_model_4_multimer_v2_pred_1.pdb unrelaxed_model_4_multimer_v2_pred_2.pdb
relaxed_model_4_multimer_v2_pred_2.pdb unrelaxed_model_4_multimer_v2_pred_3.pdb
relaxed_model_4_multimer_v2_pred_3.pdb unrelaxed_model_4_multimer_v2_pred_4.pdb
relaxed_model_4_multimer_v2_pred_4.pdb unrelaxed_model_5_multimer_v2_pred_0.pdb
relaxed_model_5_multimer_v2_pred_0.pdb unrelaxed_model_5_multimer_v2_pred_1.pdb
relaxed_model_5_multimer_v2_pred_1.pdb unrelaxed_model_5_multimer_v2_pred_2.pdb
relaxed_model_5_multimer_v2_pred_2.pdb unrelaxed_model_5_multimer_v2_pred_3.pdb
relaxed_model_5_multimer_v2_pred_3.pdb unrelaxed_model_5_multimer_v2_pred_4.pdb

As you can see, the pattern for the output files is: "relaxed_model_{1-5}_multimer_v2_pred_{0-4}.pdb". I'm not sure what the two sets of numbers indicate; I'd assume one of them indicates models that come from the same starting seed, but I'm not sure what the other set would indicate, or which of the two marks a shared seed. Apologies if this is documented somewhere and I've missed it! Thanks so much for any help on how to make this run faster, and on whether the output is correct.
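
As a side note, the mapping from ranked_N.pdb back to a specific model/prediction can be read out of ranking_debug.json, which AlphaFold writes next to the PDB files; a minimal sketch (run it from the output directory; file layout assumed from the listing above):

```shell
# Print which model each ranked_N.pdb came from, using the "order" list
# in ranking_debug.json (ranked_0.pdb is the first entry, and so on).
python3 - <<'EOF'
import json

with open("ranking_debug.json") as fh:
    order = json.load(fh)["order"]

for rank, model_name in enumerate(order):
    print(f"ranked_{rank}.pdb -> {model_name}")
EOF
```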

EDIT: I've since tried running the antibody test with CPUs only (no GPUs, so using the -e false -g false flags appended to the aforementioned command) and it takes ~16 hours. The GPU test recently finished and took 6 days in total! I was able to request 200 GB memory for the CPU test and only 125 GB memory for the GPU test, which might indicate memory is the limiting factor? I also updated the ls of the output dir above to have the final output files.

EDIT2: Large complexes (~2000 aa long) take a very long time on CPU - about a month - and even longer on GPU. Even with the same amount of memory, the GPU runs take longer than the CPU runs.

Fix CUDA_VISIBLE_DEVICES control

At the moment, the export of CUDA_VISIBLE_DEVICES=0 is hardcoded in the script,

export CUDA_VISIBLE_DEVICES=0

which is quite misleading in conjunction with the

echo "-a <gpu_devices> Comma separated list of devices to pass to 'NVIDIA_VISIBLE_DEVICES' (default: 'all')"

Moreover, the NVIDIA_VISIBLE_DEVICES environment variable applies only to containers and is not needed in the non-docker setup.

export NVIDIA_VISIBLE_DEVICES=$gpu_devices

This part can be modified like this to work as initially intended.

# Export ENVIRONMENT variables (change me if required)
if [[ "$use_gpu" == true ]] ; then
    export CUDA_VISIBLE_DEVICES=0

    if [[ "$gpu_devices" ]] ; then
        export CUDA_VISIBLE_DEVICES=$gpu_devices
    fi
fi
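
With that change, whatever is passed via -a propagates to the Python process; a quick sanity check of the mechanism (assuming python3 on PATH; the device list is just an example):

```shell
# Child processes inherit CUDA_VISIBLE_DEVICES, so CUDA-aware libraries
# (JAX, TensorFlow) will only enumerate the listed devices.
export CUDA_VISIBLE_DEVICES=0,1
python3 -c "import os; print(os.environ['CUDA_VISIBLE_DEVICES'])"
```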

I hope this will be helpful!

ValueError: Could not find HHsearch database ./alphafold_data/pdb70/pdb70

I installed alphafold_non_docker step by step, but I hit the following error:

(alphafold) [root@ecs alphafold-2.2.0]# bash run_alphafold.sh -d ./alphafold_data -o ./dummy_test/ -f ./example/query.fasta -t 2020-05-14 -g False
E0704 11:23:00.766234 139973713471296 hhsearch.py:56] Could not find HHsearch database ./alphafold_data//pdb70/pdb70
Traceback (most recent call last):
  File "/root/alphafold-2.2.0/run_alphafold.py", line 422, in <module>
    app.run(main)
  File "/root/miniconda3/envs/alphafold/lib/python3.8/site-packages/absl/app.py", line 312, in run
    _run_main(main, args)
  File "/root/miniconda3/envs/alphafold/lib/python3.8/site-packages/absl/app.py", line 258, in _run_main
    sys.exit(main(argv))
  File "/root/alphafold-2.2.0/run_alphafold.py", line 327, in main
    template_searcher = hhsearch.HHSearch(
  File "/root/alphafold-2.2.0/alphafold/data/tools/hhsearch.py", line 57, in __init__
    raise ValueError(f'Could not find HHsearch database {database_path}')
ValueError: Could not find HHsearch database ./alphafold_data/pdb70/pdb70

How can I resolve this problem? I sincerely need your help. Thank you~
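
For anyone hitting this: HHsearch addresses the database by file prefix, so <data_dir>/pdb70/pdb70 must resolve to files such as pdb70_hhm.ffdata and pdb70_hhm.ffindex. A minimal existence check (sketch; the data_dir path is an assumption):

```shell
# Verify the database prefix that hhsearch's -d flag expects; the prefix
# must resolve to sibling files such as pdb70_hhm.ffdata / pdb70_hhm.ffindex.
check_pdb70() {
    local prefix="$1/pdb70/pdb70"
    [ -e "${prefix}_hhm.ffdata" ] && [ -e "${prefix}_hhm.ffindex" ]
}

check_pdb70 ./alphafold_data \
    || echo "pdb70 files missing - re-run alphafold/scripts/download_pdb70.sh"
```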

RuntimeError: HHSearch failed:

It seemed it was running... until it wasn't, although it produced some data in the meantime.

I was running it on an AWS instance with 60 cores, 477 GiB RAM and 8 GPUs.

I have pasted the output logs below.
Any idea what the problem could be?
Thank you

I ran it with this sample fasta file as query.fasta:

>T1050 A7LXT1, Bacteroides Ovatus, 779 residues|
MASQSYLFKHLEVSDGLSNNSVNTIYKDRDGFMWFGTTTGLNRYDGYTFKIYQHAENEPGSLPDNYITDIVEMPDGRFWINTARGYVLFDKERDYFITDVTGFMKNLESWGVPEQVFVDREGNTWLSVAGEGCYRYKEGGKRLFFSYTEHSLPEYGVTQMAECSDGILLIYNTGLLVCLDRATLAIKWQSDEIKKYIPGGKTIELSLFVDRDNCIWAYSLMGIWAYDCGTKSWRTDLTGIWSSRPDVIIHAVAQDIEGRIWVGKDYDGIDVLEKETGKVTSLVAHDDNGRSLPHNTIYDLYADRDGVMWVGTYKKGVSYYSESIFKFNMYEWGDITCIEQADEDRLWLGTNDHGILLWNRSTGKAEPFWRDAEGQLPNPVVSMLKSKDGKLWVGTFNGGLYCMNGSQVRSYKEGTGNALASNNVWALVEDDKGRIWIASLGGGLQCLEPLSGTFETYTSNNSALLENNVTSLCWVDDNTLFFGTASQGVGTMDMRTREIKKIQGQSDSMKLSNDAVNHVYKDSRGLVWIATREGLNVYDTRRHMFLDLFPVVEAKGNFIAAITEDQERNMWVSTSRKVIRVTVASDGKGSYLFDSRAYNSEDGLQNCDFNQRSIKTLHNGIIAIGGLYGVNIFAPDHIRYNKMLPNVMFTGLSLFDEAVKVGQSYGGRVLIEKELNDVENVEFDYKQNIFSVSFASDNYNLPEKTQYMYKLEGFNNDWLTLPVGVHNVTFTNLAPGKYVLRVKAINSDGYVGIKEATLGIVVNPPFKLAAALQHHHHHH

The data generated:

ubuntu@run-62387ab63902662cbe274d7c-4d7kq:/mnt$ tree -sh /mnt/example/
/mnt/example/
├── [4.0K]  dummy_test
│   └── [4.0K]  query
│       └── [4.0K]  msas
│           ├── [3.4M]  mgnify_hits.sto
│           └── [ 72M]  uniref90_hits.sto
└── [ 830]  query.fasta

3 directories, 3 files

ubuntu@run-62387ab63902662cbe274d7c-4d7kq:/app/alphafold$ sudo ./run_alphafold.sh -d /domino/datasets/af_download_data/ -o /mnt/example/dummy_test -f /mnt/example/query.fasta -t 2022-03-21
/opt/conda/lib/python3.7/site-packages/absl/flags/_validators.py:206: UserWarning: Flag --use_gpu_relax has a non-None default value; therefore, mark_flag_as_required will pass even if flag is not specified in the command line!
'command line!' % flag_name)
I0321 13:29:02.076551 139820712200000 templates.py:857] Using precomputed obsolete pdbs /domino/datasets/af_download_data//pdb_mmcif/obsolete.dat.
I0321 13:29:03.170220 139820712200000 tpu_client.py:54] Starting the local TPU driver.
I0321 13:29:03.171494 139820712200000 xla_bridge.py:212] Unable to initialize backend 'tpu_driver': Not found: Unable to find driver in registry given worker: local://
I0321 13:29:05.166625 139820712200000 xla_bridge.py:212] Unable to initialize backend 'tpu': Invalid argument: TpuPlatform is not available.
I0321 13:29:21.223274 139820712200000 run_alphafold.py:384] Have 5 models: ['model_1_pred_0', 'model_2_pred_0', 'model_3_pred_0', 'model_4_pred_0', 'model_5_pred_0']
I0321 13:29:21.223868 139820712200000 run_alphafold.py:400] Using random seed 1019557854010524627 for the data pipeline
I0321 13:29:21.224538 139820712200000 run_alphafold.py:168] Predicting query
I0321 13:29:21.225994 139820712200000 jackhmmer.py:133] Launching subprocess "/usr/bin/jackhmmer -o /dev/null -A /tmp/tmpa75vmfip/output.sto --noali --F1 0.0005 --F2 5e-05 --F3 5e-07 --incE 0.0001 -E 0.0001 --cpu 8 -N 1 /mnt/example/query.fasta /domino/datasets/af_download_data//uniref90/uniref90.fasta"
I0321 13:29:21.309449 139820712200000 utils.py:36] Started Jackhmmer (uniref90.fasta) query
I0321 13:37:13.182801 139820712200000 utils.py:40] Finished Jackhmmer (uniref90.fasta) query in 471.873 seconds
I0321 13:37:19.727575 139820712200000 jackhmmer.py:133] Launching subprocess "/usr/bin/jackhmmer -o /dev/null -A /tmp/tmpq_3sjpki/output.sto --noali --F1 0.0005 --F2 5e-05 --F3 5e-07 --incE 0.0001 -E 0.0001 --cpu 8 -N 1 /mnt/example/query.fasta /domino/datasets/af_download_data//mgnify/mgy_clusters_2018_12.fa"
I0321 13:37:19.829512 139820712200000 utils.py:36] Started Jackhmmer (mgy_clusters_2018_12.fa) query
I0321 13:44:58.966890 139820712200000 utils.py:40] Finished Jackhmmer (mgy_clusters_2018_12.fa) query in 459.137 seconds
I0321 13:45:22.831639 139820712200000 hhsearch.py:85] Launching subprocess "/usr/bin/hhsearch -i /tmp/tmpxw1gqa3o/query.a3m -o /tmp/tmpxw1gqa3o/output.hhr -maxseq 1000000 -d /domino/datasets/af_download_data//pdb70/pdb70"
I0321 13:45:22.918177 139820712200000 utils.py:36] Started HHsearch query
I0321 13:45:23.270786 139820712200000 utils.py:40] Finished HHsearch query in 0.352 seconds
Traceback (most recent call last):
  File "/app/alphafold/run_alphafold.py", line 429, in <module>
    app.run(main)
  File "/opt/conda/lib/python3.7/site-packages/absl/app.py", line 312, in run
    _run_main(main, args)
  File "/opt/conda/lib/python3.7/site-packages/absl/app.py", line 258, in _run_main
    sys.exit(main(argv))
  File "/app/alphafold/run_alphafold.py", line 413, in main
    random_seed=random_seed)
  File "/app/alphafold/run_alphafold.py", line 181, in predict_structure
    msa_output_dir=msa_output_dir)
  File "/app/alphafold/alphafold/data/pipeline.py", line 188, in process
    pdb_templates_result = self.template_searcher.query(uniref90_msa_as_a3m)
  File "/app/alphafold/alphafold/data/tools/hhsearch.py", line 96, in query
    stdout.decode('utf-8'), stderr[:100_000].decode('utf-8')))
RuntimeError: HHSearch failed:
RuntimeError: HHSearch failed:
stdout:

stderr:

FATAL Flags parsing error

I'm consistently running into this issue when trying to run the non-docker AF2 install. I followed the directions and everything installed just fine. I downloaded run_alphafold.sh and tried to run the script, and consistently got the following error:

FATAL Flags parsing error: flag --use_gpu_relax=None: Flag --use_gpu_relax must have a value other than None.
Pass --helpshort or --helpfull to see help on flags.

I don't know where this setting is coming from, since it's not part of the script. I'm running Ubuntu 20.04.3 LTS, cuda version 11.5 v11.5.119, and nvidia-smi returns the following:
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 495.29.05 Driver Version: 495.29.05 CUDA Version: 11.5 |
|-------------------------------+----------------------+----------------------+
| GPU Name Persistence-M| Bus-Id Disp.A | Volatile Uncorr. ECC |
| Fan Temp Perf Pwr:Usage/Cap| Memory-Usage | GPU-Util Compute M. |
| | | MIG M. |
|===============================+======================+======================|
| 0 NVIDIA RTX A4000 On | 00000000:19:00.0 Off | Off |
| 41% 34C P8 8W / 140W | 13MiB / 16117MiB | 0% Default |
| | | N/A |
+-------------------------------+----------------------+----------------------+
| 1 NVIDIA RTX A4000 On | 00000000:1A:00.0 Off | Off |
| 41% 36C P8 7W / 140W | 13MiB / 16117MiB | 0% Default |
| | | N/A |
+-------------------------------+----------------------+----------------------+
| 2 NVIDIA RTX A4000 On | 00000000:67:00.0 Off | Off |
| 41% 34C P8 6W / 140W | 13MiB / 16117MiB | 0% Default |
| | | N/A |
+-------------------------------+----------------------+----------------------+
| 3 NVIDIA RTX A4000 On | 00000000:68:00.0 On | Off |
| 41% 37C P8 9W / 140W | 352MiB / 16116MiB | 1% Default |
| | | N/A |
+-------------------------------+----------------------+----------------------+

Errors encountered when trying to run multimer mode

Hi there, I tried to run multimer mode (monomer mode worked well), but encountered the following errors:
context 0x5618aefca000: CUDA_ERROR_ILLEGAL_ADDRESS: an illegal memory access was encountered
2022-02-07 22:20:55.956803: E external/org_tensorflow/tensorflow/stream_executor/stream.cc:310] failed to allocate stream during initialization
2022-02-07 22:20:55.956815: E external/org_tensorflow/tensorflow/stream_executor/cuda/cuda_driver.cc:618] unable to add host callback: CUDA_ERROR_INVALID_HANDLE: invalid resource handle
2022-02-07 22:20:55.956811: E external/org_tensorflow/tensorflow/stream_executor/cuda/cuda_driver.cc:618] unable to add host callback: CUDA_ERROR_INVALID_HANDLE: invalid resource handle
2022-02-07 22:20:55.956826: E external/org_tensorflow/tensorflow/stream_executor/cuda/cuda_event.cc:29] Error polling for event status: failed to query event: CUDA_ERROR_ILLEGAL_ADDRESS: an illegal memory access was encountered
2022-02-07 22:20:55.956834: E external/org_tensorflow/tensorflow/stream_executor/cuda/cuda_driver.cc:699] could not allocate CUDA stream for context 0x5618aefca000: CUDA_ERROR_ILLEGAL_ADDRESS: an illegal memory access was encountered

I checked tensorflow, CUDA, etc. and it all seems to have been installed correctly. One thing I did notice is that only GPU 0 was used - even though I specified 0,1,2,3 in the commands, and also all 4 GPUs were visible. Are these issues related, and how do I resolve them?

database paths

Hello author,
In file run_alphafold.sh, database path section:

should
pdb70_database_path="$data_dir/pdb70/pdb70"
uniclust30_database_path="$data_dir/uniclust30/uniclust30_2018_08/uniclust30_2018_08"

be:
pdb70_database_path="$data_dir/pdb70"
uniclust30_database_path="$data_dir/uniclust30/uniclust30_2018_08"

Thank you!
Rong
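
For context, HH-suite's -d option takes a database file prefix rather than a directory, which is why the existing paths keep the trailing pdb70 and uniclust30_2018_08 components. A minimal illustration of the prefix convention (paths hypothetical):

```shell
# An HH-suite "database" is a set of sibling files sharing one prefix:
#   pdb70_a3m.ffdata, pdb70_a3m.ffindex, pdb70_hhm.ffdata, ...
# so -d receives the prefix itself, not the directory containing it.
data_dir=./alphafold_data   # hypothetical location
prefix="$data_dir/pdb70/pdb70"
ls "${prefix}"_* 2>/dev/null || echo "no files matching ${prefix}_*"
```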

cuSolver error

I was able to run AlphaFold on CPU successfully,
but when I try to run the same on GPU I get the error below:

Traceback (most recent call last):
  File "/home/ngayatri/alphafold/run_alphafold.py", line 310, in <module>
    app.run(main)
  File "/home/ngayatri/miniconda3/envs/alphafold/lib/python3.8/site-packages/absl/app.py", line 312, in run
    _run_main(main, args)
  File "/home/ngayatri/miniconda3/envs/alphafold/lib/python3.8/site-packages/absl/app.py", line 258, in _run_main
    sys.exit(main(argv))
  File "/home/ngayatri/alphafold/run_alphafold.py", line 284, in main
    predict_structure(
  File "/home/ngayatri/alphafold/run_alphafold.py", line 149, in predict_structure
    prediction_result = model_runner.predict(processed_feature_dict)
  File "/home/ngayatri/alphafold/alphafold/model/model.py", line 133, in predict
    result = self.apply(self.params, jax.random.PRNGKey(0), feat)
  File "/home/ngayatri/miniconda3/envs/alphafold/lib/python3.8/site-packages/jaxlib/cusolver.py", line 281, in syevd
    lwork, opaque = cusolver_kernels.build_syevj_descriptor(
RuntimeError: cuSolver internal error

I have checked the CUDA version as well; it is present.

thank you

Taking very long time for protein structure prediction

I have installed AlphaFold2 using the non-docker method on an HPC system, and I am running the script on a GPU (V100 with 16 GB of memory).
For a sequence of around 200 amino acids, it takes around 8 hours for structure prediction.
The AlphaFold2 paper (https://www.nature.com/articles/s41586-021-03819-2.pdf) states that "Representative timings for the neural network using a single model on V100 GPU are 4.8 min with 256 residues, 9.2 min with 384 residues and 18 h at 2,500 residues".
I have pasted the output logs below.
Any idea why it is running so slowly in my case, although I am using the same GPU?
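
One thing worth checking from the log below: lines like "Could not load dynamic library 'libcuda.so.1'" and "failed call to cuInit" mean JAX/TensorFlow fell back to the CPU, which alone can explain run times of hours instead of minutes. A minimal probe (sketch, assuming python3 is available on the compute node):

```shell
# Try to dlopen the CUDA driver library the same way TensorFlow/JAX do;
# if this fails, the model runs on CPU even when a GPU is allocated.
python3 - <<'EOF'
import ctypes
try:
    ctypes.CDLL("libcuda.so.1")
    print("libcuda.so.1 loaded - GPU backend should be usable")
except OSError:
    print("libcuda.so.1 not found - JAX/TF will fall back to CPU")
EOF
```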

I0412 09:13:32.372322 140040161777472 tpu_client.py:54] Starting the local TPU driver.
I0412 09:13:32.539564 140040161777472 xla_bridge.py:212] Unable to initialize backend 'tpu_driver': Not found: Unable to find driver in registry given worker: local://
I0412 09:13:32.890057 140040161777472 xla_bridge.py:212] Unable to initialize backend 'tpu': Invalid argument: TpuPlatform is not available.
I0412 09:13:38.549945 140040161777472 run_alphafold.py:376] Have 5 models: ['model_1_pred_0', 'model_2_pred_0', 'model_3_pred_0', 'model_4_pred_0', 'model_5_pred_0']
I0412 09:13:38.550210 140040161777472 run_alphafold.py:393] Using random seed 1060063058774185674 for the data pipeline
I0412 09:13:38.550483 140040161777472 run_alphafold.py:161] Predicting A0A016TJD3
I0412 09:13:38.566384 140040161777472 jackhmmer.py:133] Launching subprocess "/home/laddhadi/.conda/envs/alphafold/bin/jackhmmer -o /dev/null -A /tmp/tmpgcj2ht0b/output.sto --noali --F1 0.0005 --F2 5e-05 --F3 5e-07 --incE 0.0001 -E 0.0001 --cpu 8 -N 1 /scratch/laddhadi/fasta_files/A0A016TJD3.txt /scratch/laddhadi/alphafold_data//uniref90/uniref90.fasta"
I0412 09:13:38.641884 140040161777472 utils.py:36] Started Jackhmmer (uniref90.fasta) query
I0412 09:20:02.565351 140040161777472 utils.py:40] Finished Jackhmmer (uniref90.fasta) query in 383.923 seconds
I0412 09:20:02.754225 140040161777472 jackhmmer.py:133] Launching subprocess "/home/laddhadi/.conda/envs/alphafold/bin/jackhmmer -o /dev/null -A /tmp/tmpzagxg5ib/output.sto --noali --F1 0.0005 --F2 5e-05 --F3 5e-07 --incE 0.0001 -E 0.0001 --cpu 8 -N 1 /scratch/laddhadi/fasta_files/A0A016TJD3.txt /scratch/laddhadi/alphafold_data//mgnify/mgy_clusters_2018_12.fa"
I0412 09:20:02.801197 140040161777472 utils.py:36] Started Jackhmmer (mgy_clusters_2018_12.fa) query
I0412 09:27:44.650721 140040161777472 utils.py:40] Finished Jackhmmer (mgy_clusters_2018_12.fa) query in 461.849 seconds
I0412 09:27:46.027261 140040161777472 hhsearch.py:85] Launching subprocess "/home/laddhadi/.conda/envs/alphafold/bin/hhsearch -i /tmp/tmp7u0wxikv/query.a3m -o /tmp/tmp7u0wxikv/output.hhr -maxseq 1000000 -d /scratch/laddhadi/alphafold_data//pdb70/pdb70"
I0412 09:27:46.121132 140040161777472 utils.py:36] Started HHsearch query
I0412 09:34:32.186318 140040161777472 utils.py:40] Finished HHsearch query in 406.065 seconds
I0412 09:34:32.942785 140040161777472 hhblits.py:128] Launching subprocess "/home/laddhadi/.conda/envs/alphafold/bin/hhblits -i /scratch/laddhadi/fasta_files/A0A016TJD3.txt -cpu 4 -oa3m /tmp/tmp_07sbvqa/output.a3m -o /dev/null -n 3 -e 0.001 -maxseq 1000000 -realign_max 100000 -maxfilt 100000 -min_prefilter_hits 1000 -d /scratch/laddhadi/alphafold_data//bfd/bfd_metaclust_clu_complete_id30_c90_final_seq.sorted_opt -d /scratch/laddhadi/alphafold_data//uniclust30/uniclust30_2018_08/uniclust30_2018_08"
I0412 09:34:33.030554 140040161777472 utils.py:36] Started HHblits query
I0409 18:56:30.223437 139865793787712 utils.py:40] Finished HHblits query in 16437.296 seconds
I0409 18:56:30.833186 139865793787712 templates.py:878] Searching for template for: MEAGGVADSLLSGACVLFTLGMFSSGLSDLRHMRMTRSVDNVQFLPFLTTDINNLSWLSYGALKGDGTLIIVNSVGAMLQTLYILVYLHYCPRKRGVLLQTAALLGVLLLGFGYFWLLVPDLEARLQWLGLFCSVFTISMYLSPLADLAKVIQTKSAQHFSFSLTIATLLASASWTLYGFRLKDPYITVPNFPGIVTSFIRLWLFWKYSQKPARNSQLLQT
I0409 18:56:30.946721 139865793787712 templates.py:267] Found an exact template match 5xpd_A.
I0409 18:56:31.633340 139865793787712 templates.py:267] Found an exact template match 5ctg_B.
I0409 18:56:31.643278 139865793787712 templates.py:267] Found an exact template match 5ctg_B.
I0409 18:56:31.652832 139865793787712 templates.py:267] Found an exact template match 5xpd_A.
I0409 18:56:31.664968 139865793787712 templates.py:267] Found an exact template match 5ctg_B.
I0409 18:56:31.674515 139865793787712 templates.py:267] Found an exact template match 5xpd_A.
I0409 18:56:32.157502 139865793787712 templates.py:267] Found an exact template match 4rng_D.
I0409 18:56:32.281505 139865793787712 templates.py:267] Found an exact template match 4x5m_B.
I0409 18:56:32.459235 139865793787712 templates.py:267] Found an exact template match 4qnd_A.
I0409 18:56:32.464134 139865793787712 templates.py:267] Found an exact template match 4qnd_A.
I0409 18:56:32.468961 139865793787712 templates.py:267] Found an exact template match 4rng_D.
I0409 18:56:32.473401 139865793787712 templates.py:267] Found an exact template match 4x5m_B.
I0409 18:56:32.609357 139865793787712 templates.py:267] Found an exact template match 5uhq_A.
I0409 18:56:32.688276 139865793787712 templates.py:267] Found an exact template match 4qnc_B.
I0409 18:56:32.692711 139865793787712 templates.py:267] Found an exact template match 5uhq_A.
I0409 18:56:32.697443 139865793787712 templates.py:267] Found an exact template match 4qnc_B.
I0409 18:56:32.701975 139865793787712 templates.py:267] Found an exact template match 5uhq_D.
I0409 18:56:32.706474 139865793787712 templates.py:267] Found an exact template match 5uhq_D.
I0409 18:56:32.710904 139865793787712 templates.py:718] hit 5j4i_B did not pass prefilter: Proportion of residues aligned to query too small. Align ratio: 0.04524886877828054.
I0409 18:56:32.710995 139865793787712 templates.py:912] Skipped invalid hit 5J4I_B Arginine/agmatine antiporter; AdiC, Transporter, Membrane Protein, Transport; 2.207A {Escherichia coli O157:H7}, error: None, warning: None
I0409 18:56:32.711062 139865793787712 templates.py:718] hit 3ob6_B did not pass prefilter: Proportion of residues aligned to query too small. Align ratio: 0.04072398190045249.
I0409 18:56:32.711109 139865793787712 templates.py:912] Skipped invalid hit 3OB6_B AdiC protein; Amino acid antiporter, Arginine, Membrane; HET: ARG; 3.0A {Escherichia coli}, error: None, warning: None
I0409 18:56:33.124143 139865793787712 pipeline.py:234] Uniref90 MSA size: 5963 sequences.
I0409 18:56:33.124355 139865793787712 pipeline.py:235] BFD MSA size: 1450 sequences.
I0409 18:56:33.124414 139865793787712 pipeline.py:236] MGnify MSA size: 135 sequences.
I0409 18:56:33.124469 139865793787712 pipeline.py:237] Final (deduplicated) MSA size: 7472 sequences.
I0409 18:56:33.124686 139865793787712 pipeline.py:239] Total number of templates (NB: this can include bad templates and is later filtered to top 4): 18.
I0409 18:56:33.153060 139865793787712 run_alphafold.py:190] Running model model_1_pred_0 on sweet_metazoan_F7D9S0
2022-04-09 18:56:37.475985: W tensorflow/stream_executor/platform/default/dso_loader.cc:64] Could not load dynamic library 'libcuda.so.1'; dlerror: libcuda.so.1: cannot open shared object file: No such file or directory
2022-04-09 18:56:37.493497: W tensorflow/stream_executor/cuda/cuda_driver.cc:326] failed call to cuInit: UNKNOWN ERROR (303)
I0409 18:56:43.118809 139865793787712 model.py:165] Running predict with shape(feat) = {'aatype': (4, 221), 'residue_index': (4, 221), 'seq_length': (4,), 'template_aatype': (4, 4, 221), 'template_all_atom_masks': (4, 4, 221, 37), 'template_all_atom_positions': (4, 4, 221, 37, 3), 'template_sum_probs': (4, 4, 1), 'is_distillation': (4,), 'seq_mask': (4, 221), 'msa_mask': (4, 508, 221), 'msa_row_mask': (4, 508), 'random_crop_to_size_seed': (4, 2), 'template_mask': (4, 4), 'template_pseudo_beta': (4, 4, 221, 3), 'template_pseudo_beta_mask': (4, 4, 221), 'atom14_atom_exists': (4, 221, 14), 'residx_atom14_to_atom37': (4, 221, 14), 'residx_atom37_to_atom14': (4, 221, 37), 'atom37_atom_exists': (4, 221, 37), 'extra_msa': (4, 5120, 221), 'extra_msa_mask': (4, 5120, 221), 'extra_msa_row_mask': (4, 5120), 'bert_mask': (4, 508, 221), 'true_msa': (4, 508, 221), 'extra_has_deletion': (4, 5120, 221), 'extra_deletion_value': (4, 5120, 221), 'msa_feat': (4, 508, 221, 49), 'target_feat': (4, 221, 22)}
2022-04-09 19:00:32.892367: E external/org_tensorflow/tensorflow/compiler/xla/service/slow_operation_alarm.cc:55] 
********************************
Very slow compile?  If you want to file a bug, run with envvar XLA_FLAGS=--xla_dump_to=/tmp/foo and attach the results.
Compiling module jit_apply_fn.149819
********************************
I0409 19:27:21.522277 139865793787712 model.py:175] Output shape was {'distogram': {'bin_edges': (63,), 'logits': (221, 221, 64)}, 'experimentally_resolved': {'logits': (221, 37)}, 'masked_msa': {'logits': (508, 221, 23)}, 'predicted_lddt': {'logits': (221, 50)}, 'structure_module': {'final_atom_mask': (221, 37), 'final_atom_positions': (221, 37, 3)}, 'plddt': (221,), 'ranking_confidence': ()}
I0409 19:27:21.522848 139865793787712 run_alphafold.py:202] Total JAX model model_1_pred_0 on sweet_metazoan_F7D9S0 predict time (includes compilation time, see --benchmark): 1838.4s
I0409 19:27:26.533854 139865793787712 amber_minimize.py:177] alterations info: {'nonstandard_residues': [], 'removed_heterogens': set(), 'missing_residues': {}, 'missing_heavy_atoms': {}, 'missing_terminals': {<Residue 220 (THR) of chain 0>: ['OXT']}, 'Se_in_MET': [], 'removed_chains': {0: []}}
I0409 19:27:28.535824 139865793787712 amber_minimize.py:407] Minimizing protein, attempt 1 of 100.
I0409 19:27:29.179506 139865793787712 amber_minimize.py:68] Restraining 1742 / 3532 particles.
I0409 19:27:49.272111 139865793787712 amber_minimize.py:177] alterations info: {'nonstandard_residues': [], 'removed_heterogens': set(), 'missing_residues': {}, 'missing_heavy_atoms': {}, 'missing_terminals': {}, 'Se_in_MET': [], 'removed_chains': {0: []}}
I0409 19:27:53.467640 139865793787712 amber_minimize.py:497] Iteration completed: Einit 5198.28 Efinal -4264.63 Time 19.18 s num residue violations 0 num residue exclusions 0 
I0409 19:27:55.588373 139865793787712 amber_minimize.py:177] alterations info: {'nonstandard_residues': [], 'removed_heterogens': set(), 'missing_residues': {}, 'missing_heavy_atoms': {}, 'missing_terminals': {<Residue 220 (THR) of chain 0>: ['OXT']}, 'Se_in_MET': [], 'removed_chains': {0: []}}
I0409 19:27:56.617614 139865793787712 run_alphafold.py:190] Running model model_2_pred_0 on sweet_metazoan_F7D9S0
I0409 19:27:59.180104 139865793787712 model.py:165] Running predict with shape(feat) = {'aatype': (4, 221), 'residue_index': (4, 221), 'seq_length': (4,), 'template_aatype': (4, 4, 221), 'template_all_atom_masks': (4, 4, 221, 37), 'template_all_atom_positions': (4, 4, 221, 37, 3), 'template_sum_probs': (4, 4, 1), 'is_distillation': (4,), 'seq_mask': (4, 221), 'msa_mask': (4, 508, 221), 'msa_row_mask': (4, 508), 'random_crop_to_size_seed': (4, 2), 'template_mask': (4, 4), 'template_pseudo_beta': (4, 4, 221, 3), 'template_pseudo_beta_mask': (4, 4, 221), 'atom14_atom_exists': (4, 221, 14), 'residx_atom14_to_atom37': (4, 221, 14), 'residx_atom37_to_atom14': (4, 221, 37), 'atom37_atom_exists': (4, 221, 37), 'extra_msa': (4, 1024, 221), 'extra_msa_mask': (4, 1024, 221), 'extra_msa_row_mask': (4, 1024), 'bert_mask': (4, 508, 221), 'true_msa': (4, 508, 221), 'extra_has_deletion': (4, 1024, 221), 'extra_deletion_value': (4, 1024, 221), 'msa_feat': (4, 508, 221, 49), 'target_feat': (4, 221, 22)}
2022-04-09 19:31:45.701855: E external/org_tensorflow/tensorflow/compiler/xla/service/slow_operation_alarm.cc:55] 
********************************
Very slow compile?  If you want to file a bug, run with envvar XLA_FLAGS=--xla_dump_to=/tmp/foo and attach the results.
Compiling module jit_apply_fn__1.149819
********************************
I0409 19:54:31.859675 139865793787712 model.py:175] Output shape was {'distogram': {'bin_edges': (63,), 'logits': (221, 221, 64)}, 'experimentally_resolved': {'logits': (221, 37)}, 'masked_msa': {'logits': (508, 221, 23)}, 'predicted_lddt': {'logits': (221, 50)}, 'structure_module': {'final_atom_mask': (221, 37), 'final_atom_positions': (221, 37, 3)}, 'plddt': (221,), 'ranking_confidence': ()}
I0409 19:54:31.860479 139865793787712 run_alphafold.py:202] Total JAX model model_2_pred_0 on sweet_metazoan_F7D9S0 predict time (includes compilation time, see --benchmark): 1592.7s
I0409 19:54:35.127964 139865793787712 amber_minimize.py:177] alterations info: {'nonstandard_residues': [], 'removed_heterogens': set(), 'missing_residues': {}, 'missing_heavy_atoms': {}, 'missing_terminals': {<Residue 220 (THR) of chain 0>: ['OXT']}, 'Se_in_MET': [], 'removed_chains': {0: []}}
I0409 19:54:35.358972 139865793787712 amber_minimize.py:407] Minimizing protein, attempt 1 of 100.
I0409 19:54:35.695668 139865793787712 amber_minimize.py:68] Restraining 1742 / 3532 particles.
I0409 19:54:50.386341 139865793787712 amber_minimize.py:177] alterations info: {'nonstandard_residues': [], 'removed_heterogens': set(), 'missing_residues': {}, 'missing_heavy_atoms': {}, 'missing_terminals': {}, 'Se_in_MET': [], 'removed_chains': {0: []}}
I0409 19:54:50.959881 139865793787712 amber_minimize.py:497] Iteration completed: Einit 5700.67 Efinal -4254.93 Time 13.62 s num residue violations 0 num residue exclusions 0 
I0409 19:54:54.265672 139865793787712 amber_minimize.py:177] alterations info: {'nonstandard_residues': [], 'removed_heterogens': set(), 'missing_residues': {}, 'missing_heavy_atoms': {}, 'missing_terminals': {<Residue 220 (THR) of chain 0>: ['OXT']}, 'Se_in_MET': [], 'removed_chains': {0: []}}
I0409 19:55:01.368335 139865793787712 run_alphafold.py:190] Running model model_3_pred_0 on sweet_metazoan_F7D9S0
I0409 19:55:03.042567 139865793787712 model.py:165] Running predict with shape(feat) = {'aatype': (4, 221), 'residue_index': (4, 221), 'seq_length': (4,), 'is_distillation': (4,), 'seq_mask': (4, 221), 'msa_mask': (4, 512, 221), 'msa_row_mask': (4, 512), 'random_crop_to_size_seed': (4, 2), 'atom14_atom_exists': (4, 221, 14), 'residx_atom14_to_atom37': (4, 221, 14), 'residx_atom37_to_atom14': (4, 221, 37), 'atom37_atom_exists': (4, 221, 37), 'extra_msa': (4, 5120, 221), 'extra_msa_mask': (4, 5120, 221), 'extra_msa_row_mask': (4, 5120), 'bert_mask': (4, 512, 221), 'true_msa': (4, 512, 221), 'extra_has_deletion': (4, 5120, 221), 'extra_deletion_value': (4, 5120, 221), 'msa_feat': (4, 512, 221, 49), 'target_feat': (4, 221, 22)}
2022-04-09 19:58:10.149797: E external/org_tensorflow/tensorflow/compiler/xla/service/slow_operation_alarm.cc:55] 
********************************
Very slow compile?  If you want to file a bug, run with envvar XLA_FLAGS=--xla_dump_to=/tmp/foo and attach the results.
Compiling module jit_apply_fn__2.110442
********************************
I0409 20:22:45.222151 139865793787712 model.py:175] Output shape was {'distogram': {'bin_edges': (63,), 'logits': (221, 221, 64)}, 'experimentally_resolved': {'logits': (221, 37)}, 'masked_msa': {'logits': (512, 221, 23)}, 'predicted_lddt': {'logits': (221, 50)}, 'structure_module': {'final_atom_mask': (221, 37), 'final_atom_positions': (221, 37, 3)}, 'plddt': (221,), 'ranking_confidence': ()}
I0409 20:22:45.222905 139865793787712 run_alphafold.py:202] Total JAX model model_3_pred_0 on sweet_metazoan_F7D9S0 predict time (includes compilation time, see --benchmark): 1662.2s
I0409 20:22:47.995997 139865793787712 amber_minimize.py:177] alterations info: {'nonstandard_residues': [], 'removed_heterogens': set(), 'missing_residues': {}, 'missing_heavy_atoms': {}, 'missing_terminals': {<Residue 220 (THR) of chain 0>: ['OXT']}, 'Se_in_MET': [], 'removed_chains': {0: []}}
I0409 20:22:48.224998 139865793787712 amber_minimize.py:407] Minimizing protein, attempt 1 of 100.
I0409 20:22:48.562712 139865793787712 amber_minimize.py:68] Restraining 1742 / 3532 particles.
I0409 20:23:07.021081 139865793787712 amber_minimize.py:177] alterations info: {'nonstandard_residues': [], 'removed_heterogens': set(), 'missing_residues': {}, 'missing_heavy_atoms': {}, 'missing_terminals': {}, 'Se_in_MET': [], 'removed_chains': {0: []}}
I0409 20:23:07.633584 139865793787712 amber_minimize.py:497] Iteration completed: Einit 5711.85 Efinal -4302.03 Time 17.35 s num residue violations 0 num residue exclusions 0 
I0409 20:23:10.685475 139865793787712 amber_minimize.py:177] alterations info: {'nonstandard_residues': [], 'removed_heterogens': set(), 'missing_residues': {}, 'missing_heavy_atoms': {}, 'missing_terminals': {<Residue 220 (THR) of chain 0>: ['OXT']}, 'Se_in_MET': [], 'removed_chains': {0: []}}
I0409 20:23:11.315222 139865793787712 run_alphafold.py:190] Running model model_4_pred_0 on sweet_metazoan_F7D9S0
I0409 20:23:13.004297 139865793787712 model.py:165] Running predict with shape(feat) = {'aatype': (4, 221), 'residue_index': (4, 221), 'seq_length': (4,), 'is_distillation': (4,), 'seq_mask': (4, 221), 'msa_mask': (4, 512, 221), 'msa_row_mask': (4, 512), 'random_crop_to_size_seed': (4, 2), 'atom14_atom_exists': (4, 221, 14), 'residx_atom14_to_atom37': (4, 221, 14), 'residx_atom37_to_atom14': (4, 221, 37), 'atom37_atom_exists': (4, 221, 37), 'extra_msa': (4, 5120, 221), 'extra_msa_mask': (4, 5120, 221), 'extra_msa_row_mask': (4, 5120), 'bert_mask': (4, 512, 221), 'true_msa': (4, 512, 221), 'extra_has_deletion': (4, 5120, 221), 'extra_deletion_value': (4, 5120, 221), 'msa_feat': (4, 512, 221, 49), 'target_feat': (4, 221, 22)}
I0409 20:50:54.579662 139865793787712 model.py:175] Output shape was {'distogram': {'bin_edges': (63,), 'logits': (221, 221, 64)}, 'experimentally_resolved': {'logits': (221, 37)}, 'masked_msa': {'logits': (512, 221, 23)}, 'predicted_lddt': {'logits': (221, 50)}, 'structure_module': {'final_atom_mask': (221, 37), 'final_atom_positions': (221, 37, 3)}, 'plddt': (221,), 'ranking_confidence': ()}
I0409 20:50:54.580522 139865793787712 run_alphafold.py:202] Total JAX model model_4_pred_0 on sweet_metazoan_F7D9S0 predict time (includes compilation time, see --benchmark): 1661.6s
I0409 20:50:57.246246 139865793787712 amber_minimize.py:177] alterations info: {'nonstandard_residues': [], 'removed_heterogens': set(), 'missing_residues': {}, 'missing_heavy_atoms': {}, 'missing_terminals': {<Residue 220 (THR) of chain 0>: ['OXT']}, 'Se_in_MET': [], 'removed_chains': {0: []}}
I0409 20:50:57.478736 139865793787712 amber_minimize.py:407] Minimizing protein, attempt 1 of 100.
I0409 20:50:57.835792 139865793787712 amber_minimize.py:68] Restraining 1742 / 3532 particles.
I0409 20:51:16.007959 139865793787712 amber_minimize.py:177] alterations info: {'nonstandard_residues': [], 'removed_heterogens': set(), 'missing_residues': {}, 'missing_heavy_atoms': {}, 'missing_terminals': {}, 'Se_in_MET': [], 'removed_chains': {0: []}}
I0409 20:51:16.629822 139865793787712 amber_minimize.py:497] Iteration completed: Einit 5922.10 Efinal -4334.91 Time 15.28 s num residue violations 0 num residue exclusions 0 
I0409 20:51:18.662594 139865793787712 amber_minimize.py:177] alterations info: {'nonstandard_residues': [], 'removed_heterogens': set(), 'missing_residues': {}, 'missing_heavy_atoms': {}, 'missing_terminals': {<Residue 220 (THR) of chain 0>: ['OXT']}, 'Se_in_MET': [], 'removed_chains': {0: []}}
I0409 20:51:19.123664 139865793787712 run_alphafold.py:190] Running model model_5_pred_0 on sweet_metazoan_F7D9S0
I0409 20:51:21.032859 139865793787712 model.py:165] Running predict with shape(feat) = {'aatype': (4, 221), 'residue_index': (4, 221), 'seq_length': (4,), 'is_distillation': (4,), 'seq_mask': (4, 221), 'msa_mask': (4, 512, 221), 'msa_row_mask': (4, 512), 'random_crop_to_size_seed': (4, 2), 'atom14_atom_exists': (4, 221, 14), 'residx_atom14_to_atom37': (4, 221, 14), 'residx_atom37_to_atom14': (4, 221, 37), 'atom37_atom_exists': (4, 221, 37), 'extra_msa': (4, 1024, 221), 'extra_msa_mask': (4, 1024, 221), 'extra_msa_row_mask': (4, 1024), 'bert_mask': (4, 512, 221), 'true_msa': (4, 512, 221), 'extra_has_deletion': (4, 1024, 221), 'extra_deletion_value': (4, 1024, 221), 'msa_feat': (4, 512, 221, 49), 'target_feat': (4, 221, 22)}
2022-04-09 20:54:27.613558: E external/org_tensorflow/tensorflow/compiler/xla/service/slow_operation_alarm.cc:55] 
********************************
Very slow compile?  If you want to file a bug, run with envvar XLA_FLAGS=--xla_dump_to=/tmp/foo and attach the results.
Compiling module jit_apply_fn__4.110442
********************************
I0409 21:14:56.327582 139865793787712 model.py:175] Output shape was {'distogram': {'bin_edges': (63,), 'logits': (221, 221, 64)}, 'experimentally_resolved': {'logits': (221, 37)}, 'masked_msa': {'logits': (512, 221, 23)}, 'predicted_lddt': {'logits': (221, 50)}, 'structure_module': {'final_atom_mask': (221, 37), 'final_atom_positions': (221, 37, 3)}, 'plddt': (221,), 'ranking_confidence': ()}
I0409 21:14:56.328574 139865793787712 run_alphafold.py:202] Total JAX model model_5_pred_0 on sweet_metazoan_F7D9S0 predict time (includes compilation time, see --benchmark): 1415.3s
I0409 21:14:58.848648 139865793787712 amber_minimize.py:177] alterations info: {'nonstandard_residues': [], 'removed_heterogens': set(), 'missing_residues': {}, 'missing_heavy_atoms': {}, 'missing_terminals': {<Residue 220 (THR) of chain 0>: ['OXT']}, 'Se_in_MET': [], 'removed_chains': {0: []}}
I0409 21:14:59.077749 139865793787712 amber_minimize.py:407] Minimizing protein, attempt 1 of 100.
I0409 21:14:59.415221 139865793787712 amber_minimize.py:68] Restraining 1742 / 3532 particles.
I0409 21:15:20.303997 139865793787712 amber_minimize.py:177] alterations info: {'nonstandard_residues': [], 'removed_heterogens': set(), 'missing_residues': {}, 'missing_heavy_atoms': {}, 'missing_terminals': {}, 'Se_in_MET': [], 'removed_chains': {0: []}}
I0409 21:15:20.877890 139865793787712 amber_minimize.py:497] Iteration completed: Einit 5946.69 Efinal -4265.18 Time 18.49 s num residue violations 0 num residue exclusions 0 
I0409 21:15:22.856115 139865793787712 amber_minimize.py:177] alterations info: {'nonstandard_residues': [], 'removed_heterogens': set(), 'missing_residues': {}, 'missing_heavy_atoms': {}, 'missing_terminals': {<Residue 220 (THR) of chain 0>: ['OXT']}, 'Se_in_MET': [], 'removed_chains': {0: []}}
I0409 21:15:23.404444 139865793787712 run_alphafold.py:271] Final timings for sweet_metazoan_F7D9S0: {'features': 17525.107759714127, 'process_features_model_1_pred_0': 9.964778423309326, 'predict_and_compile_model_1_pred_0': 1838.4048104286194, 'relax_model_1_pred_0': 34.420247316360474, 'process_features_model_2_pred_0': 2.56180739402771, 'predict_and_compile_model_2_pred_0': 1592.6808323860168, 'relax_model_2_pred_0': 22.32586359977722, 'process_features_model_3_pred_0': 1.6738569736480713, 'predict_and_compile_model_3_pred_0': 1662.180498123169, 'relax_model_3_pred_0': 25.30202007293701, 'process_features_model_4_pred_0': 1.6887319087982178, 'predict_and_compile_model_4_pred_0': 1661.5763659477234, 'relax_model_4_pred_0': 24.005033254623413, 'process_features_model_5_pred_0': 1.9084506034851074, 'predict_and_compile_model_5_pred_0': 1415.2962276935577, 'relax_model_5_pred_0': 26.46107578277588}

Components of the .out file generated by the HPC:

SLURM_JOB_NAME = af_g_new1
SLURM_JOB_NODELIST = gpu001
SLURM_JOB_UID = 15585
SLURM_JOB_PARTITION = gpu
SLURM_TASK_PID = 33813
SLURM_CPUS_ON_NODE = 40
SLURM_NTASKS = 40
SLURM_TASK_PID = 33813

timings.json file output

{
    "features": 26155.667644023895,
    "process_features_model_1_pred_0": 9.124100923538208,
    "predict_and_compile_model_1_pred_0": 1900.6409442424774,
    "relax_model_1_pred_0": 35.85753798484802,
    "process_features_model_2_pred_0": 2.5820021629333496,
    "predict_and_compile_model_2_pred_0": 1627.4828066825867,
    "relax_model_2_pred_0": 25.265653133392334,
    "process_features_model_3_pred_0": 1.7214865684509277,
    "predict_and_compile_model_3_pred_0": 1700.9324979782104,
    "relax_model_3_pred_0": 35.66818594932556,
    "process_features_model_4_pred_0": 1.80977201461792,
    "predict_and_compile_model_4_pred_0": 1702.664762020111,
    "relax_model_4_pred_0": 25.467517852783203,
    "process_features_model_5_pred_0": 1.7546741962432861,
    "predict_and_compile_model_5_pred_0": 1457.488024711609,
    "relax_model_5_pred_0": 25.19716238975525
}

Thank you,
Aditi

Permission denied

sudo ./run_alphafold.sh -d /home/panfulu/alphafold_database -o ./ -f P00519-2.fasta -t 2020-05-14

/var/tmp/sclLAKGtd: line 8: ./run_alphafold.sh: Permission denied
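The error points at the missing execute bit on run_alphafold.sh rather than at sudo. A minimal self-contained sketch of the two usual fixes, shown with a throwaway script since the real one is not at hand:

```shell
# Reproduce and fix "Permission denied" for a script lacking the execute bit.
tmp=$(mktemp -d)
printf '#!/bin/bash\necho ok\n' > "$tmp/run_demo.sh"

bash "$tmp/run_demo.sh"       # fix 1: invoke through the interpreter; prints "ok"
chmod +x "$tmp/run_demo.sh"   # fix 2: add the execute bit
"$tmp/run_demo.sh"            # direct invocation now works; prints "ok"

rm -r "$tmp"
```

Applied to the real script: either run `chmod +x run_alphafold.sh` once, or prefix every invocation with `bash`.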

Wrong variable names in download_database.sh

Hello,

On line 77, the $ is missing for the uniprot and pdb_seqres arguments to mkdir

mkdir "$params" "$mgnify" "$pdb70" "$pdb_mmcif" "$mmcif_download_dir" "$mmcif_files" "$uniclust30" "$uniref90" "uniprot" "pdb_seqres"

should be

mkdir "$params" "$mgnify" "$pdb70" "$pdb_mmcif" "$mmcif_download_dir" "$mmcif_files" "$uniclust30" "$uniref90" "$uniprot" "$pdb_seqres"

Greetings,
David
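For anyone unsure why the missing `$` matters: without it, mkdir receives the literal strings and creates directories named `uniprot` and `pdb_seqres` in the current working directory instead of at the paths the variables hold. A small self-contained demo:

```shell
# A quoted string without $ is taken literally; with $ it expands to the path.
tmp=$(mktemp -d)
cd "$tmp"
uniprot="$tmp/db/uniprot"
mkdir -p "$tmp/db"
mkdir "uniprot"      # bug: creates ./uniprot (literal name)
mkdir "$uniprot"     # fix: creates $tmp/db/uniprot (intended path)
ls -d uniprot db/uniprot
```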

alphafold.data.tools

Hi, after the installation I get this error:

Traceback (most recent call last):
  File "/gpfs/home/js12009/Projects/Projects/AlphaFold_TEST/AlphaFold_2_2_0/Basic/run_alphafold.py", line 31, in <module>
    from alphafold.data import pipeline
  File "/gpfs/share/apps/miniconda3/gpu/4.9.2/envs/alphafold2.2/lib/python3.8/site-packages/alphafold/data/pipeline.py", line 26, in <module>
    from alphafold.data import templates
  File "/gpfs/share/apps/miniconda3/gpu/4.9.2/envs/alphafold2.2/lib/python3.8/site-packages/alphafold/data/templates.py", line 31, in <module>
    from alphafold.data.tools import kalign
ModuleNotFoundError: No module named 'alphafold.data.tools'

Is there a Python or conda library I am missing from the installation?
The previous imports:
from alphafold.common import protein
from alphafold.common import residue_constants
work fine.
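One quick way to narrow this down is to check whether the installed `alphafold` package actually contains the `data/tools` subpackage — a partial copy of the repo into site-packages is a common cause when sibling subpackages import fine. A hedged diagnostic sketch, demonstrated with stdlib names since `alphafold` may not be importable here; substitute `alphafold.data.tools` on the failing machine:

```shell
# check <dotted.name>: exit 0 if the module path resolves, nonzero otherwise.
check() {
  python3 -c "import importlib.util, sys; sys.exit(0 if importlib.util.find_spec('$1') else 1)" 2>/dev/null
}
check json.tool    && echo "json.tool: present"
check json.no_such || echo "json.no_such: missing"
```

If `check alphafold.data.tools` fails while `check alphafold.common` succeeds, the `data/tools` directory was likely not copied into site-packages, and re-copying the alphafold source tree into the environment should fix it.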

Conflicting Conda packages

First - thanks for the recipe!

During deployment, in the section on dependencies, I ran into a small problem:

https://github.com/kalininalab/alphafold_non_docker#install-dependencies

conda install -y -c anaconda cudnn==8.2.1
conda install -y -c bioconda hmmer hhsuite==3.3.0 kalign2
conda install -y -c conda-forge openmm==7.5.1 cudatoolkit==11.0.3 pdbfixer

installation of cudnn already pulls cudatoolkit as a dependency:


  added / updated specs:
    - cudnn==8.2.1


The following NEW packages will be INSTALLED:

  cudatoolkit        pkgs/main/linux-64::cudatoolkit-11.3.1-h2bc3f7f_2
  cudnn              pkgs/main/linux-64::cudnn-8.2.1-cuda11.3_0

which later conflicts with the
conda install -y -c conda-forge openmm==7.5.1 cudatoolkit==11.0.3 pdbfixer

Seems that the proper way would be to use both cudnn and cudatoolkit from the same repo - conda-forge. I wonder though where the cudnn dependency came from because it is not mentioned in the reference Dockerfile.
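One plausible workaround (an assumption to verify, not a tested recipe) is to request cudnn and cudatoolkit from conda-forge in the same transaction, so the solver cannot first pull a mismatched cudatoolkit from the defaults channel:

```shell
# Pin cudnn and cudatoolkit from one channel in a single solve
# (versions taken from the original recipe; adjust if it changes).
conda install -y -c conda-forge cudnn==8.2.1 cudatoolkit==11.0.3 openmm==7.5.1 pdbfixer
conda install -y -c bioconda hmmer hhsuite==3.3.0 kalign2
```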

alphafold only uses one gpu

I followed the instructions and successfully installed AlphaFold on a cluster. It partially works, but only one GPU gets used.

I added some debug code to the scripts. The logs show that TensorFlow did discover 2 GPUs, but nvidia-smi revealed that data and computation sat on GPU 0 while GPU 1 stayed idle.

here is log:

$HOME/.local/lib/python3.8/site-packages/absl/flags/_validators.py:203: UserWarning: Flag --preset has a non-None default value; therefore, mark_flag_as_required will pass even if flag is not specified in the command line!
  warnings.warn(
I0813 13:45:04.035152 140042046441280 templates.py:837] Using precomputed obsolete pdbs $DATA/pdb_mmcif/obsolete.dat.
I0813 13:45:05.206957 140042046441280 tpu_client.py:54] Starting the local TPU driver.
I0813 13:45:05.239395 140042046441280 xla_bridge.py:214] Unable to initialize backend 'tpu_driver': Not found: Unable to find driver in registry given worker: local://
I0813 13:45:05.629722 140042046441280 xla_bridge.py:214] Unable to initialize backend 'tpu': Invalid argument: TpuPlatform is not available.
I0813 13:45:14.966193 140042046441280 run_alphafold.ano.py:284] Have 5 models: ['model_1', 'model_2', 'model_3', 'model_4', 'model_5']
I0813 13:45:14.966413 140042046441280 run_alphafold.ano.py:297] Using random seed 8606097073378666681 for the data pipeline
I0813 13:45:15.419880 140042046441280 run_alphafold.ano.py:155] Running model model_1
2021-08-13 13:46:07.772502: W tensorflow/core/framework/cpu_allocator_impl.cc:80] Allocation of 6942677504 exceeds 10% of free system memory.
I0813 13:46:09.743540 140042046441280 model.py:145] Running predict with shape(feat) = {'aatype': (32, 2179), 'residue_index': (32, 2179), 'seq_length': (32,), 'template_aatype': (32, 4, 2179), 'template_all_atom_masks': (32, 4, 2179, 37), 'template_all_atom_positions': (32, 4, 2179, 37, 3), 'template_sum_probs': (32, 4, 1), 'is_distillation': (32,), 'seq_mask': (32, 2179), 'msa_mask': (32, 508, 2179), 'msa_row_mask': (32, 508), 'random_crop_to_size_seed': (32, 2), 'template_mask': (32, 4), 'template_pseudo_beta': (32, 4, 2179, 3), 'template_pseudo_beta_mask': (32, 4, 2179), 'atom14_atom_exists': (32, 2179, 14), 'residx_atom14_to_atom37': (32, 2179, 14), 'residx_atom37_to_atom14': (32, 2179, 37), 'atom37_atom_exists': (32, 2179, 37), 'extra_msa': (32, 5120, 2179), 'extra_msa_mask': (32, 5120, 2179), 'extra_msa_row_mask': (32, 5120), 'bert_mask': (32, 508, 2179), 'true_msa': (32, 508, 2179), 'extra_has_deletion': (32, 5120, 2179), 'extra_deletion_value': (32, 5120, 2179), 'msa_feat': (32, 508, 2179, 49), 'target_feat': (32, 2179, 22)}
2021-08-13 13:49:42.988439: W external/org_tensorflow/tensorflow/core/common_runtime/bfc_allocator.cc:457] Allocator (GPU_0_bfc) ran out of memory trying to allocate 39.13GiB (rounded to 42012920064)requested by op 
2021-08-13 13:49:42.991276: W external/org_tensorflow/tensorflow/core/common_runtime/bfc_allocator.cc:468] *******************************************************_____________________________________________
2021-08-13 13:49:42.991431: E external/org_tensorflow/tensorflow/compiler/xla/pjrt/pjrt_stream_executor_client.cc:2040] Execution of replica 0 failed: Resource exhausted: Out of memory while trying to allocate 42012919928 bytes.
visible gpus [PhysicalDevice(name='/physical_device:GPU:0', device_type='GPU'), PhysicalDevice(name='/physical_device:GPU:1', device_type='GPU')]
visible gpus [PhysicalDevice(name='/physical_device:GPU:0', device_type='GPU'), PhysicalDevice(name='/physical_device:GPU:1', device_type='GPU')]
visible gpus [PhysicalDevice(name='/physical_device:GPU:0', device_type='GPU'), PhysicalDevice(name='/physical_device:GPU:1', device_type='GPU')]
visible gpus [PhysicalDevice(name='/physical_device:GPU:0', device_type='GPU'), PhysicalDevice(name='/physical_device:GPU:1', device_type='GPU')]
visible gpus [PhysicalDevice(name='/physical_device:GPU:0', device_type='GPU'), PhysicalDevice(name='/physical_device:GPU:1', device_type='GPU')]
running process_features
2021-08-13 13:45:15 running: process_features
Traceback (most recent call last):
  File "run_alphafold.ano.py", line 328, in <module>
    app.run(main)
  File "$HOME/.local/lib/python3.8/site-packages/absl/app.py", line 312, in run
    _run_main(main, args)
  File "$HOME/.local/lib/python3.8/site-packages/absl/app.py", line 258, in _run_main
    sys.exit(main(argv))
  File "run_alphafold.ano.py", line 301, in main
    predict_structure(
  File "run_alphafold.ano.py", line 162, in predict_structure
    prediction_result = model_runner.predict(processed_feature_dict)
  File "$HOME/alphafold/alphafold-2.0/alphafold/model/model.py", line 147, in predict
    result = self.apply(self.params, jax.random.PRNGKey(0), feat)
  File "$HOME/.conda/envs/alphafold/lib/python3.8/site-packages/jax/_src/traceback_util.py", line 183, in reraise_with_filtered_traceback
    return fun(*args, **kwargs)
  File "$HOME/.conda/envs/alphafold/lib/python3.8/site-packages/jax/_src/api.py", line 399, in cache_miss
    out_flat = xla.xla_call(
  File "$HOME/.conda/envs/alphafold/lib/python3.8/site-packages/jax/core.py", line 1561, in bind
    return call_bind(self, fun, *args, **params)
  File "$HOME/.conda/envs/alphafold/lib/python3.8/site-packages/jax/core.py", line 1552, in call_bind
    outs = primitive.process(top_trace, fun, tracers, params)
  File "$HOME/.conda/envs/alphafold/lib/python3.8/site-packages/jax/core.py", line 1564, in process
    return trace.process_call(self, fun, tracers, params)
  File "$HOME/.conda/envs/alphafold/lib/python3.8/site-packages/jax/core.py", line 607, in process_call
    return primitive.impl(f, *tracers, **params)
  File "$HOME/.conda/envs/alphafold/lib/python3.8/site-packages/jax/interpreters/xla.py", line 610, in _xla_call_impl
    return compiled_fun(*args)
  File "$HOME/.conda/envs/alphafold/lib/python3.8/site-packages/jax/interpreters/xla.py", line 898, in _execute_compiled
    out_bufs = compiled.execute(input_bufs)
jax._src.traceback_util.UnfilteredStackTrace: RuntimeError: Resource exhausted: Out of memory while trying to allocate 42012919928 bytes.

The stack trace below excludes JAX-internal frames.
The preceding is the original exception that occurred, unmodified.

--------------------

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "run_alphafold.ano.py", line 328, in <module>
    app.run(main)
  File "$HOME/.local/lib/python3.8/site-packages/absl/app.py", line 312, in run
    _run_main(main, args)
  File "$HOME/.local/lib/python3.8/site-packages/absl/app.py", line 258, in _run_main
    sys.exit(main(argv))
  File "run_alphafold.ano.py", line 301, in main
    predict_structure(
  File "run_alphafold.ano.py", line 162, in predict_structure
    prediction_result = model_runner.predict(processed_feature_dict)
  File "$HOME/alphafold/alphafold-2.0/alphafold/model/model.py", line 147, in predict
    result = self.apply(self.params, jax.random.PRNGKey(0), feat)
  File "$HOME/.conda/envs/alphafold/lib/python3.8/site-packages/jax/interpreters/xla.py", line 898, in _execute_compiled
    out_bufs = compiled.execute(input_bufs)
RuntimeError: Resource exhausted: Out of memory while trying to allocate 42012919928 bytes.

here is nvidia-smi output:

+-----------------------------------------------------------------------------+
| NVIDIA-SMI 460.32.03    Driver Version: 460.32.03    CUDA Version: 11.2     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|                               |                      |               MIG M. |
|===============================+======================+======================|
|   0  Tesla V100-SXM2...  On   | 00000000:00:0A.0 Off |                  Off |
| N/A   39C    P0    67W / 300W |  29754MiB / 32510MiB |     62%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+
|   1  Tesla V100-SXM2...  On   | 00000000:00:0B.0 Off |                  Off |
| N/A   39C    P0    55W / 300W |    496MiB / 32510MiB |      0%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+
                                                                               
+-----------------------------------------------------------------------------+
| Processes:                                                                  |
|  GPU   GI   CI        PID   Type   Process name                  GPU Memory |
|        ID   ID                                                   Usage      |
|=============================================================================|
|    0   N/A  N/A     52643      C   python                          29749MiB |
|    1   N/A  N/A     52643      C   python                            491MiB |
+-----------------------------------------------------------------------------+
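As far as I understand, this behavior is expected: AlphaFold's JAX inference runs each model on a single device, so the second GPU stays idle, and the out-of-memory failure is the 2179-residue input exceeding one card's 32 GB. The upstream Docker entrypoint handles long sequences by enabling unified memory, which lets that single device spill into host RAM. An environment-setup sketch, with values matching the official run_docker.py defaults (verify against your AlphaFold version):

```shell
# Let the single JAX/TF device oversubscribe GPU memory into host RAM.
export TF_FORCE_UNIFIED_MEMORY=1
export XLA_PYTHON_CLIENT_MEM_FRACTION=4.0
# Optionally pin the job to one card explicitly:
export CUDA_VISIBLE_DEVICES=0
```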

No GPU/TPU found, falling back to CPU on GPU with 4Gb of RAM

I am trying to run the non-docker version of AlphaFold2 from this repo. I succeeded on an AWS GPU instance with a 16 GB GPU, and for the proteins I am inputting, utilisation peaks at around 3 GB according to nvidia-smi while AlphaFold2 is running.

I am now trying the same on a laptop with an Nvidia GPU that has 4 GB of RAM (see info below), but so far I am unable to make the same run_alphafold command see the GPU. Any ideas?


I0820 11:22:14.030323 140191155447616 templates.py:836] Using precomputed obsolete pdbs /bfx_share1/quick_share/alphafold2/db/pdb_mmcif/obsolete.dat.
I0820 11:22:14.230584 140191155447616 xla_bridge.py:236] Unable to initialize backend 'tpu_driver': Not found: Unable to find driver in registry given worker: 
2021-08-20 11:22:14.253180: E external/org_tensorflow/tensorflow/stream_executor/cuda/cuda_driver.cc:271] failed call to cuInit: CUDA_ERROR_UNKNOWN: unknown error
I0820 11:22:14.253384 140191155447616 xla_bridge.py:236] Unable to initialize backend 'gpu': Failed precondition: No visible GPU devices.
I0820 11:22:14.253819 140191155447616 xla_bridge.py:236] Unable to initialize backend 'tpu': Invalid argument: TpuPlatform is not available.
W0820 11:22:14.253926 140191155447616 xla_bridge.py:240] No GPU/TPU found, falling back to CPU. (Set TF_CPP_MIN_LOG_LEVEL=0 and rerun for more info.)
I0820 11:22:15.007403 140191155447616 run_alphafold.py:259] Have 1 models: ['model_1']
I0820 11:22:15.007551 140191155447616 run_alphafold.py:272] Using random seed 3180855101326110185 for the data pipeline
I0820 11:22:15.008080 140191155447616 jackhmmer.py:130] Launching subprocess "/home/user/miniconda3/envs/alphafold/bin/jackhmmer -o /dev/null -A /tmp/tmpjdujyngs/output.sto --noali --F1 0.0005 --F2 5e-05 --F
3 5e-07 --incE 0.0001 -E 0.0001 --cpu 8 -N 1 /home/user/alphafold/CL-1384189538793.fasta /bfx_share1/quick_share/alphafold2/db/uniref90/uniref90.fasta"
I0820 11:22:15.019448 140191155447616 utils.py:36] Started Jackhmmer (uniref90.fasta) query
I0820 11:22:16.779201 140191155447616 utils.py:40] Finished Jackhmmer (uniref90.fasta) query in 1.760 seconds
I0820 11:22:16.786322 140191155447616 jackhmmer.py:130] Launching subprocess "/home/user/miniconda3/envs/alphafold/bin/jackhmmer -o /dev/null -A /tmp/tmpvmikh78k/output.sto --noali --F1 0.0005 --F2 5e-05 --F
3 5e-07 --incE 0.0001 -E 0.0001 --cpu 8 -N 1 /home/user/alphafold/CL-1384189538793.fasta /bfx_share1/quick_share/alphafold2/db/mgnify/mgy_clusters.fa"
I0820 11:22:16.797401 140191155447616 utils.py:36] Started Jackhmmer (mgy_clusters.fa) query


$ ubuntu-drivers devices
== /sys/devices/pci0000:00/0000:00:01.0/0000:01:00.0 ==
modalias : pci:v000010DEd00001F91sv000017AAsd00003A41bc03sc00i00
vendor   : NVIDIA Corporation
model    : TU117M [GeForce GTX 1650 Mobile / Max-Q]
driver   : nvidia-driver-460-server - distro non-free
driver   : nvidia-driver-450-server - distro non-free
driver   : nvidia-driver-470 - distro non-free recommended
driver   : nvidia-driver-460 - distro non-free
driver   : nvidia-driver-418-server - distro non-free
driver   : xserver-xorg-video-nouveau - distro free builtin

$ nvidia-smi
Fri Aug 20 11:22:38 2021       
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 470.57.02    Driver Version: 470.57.02    CUDA Version: 11.4     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|                               |                      |               MIG M. |
|===============================+======================+======================|
|   0  NVIDIA GeForce ...  On   | 00000000:01:00.0 Off |                  N/A |
| N/A   38C    P8     3W /  N/A |    148MiB /  3903MiB |      1%   E. Process |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+
                                                                               
+-----------------------------------------------------------------------------+
| Processes:                                                                  |
|  GPU   GI   CI        PID   Type   Process name                  GPU Memory |
|        ID   ID                                                   Usage      |
|=============================================================================|
|    0   N/A  N/A      1331      G   /usr/lib/xorg/Xorg                 55MiB |
|    0   N/A  N/A      1373      G   /usr/bin/sddm-greeter              88MiB |
+-----------------------------------------------------------------------------+
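`failed call to cuInit: CUDA_ERROR_UNKNOWN` is raised by the driver before AlphaFold is involved, so the 4 GB card is probably not the limiting factor yet. Two driver-side checks worth running first (assumptions about a typical Ubuntu laptop setup, not a guaranteed fix):

```shell
nvidia-smi                   # does the driver itself still see the card?
lsmod | grep nvidia_uvm      # CUDA needs the nvidia_uvm kernel module
sudo modprobe nvidia_uvm     # load it if missing (often lost after suspend
                             # or a driver upgrade without a reboot)
```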

Couldn't get ptxas version string

Hi,

I installed AF_non_docker following this git site, and everything seemed to go smoothly during installation. But when running `bash run_alphafold.sh`, a "Couldn't get ptxas version string" error occurred. Any way to fix this issue?

2021-11-20 11:04:09.946403: W tensorflow/core/framework/cpu_allocator_impl.cc:80] Allocation of 184848424 exceeds 10% of free system memory.
2021-11-20 11:04:10.060729: W tensorflow/core/framework/cpu_allocator_impl.cc:80] Allocation of 184848424 exceeds 10% of free system memory.
2021-11-20 11:04:10.171764: W tensorflow/core/framework/cpu_allocator_impl.cc:80] Allocation of 184848424 exceeds 10% of free system memory.
I1120 11:04:10.312324 139757326907200 model.py:165] Running predict with shape(feat) = {'aatype': (4, 173), 'residue_index': (4, 173), 'seq_length': (4,), 'template_aatype': (4, 4, 173), 'template_all_atom_masks': (4, 4, 173, 37), 'template_all_atom_positions': (4, 4, 173, 37, 3), 'template_sum_probs': (4, 4, 1), 'is_distillation': (4,), 'seq_mask': (4, 173), 'msa_mask': (4, 508, 173), 'msa_row_mask': (4, 508), 'random_crop_to_size_seed': (4, 2), 'template_mask': (4, 4), 'template_pseudo_beta': (4, 4, 173, 3), 'template_pseudo_beta_mask': (4, 4, 173), 'atom14_atom_exists': (4, 173, 14), 'residx_atom14_to_atom37': (4, 173, 14), 'residx_atom37_to_atom14': (4, 173, 37), 'atom37_atom_exists': (4, 173, 37), 'extra_msa': (4, 5120, 173), 'extra_msa_mask': (4, 5120, 173), 'extra_msa_row_mask': (4, 5120), 'bert_mask': (4, 508, 173), 'true_msa': (4, 508, 173), 'extra_has_deletion': (4, 5120, 173), 'extra_deletion_value': (4, 5120, 173), 'msa_feat': (4, 508, 173, 49), 'target_feat': (4, 173, 22)}
2021-11-20 11:04:10.349723: W external/org_tensorflow/tensorflow/stream_executor/gpu/asm_compiler.cc:81] Couldn't get ptxas version string: Internal: Couldn't invoke ptxas --version
2021-11-20 11:04:10.350581: F external/org_tensorflow/tensorflow/compiler/xla/service/gpu/nvptx_compiler.cc:479] ptxas returned an error during compilation of ptx to sass: 'Internal: Failed to launch ptxas' If the error message indicates that a file could not be written, please verify that sufficient filesystem space is provided.
Fatal Python error: Aborted

Thread 0x00007f1bc9d33740 (most recent call first):
File "/mnt/mpathb/alphafold2/miniconda3/envs/alphafold/lib/python3.8/site-packages/jax/interpreters/xla.py", line 474 in backend_compile
File "/mnt/mpathb/alphafold2/miniconda3/envs/alphafold/lib/python3.8/site-packages/jax/interpreters/xla.py", line 863 in compile_or_get_cached
File "/mnt/mpathb/alphafold2/miniconda3/envs/alphafold/lib/python3.8/site-packages/jax/interpreters/xla.py", line 921 in from_xla_computation
File "/mnt/mpathb/alphafold2/miniconda3/envs/alphafold/lib/python3.8/site-packages/jax/interpreters/xla.py", line 892 in compile
File "/mnt/mpathb/alphafold2/miniconda3/envs/alphafold/lib/python3.8/site-packages/jax/interpreters/xla.py", line 759 in _xla_callable_uncached
File "/mnt/mpathb/alphafold2/miniconda3/envs/alphafold/lib/python3.8/site-packages/jax/interpreters/xla.py", line 439 in xla_primitive_callable
File "/mnt/mpathb/alphafold2/miniconda3/envs/alphafold/lib/python3.8/site-packages/jax/_src/util.py", line 180 in cached
File "/mnt/mpathb/alphafold2/miniconda3/envs/alphafold/lib/python3.8/site-packages/jax/_src/util.py", line 187 in wrapper
File "/mnt/mpathb/alphafold2/miniconda3/envs/alphafold/lib/python3.8/site-packages/jax/interpreters/xla.py", line 416 in apply_primitive
File "/mnt/mpathb/alphafold2/miniconda3/envs/alphafold/lib/python3.8/site-packages/jax/core.py", line 624 in process_primitive
File "/mnt/mpathb/alphafold2/miniconda3/envs/alphafold/lib/python3.8/site-packages/jax/core.py", line 272 in bind
File "/mnt/mpathb/alphafold2/miniconda3/envs/alphafold/lib/python3.8/site-packages/jax/_src/lax/lax.py", line 408 in shift_right_logical
File "/mnt/mpathb/alphafold2/miniconda3/envs/alphafold/lib/python3.8/site-packages/jax/_src/prng.py", line 240 in threefry_seed
File "/mnt/mpathb/alphafold2/miniconda3/envs/alphafold/lib/python3.8/site-packages/jax/_src/prng.py", line 202 in seed_with_impl
File "/mnt/mpathb/alphafold2/miniconda3/envs/alphafold/lib/python3.8/site-packages/jax/_src/random.py", line 122 in PRNGKey
File "/mnt/mpathb/alphafold2/alphafold/alphafold/model/model.py", line 167 in predict
File "/mnt/mpathb/alphafold2/alphafold/run_alphafold.py", line 193 in predict_structure
File "/mnt/mpathb/alphafold2/alphafold/run_alphafold.py", line 403 in main
File "/mnt/mpathb/alphafold2/miniconda3/envs/alphafold/lib/python3.8/site-packages/absl/app.py", line 258 in _run_main
File "/mnt/mpathb/alphafold2/miniconda3/envs/alphafold/lib/python3.8/site-packages/absl/app.py", line 312 in run
File "/mnt/mpathb/alphafold2/alphafold/run_alphafold.py", line 427 in <module>

Error with Amber minimization toggle while running in multimer mode

I have been working to get both monomer and multimer predictions running on my school's HPC, but I continue to run into the same error with Amber relaxation:

Error initializing CUDA: CUDA error (34) at /home/conda/feedstock_root/build_artifacts/openmm_1622798701405/work/platforms/cuda/src/CudaContext.cpp:138
Traceback (most recent call last):
File "/gpfs/share/apps/miniconda3/gpu/4.9.2/envs/alphafold220/alphafold-2.2.0/run_alphafold.py", line 422, in <module>
app.run(main)
File "/gpfs/share/apps/miniconda3/gpu/4.9.2/envs/alphafold220/lib/python3.8/site-packages/absl/app.py", line 312, in run
_run_main(main, args)
File "/gpfs/share/apps/miniconda3/gpu/4.9.2/envs/alphafold220/lib/python3.8/site-packages/absl/app.py", line 258, in _run_main
sys.exit(main(argv))
File "/gpfs/share/apps/miniconda3/gpu/4.9.2/envs/alphafold220/alphafold-2.2.0/run_alphafold.py", line 398, in main
predict_structure(
File "/gpfs/share/apps/miniconda3/gpu/4.9.2/envs/alphafold220/alphafold-2.2.0/run_alphafold.py", line 242, in predict_structure
relaxed_pdb_str, _, _ = amber_relaxer.process(prot=unrelaxed_protein)
File "/gpfs/share/apps/miniconda3/gpu/4.9.2/envs/alphafold220/alphafold-2.2.0/alphafold/relax/relax.py", line 61, in process
out = amber_minimize.run_pipeline(
File "/gpfs/share/apps/miniconda3/gpu/4.9.2/envs/alphafold220/alphafold-2.2.0/alphafold/relax/amber_minimize.py", line 475, in run_pipeline
ret = _run_one_iteration(
File "/gpfs/share/apps/miniconda3/gpu/4.9.2/envs/alphafold220/alphafold-2.2.0/alphafold/relax/amber_minimize.py", line 419, in _run_one_iteration
raise ValueError(f"Minimization failed after {max_attempts} attempts.")
ValueError: Minimization failed after 100 attempts.

This error can be avoided in monomer predictions by setting the "-r" flag to false, but setting the flag to false in multimer mode changes nothing and I receive the same error as before.

More generally, Amber relaxation will not work in either mode, so if there is a solution, please let me know!
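Not a fix, but a pointer: the "Error initializing CUDA: CUDA error (34)" line comes from OpenMM's CUDA platform rather than from JAX, so aside from the -r toggle, upstream AlphaFold 2.2.0's run_alphafold.py also exposes --use_gpu_relax, which keeps relaxation enabled but runs the minimization on OpenMM's CPU platform. A sketch of the flag pair (names assumed from upstream 2.2.0; the non-docker wrapper may expose them under different option letters):

```shell
# Sketch only: keep relaxation on, but force it onto the CPU platform so
# OpenMM never creates a CUDA context (AlphaFold 2.2.0 upstream flag names).
RELAX_FLAGS="--run_relax=true --use_gpu_relax=false"
echo "$RELAX_FLAGS"
```

If the wrapper script does not forward a use_gpu_relax option, appending these flags to the underlying `python run_alphafold.py` invocation inside the script is worth trying.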

Unable to find CIF files

I am getting the following error:
ValueError: Could not find CIFs in /path/to/mmcif_files

I have checked and the files are present. Is there a way around this?
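In upstream AlphaFold, the template featurizer discovers templates by globbing for *.cif directly inside the directory passed via --template_mmcif_dir, so pointing the flag one level too high (e.g. at pdb_mmcif instead of pdb_mmcif/mmcif_files) raises this ValueError even though the files exist one level below. A stdlib-only check you can run against the exact path you pass, assuming nothing about AlphaFold itself:

```python
import glob
import os

def find_cifs(mmcif_dir: str) -> list:
    """Mimic the non-recursive glob AlphaFold uses to discover mmCIF files."""
    return sorted(glob.glob(os.path.join(mmcif_dir, "*.cif")))
```

If `find_cifs("/path/to/mmcif_files")` returns an empty list, the .cif files are not directly inside that directory (a nested subdirectory, a symlink that doesn't resolve, or a different extension are common causes).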

another CUDA oddity

We've now installed alphafold_non_docker on a Linux system with an NVIDIA Quadro P1000 (4GB) but the system also has a 2GB NVIDIA card that appears as device 0 in nvidia-smi.

When attempting to use the bash script with -a 1, it actually uses the smaller card and runs out of memory, which is expected: this input protein peaks at 3 GB of memory on another computer where it runs successfully.
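One likely explanation for the index mismatch, offered as a guess: by default CUDA enumerates GPUs fastest-first, while nvidia-smi lists them in PCI bus order, so "device 1" can refer to different cards in the two views. Forcing PCI ordering before launching makes the index passed via -a agree with nvidia-smi:

```shell
# Make CUDA's device numbering match nvidia-smi's PCI ordering, then select
# the card that nvidia-smi shows as index 1 (the 4 GB Quadro here).
export CUDA_DEVICE_ORDER=PCI_BUS_ID
export CUDA_VISIBLE_DEVICES=1
```

Both variables must be set in the environment of the process that initializes CUDA, i.e. before the run_alphafold.sh invocation.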

When attempting without the -a flag, or with -a 0, it runs on the 4 GB device, which is listed as device 1 in nvidia-smi. It runs for a while, but at the prediction step it crashes with this error:

You do not need to update to CUDA 9.2.88; cherry-picking the ptxas binary is sufficient.
2021-08-31 12:14:16.286331: W external/org_tensorflow/tensorflow/stream_executor/platform/default/dso_loader.cc:64] Could not load dynamic library 'libcusolver.so.11'; dlerror: libcusolver.so.11: cannot open shared object file: No such file or directory; LD_LIBRARY_PATH:
 :/usr/lib/oracle/12.2/client64/lib/:/usr/lib/oracle/12.2/client64
Traceback (most recent call last):                                                                                                                                                                                                                                             
  File "/home/user/alphafold/run_alphafold.py", line 302, in <module>                                                                                                                                                                                                      
    app.run(main)                                                                                                                                                                                                                                                              
  File "/data/miniconda3/envs/alphafold/lib/python3.8/site-packages/absl/app.py", line 312, in run                                                                                                                                                                             
    _run_main(main, args)                                                                                                                                                                                                                                                      
  File "/data/miniconda3/envs/alphafold/lib/python3.8/site-packages/absl/app.py", line 258, in _run_main                                                                                                                                                                       
    sys.exit(main(argv))                                                                                                                                                                                                                                                       
  File "/home/user/alphafold/run_alphafold.py", line 276, in main                                                                                                                                                                                                          
    predict_structure(                                                                                                                                                                                                                                                         
  File "/home/user/alphafold/run_alphafold.py", line 148, in predict_structure                                                                                                                                                                                             
    prediction_result = model_runner.predict(processed_feature_dict)                                                                                                                                                                                                           
  File "/home/user/alphafold/alphafold/model/model.py", line 133, in predict                                                                                                                                                                                               
    result = self.apply(self.params, jax.random.PRNGKey(0), feat)                                                                                                                                                                                                              
  File "/data/miniconda3/envs/alphafold/lib/python3.8/site-packages/jax/_src/traceback_util.py", line 162, in reraise_with_filtered_traceback                                                                                                                                  
    return fun(*args, **kwargs)                                                                                                                                                                                                                                                
  File "/data/miniconda3/envs/alphafold/lib/python3.8/site-packages/jax/_src/api.py", line 405, in cache_miss                                                                                                                                                                  
    out_flat = xla.xla_call(                                                                                                                                                                                                                                                   
  File "/data/miniconda3/envs/alphafold/lib/python3.8/site-packages/jax/core.py", line 1614, in bind                                                                                                                                                                           
    return call_bind(self, fun, *args, **params)                                                                                                                                                                                                                               
  File "/data/miniconda3/envs/alphafold/lib/python3.8/site-packages/jax/core.py", line 1605, in call_bind                                                                                                                                                                      
    outs = primitive.process(top_trace, fun, tracers, params)
  File "/data/miniconda3/envs/alphafold/lib/python3.8/site-packages/jax/core.py", line 1617, in process
    return trace.process_call(self, fun, tracers, params)                                                                                                                                                                                                                      
  File "/data/miniconda3/envs/alphafold/lib/python3.8/site-packages/jax/core.py", line 613, in process_call                                                                                                                                                                    
    return primitive.impl(f, *tracers, **params)                                                                        
  File "/data/miniconda3/envs/alphafold/lib/python3.8/site-packages/jax/interpreters/xla.py", line 619, in _xla_call_impl
    compiled_fun = _xla_callable(fun, device, backend, name, donated_invars,                                                                                                                                                                     
  File "/data/miniconda3/envs/alphafold/lib/python3.8/site-packages/jax/linear_util.py", line 262, in memoized_fun                                                                                                                                                             
    ans = call(fun, *args)                                                                                                                                                                                                                                                     
  File "/data/miniconda3/envs/alphafold/lib/python3.8/site-packages/jax/interpreters/xla.py", line 752, in _xla_callable                                      
    out_nodes = jaxpr_subcomp(                                                                                                              
  File "/data/miniconda3/envs/alphafold/lib/python3.8/site-packages/jax/interpreters/xla.py", line 487, in jaxpr_subcomp
    ans = rule(c, axis_env, extend_name_stack(name_stack, eqn.primitive.name),                                                                                                                                                                                                 
  File "/data/miniconda3/envs/alphafold/lib/python3.8/site-packages/jax/_src/lax/control_flow.py", line 350, in _while_loop_translation_rule                                                                                                                                   
    new_z = xla.jaxpr_subcomp(body_c, body_jaxpr.jaxpr, backend, axis_env,                                                 
  File "/data/miniconda3/envs/alphafold/lib/python3.8/site-packages/jax/interpreters/xla.py", line 487, in jaxpr_subcomp
    ans = rule(c, axis_env, extend_name_stack(name_stack, eqn.primitive.name),                               
  File "/data/miniconda3/envs/alphafold/lib/python3.8/site-packages/jax/interpreters/xla.py", line 1060, in f                                                                                                                                                                  
    outs = jaxpr_subcomp(c, jaxpr, backend, axis_env, _xla_consts(c, consts),                                                                                                                                                                                                  
  File "/data/miniconda3/envs/alphafold/lib/python3.8/site-packages/jax/interpreters/xla.py", line 487, in jaxpr_subcomp
    ans = rule(c, axis_env, extend_name_stack(name_stack, eqn.primitive.name),                                
  File "/data/miniconda3/envs/alphafold/lib/python3.8/site-packages/jax/_src/lax/control_flow.py", line 350, in _while_loop_translation_rule                                                                                                               
    new_z = xla.jaxpr_subcomp(body_c, body_jaxpr.jaxpr, backend, axis_env,                                                                                                                                                                                                     
  File "/data/miniconda3/envs/alphafold/lib/python3.8/site-packages/jax/interpreters/xla.py", line 478, in jaxpr_subcomp                                                                                                                                                       
    ans = rule(c, *in_nodes, **eqn.params)                                                                                                                                                                                                                                     
  File "/data/miniconda3/envs/alphafold/lib/python3.8/site-packages/jax/_src/lax/linalg.py", line 503, in _eigh_cpu_gpu_translation_rule             
    v, w, info = syevd_impl(c, operand, lower=lower)                                                                
  File "/data/miniconda3/envs/alphafold/lib/python3.8/site-packages/jaxlib/cusolver.py", line 281, in syevd                                                                                                                                                                    
    lwork, opaque = cusolver_kernels.build_syevj_descriptor(                                                                                                                                                                                                                   
jax._src.traceback_util.UnfilteredStackTrace: RuntimeError: cuSolver internal error                               
                                                                                                                                                                                                                                                                               

... 

This is with the usual sudo apt-get install nvidia-drivers-460 plus sudo apt-get install nvidia-cuda-toolkit method. Rebooting and sorting out the 'Secure Boot' malarkey was needed for this laptop.

EDIT: just to make sure that the smaller card wasn't the problem, we removed it from the computer and rebooted. Only the larger 4 GB card then appeared in nvidia-smi; however, the issue remained as described above when trying to run alphafold.

Any ideas what this libcusolver issue could be due to?
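For the libcusolver part specifically: the LD_LIBRARY_PATH in the log contains only Oracle client directories, so the dynamic loader has nowhere to find the CUDA 11 libraries that the +cuda111 jaxlib build opens at runtime. A quick, environment-agnostic way to ask the loader directly (ctypes.util.find_library consults roughly the same search machinery as ldconfig, so this is a diagnostic rather than an exact reproduction of jaxlib's lookup):

```python
import ctypes.util

# Returns a name like "libcusolver.so.11" if the loader can resolve the
# library, or None if it cannot -- the condition the jaxlib error reports.
hit = ctypes.util.find_library("cusolver")
print("loader sees:", hit)
```

If this prints None, a common remedy is to install cudatoolkit 11.x into the conda env and prepend its lib directory, e.g. `export LD_LIBRARY_PATH="$CONDA_PREFIX/lib:$LD_LIBRARY_PATH"`, before launching.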

-m option

Hi! Thank you for this non docker implementation!

I'm trying to generate multiple models with alphafold; it is my understanding that this requires use of the -m option.

I can get the script to work if I pass model_1 to the -m option, but if I try a comma-separated list with more than model_1 (or even just input any name other than model_1), I get the following error:

Traceback (most recent call last):
File "/apps/gb/AlphaFold/2.0.0/alphafold/run_alphafold.py", line 303, in <module>
app.run(main)
File "/apps/gb/AlphaFold/2.0.0/lib/python3.7/site-packages/absl/app.py", line 312, in run
_run_main(main, args)
File "/apps/gb/AlphaFold/2.0.0/lib/python3.7/site-packages/absl/app.py", line 258, in _run_main
sys.exit(main(argv))
File "/apps/gb/AlphaFold/2.0.0/alphafold/run_alphafold.py", line 253, in main
model_config = config.model_config(model_name)
File "/apps/gb/AlphaFold/2.0.0/alphafold/alphafold/model/config.py", line 32, in model_config
raise ValueError(f'Invalid model name {name}.')
ValueError: Invalid model name model_2.

Here's the command I'm running:
bash $ROOTALPHAFOLD/alphafold/run_alphafold.sh -d /db/AlphaFold -o ./test5/ -m model_1, model_2 -f ./fasta.fa -t 2020-05-14 -b True
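A guess at the mechanism: with a space after the comma, the shell hands `model_2` to the script as a separate token (or the stray whitespace survives into the name that reaches config.model_config, which does an exact dictionary lookup), so anything that is not exactly `model_1`…`model_5` fails. Trying `-m model_1,model_2` with no space is the first thing to test. The whitespace half of the problem can be reproduced with plain string splitting:

```python
raw = "model_1, model_2"  # value as typed, with a space after the comma

naive = raw.split(",")                         # second entry keeps the space
cleaned = [n.strip() for n in raw.split(",")]  # whitespace-safe parsing

assert naive == ["model_1", " model_2"]
assert cleaned == ["model_1", "model_2"]
```

A name like `" model_2"` would never match a config key, which is consistent with the "Invalid model name" error above.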

jax update

Updating jax via:

pip install --upgrade jax jaxlib==0.1.69+cuda111 -f https://storage.googleapis.com/jax-releases/jax_releases.html

results in jax being upgraded to 0.2.26:

Collecting contextlib2
  Using cached contextlib2-21.6.0-py2.py3-none-any.whl (13 kB)
Collecting PyYAML
  Using cached PyYAML-6.0-cp38-cp38-manylinux_2_5_x86_64.manylinux1_x86_64.manylinux_2_12_x86_64.manylinux2010_x86_64.whl (701 kB)
Installing collected packages: PyYAML, contextlib2, ml-collections, jax
  Attempting uninstall: jax
    Found existing installation: jax 0.2.26
    Uninstalling jax-0.2.26:
      Successfully uninstalled jax-0.2.26
Successfully installed PyYAML-6.0 contextlib2-21.6.0 jax-0.2.14 ml-collections-0.1.0

...which causes the following error:

ValueError: jaxlib is version 0.1.69, but this version of jax requires version 0.1.74.

If jax must stay jax==0.2.14 (or at least < 0.2.26), then it appears that one must run pip install jax==0.2.14 after running pip install --upgrade jax
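The mismatch comes from jax's import-time check of jaxlib's version; the local build suffix (everything after the `+`) is ignored in the comparison, so `0.1.69+cuda111` is treated as plain 0.1.69. A minimal sketch of that comparison with tuples (the parsing here is an illustration, not jax's actual code):

```python
def as_tuple(version: str) -> tuple:
    """Drop a local suffix like '+cuda111', then compare numerically."""
    return tuple(int(part) for part in version.split("+")[0].split("."))

installed = as_tuple("0.1.69+cuda111")
required = as_tuple("0.1.74")   # minimum jaxlib for jax 0.2.26
assert installed < required     # hence the ValueError after the upgrade
```

Pinning both packages in one command, e.g. `pip install "jax==0.2.14" "jaxlib==0.1.69+cuda111" -f https://storage.googleapis.com/jax-releases/jax_releases.html`, avoids `--upgrade` pulling in a jax release that is incompatible with the pinned jaxlib.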

Minimization failed after 100 attempts.

The driver version is 11.7


The cudatoolkit version is 11.7

I0922 20:54:25.670873 139785783146304 amber_minimize.py:407] Minimizing protein, attempt 98 of 100.
I0922 20:54:27.095493 139785783146304 amber_minimize.py:68] Restraining 5073 / 10077 particles.
I0922 20:54:27.823423 139785783146304 amber_minimize.py:417] Error compiling program: nvrtc: error: invalid value for --gpu-architecture (-arch)

I0922 20:54:27.830594 139785783146304 amber_minimize.py:407] Minimizing protein, attempt 99 of 100.
I0922 20:54:28.679339 139785783146304 amber_minimize.py:68] Restraining 5073 / 10077 particles.
I0922 20:54:29.414243 139785783146304 amber_minimize.py:417] Error compiling program: nvrtc: error: invalid value for --gpu-architecture (-arch)

I0922 20:54:29.421343 139785783146304 amber_minimize.py:407] Minimizing protein, attempt 100 of 100.
I0922 20:54:30.978424 139785783146304 amber_minimize.py:68] Restraining 5073 / 10077 particles.
I0922 20:54:31.718168 139785783146304 amber_minimize.py:417] Error compiling program: nvrtc: error: invalid value for --gpu-architecture (-arch)

Traceback (most recent call last):
File "/22t/chenhx/software/alphafold-2.2.0/run_alphafold.py", line 422, in <module>
app.run(main)
File "/22t/chenhx/miniconda3/envs/testalphafold/lib/python3.8/site-packages/absl/app.py", line 312, in run
_run_main(main, args)
File "/22t/chenhx/miniconda3/envs/testalphafold/lib/python3.8/site-packages/absl/app.py", line 258, in _run_main
sys.exit(main(argv))
File "/22t/chenhx/software/alphafold-2.2.0/run_alphafold.py", line 398, in main
predict_structure(
File "/22t/chenhx/software/alphafold-2.2.0/run_alphafold.py", line 242, in predict_structure
relaxed_pdb_str, _, _ = amber_relaxer.process(prot=unrelaxed_protein)
File "/22t/chenhx/software/alphafold-2.2.0/alphafold/relax/relax.py", line 61, in process
out = amber_minimize.run_pipeline(
File "/22t/chenhx/software/alphafold-2.2.0/alphafold/relax/amber_minimize.py", line 475, in run_pipeline
ret = _run_one_iteration(
File "/22t/chenhx/software/alphafold-2.2.0/alphafold/relax/amber_minimize.py", line 419, in _run_one_iteration
raise ValueError(f"Minimization failed after {max_attempts} attempts.")
ValueError: Minimization failed after 100 attempts.
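For context on the nvrtc message: OpenMM builds the `--gpu-architecture` value from the card's compute capability, and "invalid value for --gpu-architecture (-arch)" usually means the nvrtc shipped with the env's cudatoolkit/openmm build is too old to know that architecture (or too new for the driver). Downgrading cudatoolkit in the env, or upgrading openmm, is the usual remedy. Purely for illustration (this helper is not part of OpenMM), the flag nvrtc is rejecting has this shape:

```python
# Hypothetical helper: the -arch value passed to nvrtc is derived from the
# GPU's compute capability, e.g. 8.6 for an RTX 30-series card.
def nvrtc_arch_flag(major: int, minor: int) -> str:
    return f"--gpu-architecture=compute_{major}{minor}"

print(nvrtc_arch_flag(8, 6))
```

Checking that the env's cudatoolkit version (here 11.7) is one your openmm build was compiled against is the first thing to verify.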
