isayevlab / auto3d_pkg

Auto3D generates low-energy conformers from SMILES/SDF

License: MIT License

Python 52.87% Jupyter Notebook 47.13%

auto3d_pkg's Issues

Question about geometry optimization

Hi,

I was wondering how the opt_geometry function is supposed to work with a GPU. Even with CUDA available and the correct GPU index set, it seems to run mainly on parallelized CPU. Is this the intended behavior, or am I missing something?

ionization states

Is it possible to include the ionization states in the output file of the 3D small molecules?

Protonation states and SMD solvent inclusion

Greetings and good day,

I am wondering if auto3d can enumerate potential protonation states in the same way that tautomeric states are enumerated. I could not find anything relevant in the auto3d paper.

I am also wondering if solvent effect has been implemented yet (SMD or similar), as according to my understanding from the paper, it was an 'under development' functionality. SMD is already implemented in AIMNet, which auto3d uses by default as an optimisation engine.

Thanks,
Marawan

Optimization engine did not run and no 3D structure converged.

I am running Auto3D on a Linux workstation to generate and optimize 3D structures. The input is 500 compounds in .smi format. The parameter file I used is attached here:
parameter.txt
I have used this command to run Auto3D engine.

python3 auto3D.py parameters.yaml

After running the above command, it shows this error:
RuntimeError: CUDA out of memory. Tried to allocate 4.27 GiB. GPU 1 has a total capacty of 15.74 GiB of which 11.88 MiB is free. Including non-PyTorch memory, this process has 15.71 GiB memory in use. Of the allocated memory 11.76 GiB is allocated by PyTorch, and 3.81 GiB is reserved by PyTorch but unallocated. If reserved but unallocated memory is large try setting max_split_size_mb to avoid fragmentation. See documentation for Memory Management and PYTORCH_CUDA_ALLOC_CONF

The complete output of the command is given below:

/home/sylab02/Auto3D_pkg/auto3D.py:166: SyntaxWarning: invalid escape sequence '\ '
""")

     _              _             _____   ____  
    / \     _   _  | |_    ___   |___ /  |  _ \ 
   / _ \   | | | | | __|  / _ \    |_ \  | | | |
  / ___ \  | |_| | | |_  | (_) |  ___) | | |_| |
 /_/   \_\  \__,_|  \__|  \___/  |____/  |____/  2.2.8
    // Automatic generation of the low-energy 3D structures

/home/sylab02/Auto3D_pkg/auto3D.py:166: SyntaxWarning: invalid escape sequence '\ '
""")
/home/sylab02/Auto3D_pkg/auto3D.py:166: SyntaxWarning: invalid escape sequence '\ '
""")
Checking input file...
/home/sylab02/Auto3D_pkg/auto3D.py:166: SyntaxWarning: invalid escape sequence '\ '
""")
There are 499 SMILES in the input file /home/sylab02/Auto3D_pkg/standard_smiles_2.smi.
All SMILES and IDs are valid.
Suggestions for choosing isomer_engine and optimizing_engine:
Isomer engine options: RDKit and Omega.
Optimizing engine options: AIMNET.
The available memory is 60 GB.
The task will be divided into 1 jobs.
Job1, number of inputs: 499
/home/sylab02/Auto3D_pkg/auto3D.py:166: SyntaxWarning: invalid escape sequence '\ '
""")
/home/sylab02/Auto3D_pkg/auto3D.py:166: SyntaxWarning: invalid escape sequence '\ '
""")

Isomer generation for job1
Enumerating cis/tran isomers for unspecified double bonds...
Enumerating R/S isomers for unspecified atomic centers...
Removing enantiomers...
Stereo centers for 320 are not fully enumerated.
Stereo centers for 382 are not fully enumerated.
Enantiomers not removed for 54
Enumerating conformers/rotamers, removing duplicates...
100%|██████████████████████████████████| 13720/13720 [14:45:56<00:00, 3.87s/it]

Optimizing on job1
Preparing for parallel optimizing... (Max optimization steps: 10000)
Total 3D conformers: 7502
0%| | 0/10000 [00:00<?, ?it/s]
Process Process-5:
Traceback (most recent call last):
File "/home/sylab02/miniconda3/envs/auto3D/lib/python3.12/multiprocessing/process.py", line 314, in _bootstrap
self.run()
File "/home/sylab02/miniconda3/envs/auto3D/lib/python3.12/multiprocessing/process.py", line 108, in run
self._target(*self._args, **self._kwargs)
File "/home/sylab02/miniconda3/envs/auto3D/lib/python3.12/site-packages/Auto3D/auto3D.py", line 142, in optim_rank_wrapper
optimizer.run()
File "/home/sylab02/miniconda3/envs/auto3D/lib/python3.12/site-packages/Auto3D/batch_opt/batchopt.py", line 426, in run
optdict = ensemble_opt(ani, coord_padded, numbers_padded, charges,
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/sylab02/miniconda3/envs/auto3D/lib/python3.12/site-packages/Auto3D/batch_opt/batchopt.py", line 323, in ensemble_opt
n_steps(state, param['opt_steps'], param['opttol'], param['patience'])
File "/home/sylab02/miniconda3/envs/auto3D/lib/python3.12/site-packages/Auto3D/batch_opt/batchopt.py", line 244, in n_steps
e, f = state['nn'].forward_batched(coord, numbers,
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/sylab02/miniconda3/envs/auto3D/lib/python3.12/site-packages/Auto3D/batch_opt/batchopt.py", line 185, in forward_batched
_e, _f = self(coord[batch], numbers[batch], charges[batch])
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/sylab02/miniconda3/envs/auto3D/lib/python3.12/site-packages/torch/nn/modules/module.py", line 1518, in _wrapped_call_impl
return self._call_impl(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/sylab02/miniconda3/envs/auto3D/lib/python3.12/site-packages/torch/nn/modules/module.py", line 1527, in _call_impl
return forward_call(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/sylab02/miniconda3/envs/auto3D/lib/python3.12/site-packages/Auto3D/batch_opt/batchopt.py", line 146, in forward
d = self.ani(
^^^^^^^^^
File "/home/sylab02/miniconda3/envs/auto3D/lib/python3.12/site-packages/torch/nn/modules/module.py", line 1518, in _wrapped_call_impl
return self._call_impl(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/sylab02/miniconda3/envs/auto3D/lib/python3.12/site-packages/torch/nn/modules/module.py", line 1527, in _call_impl
return forward_call(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
RuntimeError: The following operation failed in the TorchScript interpreter.
Traceback of TorchScript, serialized code (most recent call last):
File "code/torch/ensemble/___torch_mangle_4.py", line 27, in forward
else:
pass
_out = (_0).forward(_in, )
~~~~~~~~~~~ <--- HERE
_r = annotate(Dict[str, Tensor], {})
_6 = torch.keys(_out)
File "code/torch/aimnet/modules.py", line 20, in forward
1 = torch.requires_grad(data[x])
module = self.module
data0 = (module).forward(data, )
~~~~~~~~~~~~~~~ <--- HERE
multipass_module = self.multipass_module
if multipass_module:
File "code/torch/aimnet/models/aimnet2.py", line 30, in forward
_out0 = (self)._zero_padded(data1, _out, )
data2 = (self)._update_q(data1, _out0, False, )
_3 = [(self)._prepare_in_a(data2, ), (self)._prepare_in_q(data2, )]
~~~~~~~~~~~~~~~~~~~ <--- HERE
_in0 = torch.cat(_3, -1)
_out1 = (_1).forward(_in0, )
File "code/torch/aimnet/models/aimnet2.py", line 108, in _prepare_in_a
a_i0 = a_i
conv_a = self.conv_a
avf_a = (conv_a).forward(a_j, data["gs"], data["gv"], )
~~~~~~~~~~~~~~~ <--- HERE
return torch.cat([a_i0, avf_a], -1)
def _prepare_in_q(self: torch.aimnet.models.aimnet2.AIMNet2,
File "code/torch/aimnet/aev.py", line 103, in forward
d2features0 = self.d2features
if d2features0:
avf_v0 = torch.einsum("...nmgd,...mag,agh->...nahd", [gv2, a, agh])
~~~~~~~~~~~~ <--- HERE
avf_v = avf_v0
else:

Traceback of TorchScript, original code (most recent call last):
File "/data/roman/AIMNet2Paper/models/ensemble.py", line 22, in forward
if k in self.x:
_in[k] = data[k]
_out = model(_in)
~~~~~ <--- HERE
_r = dict()
for k in out:
File "/home/roman/repo/aimnet2/aimnet/modules.py", line 252, in forward
torch.set_grad_enabled(True)
data[self.x].requires_grad(True)
data = self.module(data)
~~~~~~~~~~~ <--- HERE
if self.multipass_module:
y = data[self.y][self.ipass]
File "/home/roman/repo/aimnet2/aimnet/models/aimnet2.py", line 130, in forward
_in = self._prepare_in_a(data)
else:
_in = torch.cat([self._prepare_in_a(data), self._prepare_in_q(data)], dim=-1)
~~~~~~~~~~~~~~~~~~ <--- HERE

        _out = mlp(_in)

File "/home/roman/repo/aimnet2/aimnet/models/aimnet2.py", line 87, in _prepare_in_a
if self.d2features:
a_i = a_i.flatten(-2, -1)
avf_a = self.conv_a(a_j, data['gs'], data['gv'])
~~~~~~~~~~~ <--- HERE
_in = torch.cat([a_i, avf_a], dim=-1)
return _in
File "/home/roman/repo/aimnet2/aimnet/aev.py", line 131, in forward
agh = self.agh
if self.d2features:
avf_v = torch.einsum('...nmgd,...mag,agh->...nahd', gv, a, agh)
~~~~~~~~~~~~ <--- HERE
else:
avf_v = torch.einsum('...nmgd,...ma,agh->...nahd', gv, a, agh)
RuntimeError: CUDA out of memory. Tried to allocate 4.27 GiB. GPU 1 has a total capacty of 15.74 GiB of which 11.88 MiB is free. Including non-PyTorch memory, this process has 15.71 GiB memory in use. Of the allocated memory 11.76 GiB is allocated by PyTorch, and 3.81 GiB is reserved by PyTorch but unallocated. If reserved but unallocated memory is large try setting max_split_size_mb to avoid fragmentation. See documentation for Memory Management and PYTORCH_CUDA_ALLOC_CONF

The optimization engine did not run, or no 3D structure converged.
The reason might be one of the following:
1. Allocated memory is not enough;
2. The input SMILES encodes invalid chemical structures;
3. Patience is too small
Process Process-3:

System information:

Operating System: Linux ubuntu20.04 64-bit.
Auto3D version: 2.2.8
Python version: 3.12.1
RDKit version: 2023.09.5
PyTorch version: 2.1.2
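
The error message above points at `PYTORCH_CUDA_ALLOC_CONF` and `max_split_size_mb`. A minimal sketch of applying that hint from Python follows, assuming the variable is set before `torch` is first imported in the process. The helper name `configure_cuda_allocator` and the value of 128 MB are illustrative assumptions, not settings recommended by the Auto3D authors. Whether this resolves the failure is untested; if the batch itself is too large for a 16 GB card, fragmentation tuning alone is unlikely to help.

```python
import os


def configure_cuda_allocator(max_split_size_mb=128):
    """Set PyTorch's allocator hint unless one is already defined.

    The 128 MB default is an arbitrary illustration, not a tuned value.
    """
    value = f"max_split_size_mb:{max_split_size_mb}"
    # setdefault keeps any value the user already exported in the shell.
    return os.environ.setdefault("PYTORCH_CUDA_ALLOC_CONF", value)


# Call this before the first `import torch` so the allocator can pick it up.
setting = configure_cuda_allocator()
print(setting)
```

Child processes started afterwards inherit the environment, so the setting should also reach worker processes that are launched later.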

Header in .smi file causes a type error - add a check for header?

To create a .smi file, I used RDKit's PandasTools module, which was the first search result for "rdkit create smi file". SaveSMILESFromFrame does exactly what its name says, but it includes a header row with 'SMILES' and 'ID'.

When I subsequently tried to run Auto3D on the .smi file, I got the following error:

Checking input file...
        There are 1652 SMILES in the input file data/to_conform.smi. 
        All SMILES and IDs are valid.
[12:37:05] SMILES Parse Error: syntax error while parsing: SMILES
[12:37:05] SMILES Parse Error: Failed parsing SMILES 'SMILES' for input: 'SMILES'

Deleting the header row led to a successful run of Auto3D.

However, if I try to create a pipeline where I first load a CSV of SMILES and other data into a dataframe, then create a temporary .smi file to use with Auto3D, I'm going to run into this problem every time.

It does not appear that RDKit has an option to not include the header. Since that is also the first search result, I suspect several users may try to use the same approach I did.

So I have two questions:

Is there some other preferred way to generate the .smi file that avoids this issue? I suspect the developers did not have this issue, otherwise there would be a header row check.
As a solution, does it make sense to add a header row check from the point in the link below?

Auto3D_pkg/src/Auto3D/utils.py

Line 93 in f463e4f

for line in data:
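
The header check proposed above could look roughly like the sketch below. It is not Auto3D's code: the function name `strip_smi_header` is hypothetical, and it assumes a header is recognizable because the first column of the first non-empty line is the literal word "SMILES".

```python
def strip_smi_header(lines):
    """Return .smi records with blank lines and a leading header row removed."""
    records = []
    seen_first = False
    for line in lines:
        parts = line.split()
        if not parts:
            continue  # skip blank lines
        if not seen_first:
            seen_first = True
            # Assumption: a header row starts with the literal column name "SMILES".
            if parts[0].upper() == "SMILES":
                continue
        records.append(line.rstrip("\n"))
    return records


sample = ["SMILES ID\n", "CCO ethanol\n", "c1ccccc1 benzene\n"]
print(strip_smi_header(sample))
```

The same function can be used as a preprocessing step in a pipeline: read the file written by PandasTools, filter it, and write the cleaned records to the temporary .smi file passed to Auto3D.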

Apply the get_most_stable_tautomer function to pandas dataframe (from csv)

Hi,

I am wondering if there is a straightforward way to use the get_most_stable_tautomer function as a standalone rdkit function that can be applied to a SMILES column in a pandas dataframe, for example:

df['most_stable_taut'] = df['smiles'].apply(get_most_stable_tautomer)

It seems the current use of the function requires a strict input file format (SMILES / SDF), and I cannot find a typical use case that satisfies what I am looking for.

Any clue?

Thanks,
M

What is the unit of the AIMNet predicted energies?

Describe the bug
I am trying to use AIMNet to optimize the structures of generated conformers for some (potentially charged) molecules. Optimization works fine, but the energy values seem off. For example, when I use calc_spe from SPE.py to calculate the energy of a small molecule, the energy is computed in eV and saved in the output SDF file in Hartree. But the eV values are around -10000, which makes the Hartree value around -370.
Is there a problem with my setup (I am using the aimnet2nqed_pc14iall_b97m_sae.jpt model), or does AIMNet output a different unit?
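
The arithmetic in the question can be checked with the standard conversion factors (1 Hartree = 27.211386 eV = 627.5095 kcal/mol). The sketch below is not taken from Auto3D; the function names are hypothetical. It assumes the reported values are total energies, which are typically hundreds of Hartree in magnitude for a drug-like molecule, so only differences between structures are expected to be on the kcal/mol scale.

```python
HARTREE_TO_EV = 27.211386245988
HARTREE_TO_KCAL_PER_MOL = 627.5094740631


def ev_to_hartree(energy_ev):
    """Convert an energy from eV to Hartree."""
    return energy_ev / HARTREE_TO_EV


def relative_energies_kcal(energies_hartree):
    """Energies relative to the lowest one, in kcal/mol."""
    lowest = min(energies_hartree)
    return [(e - lowest) * HARTREE_TO_KCAL_PER_MOL for e in energies_hartree]


# -10000 eV is indeed roughly -370 Hartree, as observed in the issue.
print(round(ev_to_hartree(-10000.0), 2))  # -367.49

# Two made-up total energies that differ by 0.003 Hartree (about 1.9 kcal/mol).
print(relative_energies_kcal([-367.493, -367.490]))
```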

To Reproduce
Steps to reproduce the behavior:
Load the given sdf (rename the file because I was unable to upload a .sdf) to_opt_AIMNET_E.txt
Run it through the calc_spe function in SPE.py

Expected behavior
I expect to see a reasonable energy in eV (e.g. close to 0, since I estimated my molecule to have an energy around -10 kcal/mol).

System information:

Operating System: Ubuntu 20.04
Auto3D version: 2.0
Python version: 3.11.3
RDKit version: 2022.09.5
Pytorch version: 2.0

Question: why do you amend the SMILES of the diastereomers generated by RDKit?

Hello,

I have a question: inside the class rd_isomer in isomer_engine.py, you go to great lengths to amend the diastereomers' SMILES using the amend_configuration_w function. Does RDKit have trouble generating all possible diastereomers?

Best,
Etienne Reboul

aligner = pybel.ob.OBAlign() AttributeError: module 'openbabel.openbabel' has no attribute 'OBAlig

Dear author

When I run the program, the following error occurs. How can I solve this problem?

aligner = pybel.ob.OBAlign()
AttributeError: module 'openbabel.openbabel' has no attribute 'OBAlig

Thanks for your help

Maximum SMILES length?

Hey, thanks for the library! I was hoping to find out if there’s a maximum SMILES string length that’s accepted by the model. Or any observed length after which the quality of predictions degrades.

Thank you!

Error when running tutorial

Hi! I cannot get a result file when I run tutorial.ipynb; it fails with the following error. Can you help me? Thank you!
Isomer generation for job1
Enumerating cis/tran isomers for unspecified double bonds...
Enumerating R/S isomers for unspecified atomic centers...
Removing enantiomers...
Enumerating conformers/rotamers, removing duplicates...
100%|███████████████████████████████████████████████████████████████████████████████████| 4/4 [00:00<00:00, 110.89it/s]
Process Process-4:
Traceback (most recent call last):
File "C:\Users\yuanrong.fan\AppData\Local\miniconda3\envs\auto3D\lib\multiprocessing\process.py", line 315, in _bootstrap
self.run()
File "C:\Users\yuanrong.fan\AppData\Local\miniconda3\envs\auto3D\lib\multiprocessing\process.py", line 108, in run
self._target(*self._args, **self._kwargs)
File "C:\Users\yuanrong.fan\AppData\Local\miniconda3\envs\auto3D\lib\site-packages\Auto3D\auto3D.py", line 114, in isomer_wraper
engine.run()
File "C:\Users\yuanrong.fan\AppData\Local\miniconda3\envs\auto3D\lib\site-packages\Auto3D\isomer_engine.py", line 239, in run
self.combine_SDF(self.rdk_tmp, self.enumerated_sdf)
File "C:\Users\yuanrong.fan\AppData\Local\miniconda3\envs\auto3D\lib\site-packages\Auto3D\isomer_engine.py", line 192, in combine_SDF
mols = pybel.readfile('sdf', file)
File "C:\Users\yuanrong.fan\AppData\Local\miniconda3\envs\auto3D\lib\openbabel\pybel.py", line 159, in readfile
raise ValueError("%s is not a recognised Open Babel format" % format)
ValueError: sdf is not a recognised Open Babel format

NameError: name 'ANI2xt' is not defined

When I try to run the tautomers example notebook, I get the exception below. Is there a workaround?

Traceback (most recent call last):
  File "/home/pwalters/anaconda3/envs/rdkit_2023_03/lib/python3.10/multiprocessing/process.py", line 314, in _bootstrap
    self.run()
  File "/home/pwalters/anaconda3/envs/rdkit_2023_03/lib/python3.10/multiprocessing/process.py", line 108, in run
    self._target(*self._args, **self._kwargs)
  File "/home/pwalters/anaconda3/envs/rdkit_2023_03/lib/python3.10/site-packages/Auto3D/auto3D.py", line 140, in optim_rank_wrapper
    optimizer = optimizing(enumerated_sdf, optimized_og,
  File "/home/pwalters/anaconda3/envs/rdkit_2023_03/lib/python3.10/site-packages/Auto3D/batch_opt/batchopt.py", line 399, in __init__
    self.ani = ANI2xt(device)
NameError: name 'ANI2xt' is not defined

I don't see a difference in energy for tautomers

I don't see a difference in energy for tautomers. I use this SMILES for sildenafil as input
CCCC1=NN(C2=C1N=C(NC2=O)C3=C(C=CC(=C3)S(=O)(=O)N4CCN(CC4)C)OCC)C sildenafil

auto3D.py correctly generated 3 tautomers
CCCc1c2c(c(=O)[nH]c(n2)c3cc(ccc3OCC)S(=O)(=O)N4CCN(CC4)C)n(n1)C sildenafil@taut1
CCCc1c2c(c(=O)nc([nH]2)c3cc(ccc3OCC)S(=O)(=O)N4CCN(CC4)C)n(n1)C sildenafil@taut2
CCCc1c2c(c(nc(n2)c3cc(ccc3OCC)S(=O)(=O)N4CCN(CC4)C)O)n(n1)C sildenafil@taut3

However, the tautomers have the same energy and "E_rel(kcal/mol)" is 0 for all three tautomers. I don't think this is correct. If we look at the literature sildenafil@taut1 and sildenafil@taut2 are predominant.
https://link.springer.com/article/10.1007/s11224-021-01818-7

I've run several other examples and the tautomers are always equienergetic. Am I missing something? The command line I used was
python ~/software/Auto3D_pkg/auto3D.py sildenafil.smi --k=1 --enumerate_tautomer=True

How to generate multiple conformers?

Describe the bug
I have tried setting --k 25 and --max_confs 25, and I still get only a single conformer per SMILES.

To Reproduce
Steps to reproduce the behavior:
~/git/Auto3D_pkg/auto3D.py rdrefined.smi --k 25 --max_confs

rdrefined.smi contains all the SMILES from the PDBbind2020 refined set.
rdrefined.smi.gz

Expected behavior
More than one conformer per SMILES.

System information:

Operating System: Ubuntu 22.04
Auto3D version: b76f112
Python version: 3.10.6
RDKit version: 2022.09.4
PyTorch version: 1.13.0+cu117


single_point_energy.ipynb: No module named 'ase'

Describe the bug
When trying the single_point_energy.ipynb, module not found error (No module named 'ase') was encountered at the line below:
from Auto3D.SPE import calc_spe.

To Reproduce
Steps to reproduce the behavior:
run the single_point_energy.ipynb

System information:

Operating System:
Auto3D version: 2.2.9
Python version:3.9.7
RDKit version:2023.09.4
PyTorch version:2.1.0
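
A quick way to confirm which optional modules are missing before running the notebook is sketched below. This is a generic check, not part of Auto3D; the helper name `missing_modules` is hypothetical, and `ase` is listed only because it is the module named in the error above.

```python
import importlib.util


def missing_modules(names):
    """Return the module names that cannot be found in the current environment."""
    return [name for name in names if importlib.util.find_spec(name) is None]


# An empty list means every listed module is importable.
print(missing_modules(["ase"]))
```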

SMD solvation energy

Thank you for your open-source code. I have a question about AIMNet model. Are SMD solvation energies included in the model?

Import StandaloneRepulsionCalculator Error

Hi,

Thanks for your excellent work and sharing the source code!

When the script imports "StandaloneRepulsionCalculator" function from torchani, the error occurred.

I don't find the function in torchani repo, could you tell me which version of torchani should be installed?

Many thanks for your help!

Traceback (most recent call last): File "get_ani_2xt.py", line 12, in <module> from Auto3D.batch_opt.ANI2xt import ANI2xt File "/home/xlpan/solv_env_rdkit/lib/python3.6/site-packages/Auto3D/batch_opt/ANI2xt.py", line 6, in <module> from torchani.repulsion import StandaloneRepulsionCalculator

Question: Can you generate just the tautomer and stereoisomer SMILES (no geom opt)?

As the title suggests, I am interested in using auto3d to just print out the SMILES of the missing isomers for a SMILES string. Is that possible? It would make a great tutorial notebook!

Thanks.

issues when running from the CLI if username contains '.' (dot)

Describe the bug
Running auto3d from the CLI throws an error if the username has a dot.
To Reproduce
Steps to reproduce the behavior:

create a new user test.user
run python auto3d.py "example/files/smiles.smi" --k=1 --optimizing_engine="ANI2x"
See error: PermissionError: [Errno 13] Permission denied: '/Users/test_reduced.smi'

Expected behavior
Auto3D runs and generates the lowest-energy conformer.

System information:

Operating System: MacOS
Auto3D version: latest
Python version:
RDKit version:
PyTorch version:

Additional context
The problem comes from how the smiles_reduced variable is defined (line 50 of auto3d.py):
smiles_reduced = smiles_enumerated.split('.')[0] + '_reduced.smi'
The split operates on the entire path, so it keeps only the text before the first dot (here, the path up to the first part of the username) and appends _reduced.smi. You could solve the issue by splitting on 'enumerated.' instead of '.'. But even if this line is fixed, line 45 suffers from the same problem (it gives the following error: 'OSError: File error: Bad output file /Users/test0.sdf').
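
An alternative to splitting on 'enumerated.' is to strip only the file extension with `os.path.splitext`, which ignores dots in directory names. The sketch below is a suggestion, not the fix adopted by the project; the helper name `reduced_smi_path` and the example path are made up for illustration.

```python
import os


def reduced_smi_path(smiles_enumerated):
    """Build the '_reduced.smi' path by removing only the final extension."""
    root, _ext = os.path.splitext(smiles_enumerated)
    return root + "_reduced.smi"


path = "/Users/test.user/job/smiles_enumerated.smi"

# Current behavior: everything after the first dot in the path is lost.
print(path.split(".")[0] + "_reduced.smi")  # /Users/test_reduced.smi

# With splitext, the dot in the username is left alone.
print(reduced_smi_path(path))  # /Users/test.user/job/smiles_enumerated_reduced.smi
```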

Code blocks but doesn't use GPUs

Hi there!

I am running Auto3D on 200 SMILES with the --use_gpu flag set to True, and I found that it occupies all the available GPUs on the machine but runs calculations on only one of them:

For the run:

python Auto3D_pkg/auto3D.py tests/input_corrected.smi --k=5 --enumerate_tautomer false --enumerate_isomer false --capacity 1

There is the following output in the log:

The available memory is 32 GB.
The task will be divided into 7 jobs.
Job1, number of inputs: 30
Job2, number of inputs: 30
Job3, number of inputs: 29
Job4, number of inputs: 29
Job5, number of inputs: 29
Job6, number of inputs: 29
Job7, number of inputs: 29

And the nvidia-smi output (see processes cocoa_env/bin/python):

So, it only uses one GPU but blocks 5. Is this intended behavior? Is the code able to parallelize over multiple GPUs? I tried tuning the --capacity option, but the result seems to be the same.
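
One generic way to keep a process from touching GPUs it does not use is the `CUDA_VISIBLE_DEVICES` environment variable, which must be set before any CUDA library is loaded. The sketch below shows that, plus a hypothetical round-robin assignment of jobs to GPU indices. Neither helper is part of Auto3D, and the issue does not confirm whether Auto3D can spread jobs over several GPUs.

```python
import os


def restrict_visible_gpus(gpu_indices):
    """Expose only the listed GPUs to CUDA libraries loaded afterwards."""
    value = ",".join(str(i) for i in gpu_indices)
    os.environ["CUDA_VISIBLE_DEVICES"] = value
    return value


def assign_jobs_round_robin(n_jobs, gpu_indices):
    """Hypothetical scheduling: map job numbers to GPU indices in turn."""
    return {job: gpu_indices[job % len(gpu_indices)] for job in range(n_jobs)}


# Hide every GPU except index 0 from this process and its children.
print(restrict_visible_gpus([0]))

# Seven jobs, as in the log above, spread over two GPUs.
print(assign_jobs_round_robin(7, [0, 1]))
```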

Dataset

I am writing to express my appreciation for your open-source project. It is a valuable tool for the field of drug discovery. I am interested in retraining a model using the dataset you generated in your paper. Could you please advise me on how to access the dataset for download? Thank you for your time and consideration.

Typo in output

Hi there. Little typo in the output print-out: "select"

Optimization finished at step 2343:   Total 3D structures: 2290  Converged: 2245   Dropped(Oscillating): 45    Active: 0
Begin to slelect structures that satisfy the requirements...
Energy unit: Hartree if implicit.

Git tag

Could you add git tags corresponding to the uploaded package on pypi so Auto3D can be packaged on conda-forge?

Also uploading a source package (sdist) would be helpful for the packaging.

Thank you

Range Error

I am trying to generate low-energy conformations for a set of SMILES, which I provide as a batch file. However, for some molecules the error below appears and the code gets stuck: it halts and does not proceed to the next molecule.

Range Error
idx2
Violation occurred on line 352 in file /home/conda/feedstock_root/build_artifacts/rdkit-meta_1722095823005/work/Code/GraphMol/ROMol.cpp
Failed Expression: 4294967295 < 10

Stacktrace:
0# Invar::Invariant::toStringabi:cxx11 const in /media/erol/Backup/anaconda3/envs/auto3D/lib/python3.12/site-packages/rdkit/../../../libRDKitRDGeneral.so.1
1# Invar::operator<<(std::ostream&, Invar::Invariant const&) in /media/erol/Backup/anaconda3/envs/auto3D/lib/python3.12/site-packages/rdkit/../../../libRDKitRDGeneral.so.1
2# RDKit::ROMol::getBondBetweenAtoms(unsigned int, unsigned int) const in /media/erol/Backup/anaconda3/envs/auto3D/lib/python3.12/site-packages/rdkit/Chem/../../../../libRDKitGraphMol.so.1
3# RDKit::Bond::setStereoAtoms(unsigned int, unsigned int) in /media/erol/Backup/anaconda3/envs/auto3D/lib/python3.12/site-packages/rdkit/Chem/../../../../libRDKitGraphMol.so.1
4# 0x000075332BB511BA in /media/erol/Backup/anaconda3/envs/auto3D/lib/python3.12/site-packages/rdkit/Chem/rdchem.so
5# boost::python::objects::function::call(_object, _object) const in /media/erol/Backup/anaconda3/envs/auto3D/lib/python3.12/site-packages/rdkit/../../../libboost_python312.so.1.84.0
6# 0x000075332D3A77E9 in /media/erol/Backup/anaconda3/envs/auto3D/lib/python3.12/site-packages/rdkit/../../../libboost_python312.so.1.84.0
7# boost::python::detail::exception_handler::operator()(boost::function0 const&) const in /media/erol/Backup/anaconda3/envs/auto3D/lib/python3.12/site-packages/rdkit/../../../libboost_python312.so.1.84.0
8# 0x000075332814CDB4 in /media/erol/Backup/anaconda3/envs/auto3D/lib/python3.12/site-packages/rdkit/Chem/rdSLNParse.so
9# 0x00007533282FD9F4 in /media/erol/Backup/anaconda3/envs/auto3D/lib/python3.12/site-packages/rdkit/Chem/rdDepictor.so
10# 0x00007533285B03C4 in /media/erol/Backup/anaconda3/envs/auto3D/lib/python3.12/site-packages/rdkit/Chem/rdChemReactions.so
11# 0x00007533285B0414 in /media/erol/Backup/anaconda3/envs/auto3D/lib/python3.12/site-packages/rdkit/Chem/rdChemReactions.so
12# 0x0000753328729474 in /media/erol/Backup/anaconda3/envs/auto3D/lib/python3.12/site-packages/rdkit/Chem/rdMolChemicalFeatures.so
13# 0x000075332B522744 in /media/erol/Backup/anaconda3/envs/auto3D/lib/python3.12/site-packages/rdkit/Chem/rdmolfiles.so
14# 0x000075332B522794 in /media/erol/Backup/anaconda3/envs/auto3D/lib/python3.12/site-packages/rdkit/Chem/rdmolfiles.so
15# 0x00007533C975FBE4 in /media/erol/Backup/anaconda3/envs/auto3D/lib/python3.12/site-packages/rdkit/Chem/rdCIPLabeler.so
16# 0x000075332BB73194 in /media/erol/Backup/anaconda3/envs/auto3D/lib/python3.12/site-packages/rdkit/Chem/rdchem.so
17# 0x000075332BB0850D in /media/erol/Backup/anaconda3/envs/auto3D/lib/python3.12/site-packages/rdkit/Chem/rdchem.so
18# 0x000075332BB081BD in /media/erol/Backup/anaconda3/envs/auto3D/lib/python3.12/site-packages/rdkit/Chem/rdchem.so
19# 0x000075332BB07E6D in /media/erol/Backup/anaconda3/envs/auto3D/lib/python3.12/site-packages/rdkit/Chem/rdchem.so
20# 0x000075332BB07B1D in /media/erol/Backup/anaconda3/envs/auto3D/lib/python3.12/site-packages/rdkit/Chem/rdchem.so
21# 0x000075332BB077CD in /media/erol/Backup/anaconda3/envs/auto3D/lib/python3.12/site-packages/rdkit/Chem/rdchem.so
22# 0x000075332C890714 in /media/erol/Backup/anaconda3/envs/auto3D/lib/python3.12/site-packages/rdkit/rdBase.so
23# 0x000075332C890764 in /media/erol/Backup/anaconda3/envs/auto3D/lib/python3.12/site-packages/rdkit/rdBase.so
24# 0x000075332C8907B4 in /media/erol/Backup/anaconda3/envs/auto3D/lib/python3.12/site-packages/rdkit/rdBase.so
25# 0x000075332C890804 in /media/erol/Backup/anaconda3/envs/auto3D/lib/python3.12/site-packages/rdkit/rdBase.so
26# boost::python::handle_exception_impl(boost::function0) in /media/erol/Backup/anaconda3/envs/auto3D/lib/python3.12/site-packages/rdkit/../../../libboost_python312.so.1.84.0
27# 0x000075332D3A4493 in /media/erol/Backup/anaconda3/envs/auto3D/lib/python3.12/site-packages/rdkit/../../../libboost_python312.so.1.84.0
28# _PyObject_MakeTpCall in /media/erol/Backup/anaconda3/envs/auto3D/bin/python3.12
29# 0x00005E85781E7F20 in /media/erol/Backup/anaconda3/envs/auto3D/bin/python3.12
30# 0x00005E857832BF3C in /media/erol/Backup/anaconda3/envs/auto3D/bin/python3.12
31# PySequence_Tuple in /media/erol/Backup/anaconda3/envs/auto3D/bin/python3.12
32# PyObject_Vectorcall in /media/erol/Backup/anaconda3/envs/auto3D/bin/python3.12
33# 0x00005E85781E7F20 in /media/erol/Backup/anaconda3/envs/auto3D/bin/python3.12
34# PyEval_EvalCode in /media/erol/Backup/anaconda3/envs/auto3D/bin/python3.12
35# 0x00005E85783BF7BA in /media/erol/Backup/anaconda3/envs/auto3D/bin/python3.12
36# 0x00005E85783BAA9B in /media/erol/Backup/anaconda3/envs/auto3D/bin/python3.12
37# PyRun_StringFlags in /media/erol/Backup/anaconda3/envs/auto3D/bin/python3.12
38# PyRun_SimpleStringFlags in /media/erol/Backup/anaconda3/envs/auto3D/bin/python3.12
39# Py_RunMain in /media/erol/Backup/anaconda3/envs/auto3D/bin/python3.12
40# Py_BytesMain in /media/erol/Backup/anaconda3/envs/auto3D/bin/python3.12
41# 0x000075345A229D90 in /lib/x86_64-linux-gnu/libc.so.6
42# __libc_start_main in /lib/x86_64-linux-gnu/libc.so.6
43# 0x00005E8578383E21 in /media/erol/Backup/anaconda3/envs/auto3D/bin/python3.12

Process Process-4:
Traceback (most recent call last):
File "/media/erol/Backup/anaconda3/envs/auto3D/lib/python3.12/multiprocessing/process.py", line 314, in _bootstrap
self.run()
File "/media/erol/Backup/anaconda3/envs/auto3D/lib/python3.12/multiprocessing/process.py", line 108, in run
self._target(*self._args, **self._kwargs)
File "/media/erol/Backup/anaconda3/envs/auto3D/lib/python3.12/site-packages/Auto3D/auto3D.py", line 94, in isomer_wraper
engine.run()
File "/media/erol/Backup/anaconda3/envs/auto3D/lib/python3.12/site-packages/Auto3D/isomer_engine.py", line 188, in run
isomers = self.enumerate_func(mol)
^^^^^^^^^^^^^^^^^^^^^^^^
File "/media/erol/Backup/anaconda3/envs/auto3D/lib/python3.12/site-packages/Auto3D/isomer_engine.py", line 145, in enumerate_func
isomers = tuple(EnumerateStereoisomers(mol, options=opts))
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/media/erol/Backup/anaconda3/envs/auto3D/lib/python3.12/site-packages/rdkit/Chem/EnumerateStereoisomers.py", line 303, in EnumerateStereoisomers
flippers = _getFlippers(tm, options)
^^^^^^^^^^^^^^^^^^^^^^^^^
File "/media/erol/Backup/anaconda3/envs/auto3D/lib/python3.12/site-packages/rdkit/Chem/EnumerateStereoisomers.py", line 94, in _getFlippers
bnd.SetStereoAtoms(si.controllingAtoms[0], si.controllingAtoms[2])
RuntimeError: Range Error
idx2
Violation occurred on line 352 in file Code/GraphMol/ROMol.cpp
Failed Expression: 4294967295 < 10
RDKIT: 2024.03.5
BOOST: 1_84

Is there any possible explanation for what is wrong, and how can I overcome this error?

System information:

Operating System: Ubuntu 22.04
Auto3D version: 2.2.11
Python version: 3.12
RDKit version: 2024.03.5
PyTorch version: 2.3.1
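
Until the underlying RDKit error is understood, one workaround is to pre-screen the batch so that a failing molecule is recorded and skipped rather than stopping the run. The sketch below is generic and not part of Auto3D: `enumerate_safely` is a hypothetical helper, and `fake_enumerator` is a stand-in for the real enumeration call (the traceback above shows the failure surfacing as a `RuntimeError` from `EnumerateStereoisomers`).

```python
def enumerate_safely(items, enumerate_fn):
    """Apply enumerate_fn to each (name, item) pair, collecting failures."""
    results, failures = {}, {}
    for name, item in items:
        try:
            results[name] = list(enumerate_fn(item))
        except RuntimeError as exc:
            # Record the problem and keep going with the next molecule.
            failures[name] = str(exc)
    return results, failures


def fake_enumerator(smiles):
    """Stand-in for a real enumerator; fails on inputs marked 'bad'."""
    if "bad" in smiles:
        raise RuntimeError("Range Error")
    return [smiles]


batch = [("mol1", "CCO"), ("mol2", "bad-input"), ("mol3", "c1ccccc1")]
ok, failed = enumerate_safely(batch, fake_enumerator)
print(sorted(ok), failed)
```

The names collected in `failed` can then be removed from the input file before the full run.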

ValueError in hash_taut_smi when read molecules from SDF

Describe the bug
Running the code with sdf file as input and --enumerate_tautomer = True gives an error:

Traceback (most recent call last):
  File "/anaconda3/envs/cocoa_env/lib/python3.10/multiprocessing/process.py", line 314, in _bootstrap
    self.run()
  File "/anaconda3/envs/cocoa_env/lib/python3.10/multiprocessing/process.py", line 108, in run
    self._target(*self._args, **self._kwargs)
  File "/dev/Cocoa/Auto3D/Auto3D_pkg/src/Auto3D/auto3D.py", line 95, in isomer_wraper
    hash_taut_smi(output_taut, output_taut)
  File "/dev/Cocoa/Auto3D/Auto3D_pkg/src/Auto3D/utils.py", line 380, in hash_taut_smi
    smiles, id = line.strip().split()
ValueError: too many values to unpack (expected 2)

The calculation then freezes.
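
The `ValueError` is raised because `line.strip().split()` produced more than two fields. One plausible cause, which the report does not confirm, is a molecule name containing whitespace. A minimal sketch of a more tolerant parser follows; `parse_smi_line` is a hypothetical helper, not the function in utils.py.

```python
def parse_smi_line(line):
    """Split a .smi line into (smiles, name), allowing spaces in the name."""
    parts = line.strip().split(maxsplit=1)
    if len(parts) != 2:
        raise ValueError(f"expected 'SMILES name', got: {line!r}")
    smiles, name = parts
    return smiles, name


print(parse_smi_line("CCO ethanol tautomer 1\n"))  # ('CCO', 'ethanol tautomer 1')
```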

To Reproduce
Steps to reproduce the behavior:
python ../Auto3D_pkg/auto3D.py stereo_test/tautomers_test.sdf --k 5 --enumerate_tautomer True --tauto_engine rdkit

Expected behavior
Generates tautomers if structure is ambiguous.

System information:

Operating System: Centos
Auto3D version: 2.1.0
Python version: 3.10.13
RDKit version: 2023.03.3
PyTorch version: 2.0.0

Error on tautomer example

Describe the bug
The tautomer example gets an error when running on a clean install. This error seems to be originating from RDKit.

To Reproduce
Steps to reproduce the behavior:

Follow the installation instructions
Install and start jupyterlab
Run the cells of the tautomers examples: /examples/tautomer.ipynb
See error

Expected behavior
The example should run without errors.

Error messages

/examples/tautomer.ipynb

---------------------------------------------------------------------------
AttributeError                            Traceback (most recent call last)
/var/folders/c1/ly1p3sf14rnc9fpcrzq46q440000gp/T/ipykernel_1014/1582100770.py in <module>
      4                    optimizing_engine="ANI2xt",  #ANI2xt is NNP designed for tautomers
      5                    max_confs=10, patience=200, use_gpu=False)
----> 6     tautomer_out = get_stable_tautomers(args, tauto_k=3)

~/miniconda3/envs/autoed-2/lib/python3.7/site-packages/Auto3D/tautomer.py in get_stable_tautomers(args, tauto_k, tauto_window)
     81     """
     82     out = main(args)
---> 83     out_tautomer = select_tautomers(out, tauto_k, tauto_window)
     84     return out_tautomer

~/miniconda3/envs/autoed-2/lib/python3.7/site-packages/Auto3D/tautomer.py in select_tautomers(sdf, k, window)
     63     basename = os.path.basename(sdf).split(".")[0].strip() + "_top_tautomers.sdf"
     64     output_path = os.path.join(folder, basename)
---> 65     with Chem.SDWriter(output_path) as w:
     66         for mol in results:
     67             w.write(mol)

AttributeError: __enter__

System information:

Operating System: macOS Catalina 10.15.7 (19H2026)
Auto3D version: 2.0
Python version: 3.7.16
RDKit version: 2020.09.1.0
PyTorch version: 1.13.1
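
`AttributeError: __enter__` means the object passed to `with` does not implement the context-manager protocol. As far as I know, RDKit writers only gained `with` support in releases newer than the 2020.09 one listed above, so a version mismatch is a plausible cause; treat that as an assumption. A version-independent workaround is to wrap any object that has a `close()` method in `contextlib.closing`. The sketch below uses a stand-in `LegacyWriter` class (hypothetical) so that it runs without RDKit; with RDKit one would pass `Chem.SDWriter(output_path)` instead.

```python
from contextlib import closing


class LegacyWriter:
    """Stand-in for a writer that has write()/close() but no __enter__/__exit__."""

    def __init__(self):
        self.records = []
        self.closed = False

    def write(self, record):
        self.records.append(record)

    def close(self):
        self.closed = True


def write_all(writer, records):
    """Write every record and close the writer, even if a write fails."""
    # closing() supplies the context-manager protocol and calls writer.close().
    with closing(writer) as w:
        for record in records:
            w.write(record)
    return writer


writer = write_all(LegacyWriter(), ["mol_a", "mol_b"])
print(writer.records, writer.closed)
```

The same pattern works on newer RDKit releases too, so it does not need a version check.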

Add entry point for CLI in auto3D.py

Hi,

I think the current CLI usage could be improved by using Python package entry points.

Usage would change from:

cd <replace with your path_folder_with_auto3D.py>
python auto3D.py "example/files/smiles.smi" --k=1

to:

auto3D my-smiles-not-from-code.smi --k=1

which is much more convenient when installing from PyPI rather than cloning the source.

I can make a PR for it, just gauging your interest first.
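
As a rough illustration of the proposal, a console-script entry point needs a callable that parses `sys.argv` itself. The sketch below is an assumption-laden example, not Auto3D's code: the function name `cli`, the module path in the docstring, and the reduced two-option parser are all hypothetical.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Tiny stand-in parser with only the two options used in this issue."""
    parser = argparse.ArgumentParser(prog="auto3D")
    parser.add_argument("path", help="input .smi or .sdf file")
    parser.add_argument("--k", type=int, default=None,
                        help="number of top structures to output per molecule")
    return parser


def cli(argv=None) -> argparse.Namespace:
    """Hypothetical console-script target.

    It could be registered in pyproject.toml as, for example:
        [project.scripts]
        auto3D = "Auto3D.auto3D:cli"
    (the module path and function name are assumptions).
    """
    # argv=None makes argparse fall back to sys.argv[1:].
    args = build_parser().parse_args(argv)
    # A real entry point would hand `args` to the Auto3D workflow here.
    return args
```

Accepting an optional `argv` keeps the function testable: `cli(["my-smiles-not-from-code.smi", "--k=1"])` parses the same arguments the installed `auto3D` command would receive.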

Auto3D on Google Colab

I tried to run Auto3D on Colab (with GPU), but it returned nothing and I couldn't find where the problem was.
It produces a log file that reads:

         _              _             _____   ____  
        / \     _   _  | |_    ___   |___ /  |  _ \ 
       / _ \   | | | | | __|  / _ \    |_ \  | | | |
      / ___ \  | |_| | | |_  | (_) |  ___) | | |_| |
     /_/   \_\  \__,_|  \__|  \___/  |____/  |____/  2.1.0
            // Automatic generation of the low-energy 3D structures

================================================================================
INPUT PARAMETERS

path: ligands/smiles.smi
k: 1
window: False
verbose: False
job_name: 20230917-083135-620261
enumerate_tautomer: False
tauto_engine: rdkit
pKaNorm: True
isomer_engine: rdkit
enumerate_isomer: True
mode_oe: classic
mpi_np: 4
max_confs: None
use_gpu: True
capacity: 42
gpu_idx: 0
optimizing_engine: AIMNET
patience: 1000
opt_steps: 5000
convergence_threshold: 0.003
threshold: 0.3
memory: None
batchsize_atoms: 1024
input_format: smi

                           RUNNING PROCESS

================================================================================
Checking input file...
         _              _             _____   ____  
        / \     _   _  | |_    ___   |___ /  |  _ \ 
       / _ \   | | | | | __|  / _ \    |_ \  | | | |
      / ___ \  | |_| | | |_  | (_) |  ___) | | |_| |
     /_/   \_\  \__,_|  \__|  \___/  |____/  |____/  2.1.0
            // Automatic generation of the low-energy 3D structures

================================================================================
INPUT PARAMETERS

path: ligands/smiles.smi
k: 1
window: False
verbose: False
job_name: 20230917-083138-867542
enumerate_tautomer: False
tauto_engine: rdkit
pKaNorm: True
isomer_engine: rdkit
enumerate_isomer: True
mode_oe: classic
mpi_np: 4
max_confs: None
use_gpu: True
capacity: 42
gpu_idx: 0
optimizing_engine: AIMNET
patience: 1000
opt_steps: 5000
convergence_threshold: 0.003
threshold: 0.3
memory: None
batchsize_atoms: 1024
input_format: smi

                           RUNNING PROCESS

================================================================================
Checking input file...

and the cell finishes after a few seconds, as if everything had completed normally.
I could not find any _3d.sdf file in the working directory.

Add pyyaml to installation instructions

The installation instructions do not appear to include pyyaml:

pip install pyyaml

Run command does not work: "python .\auto3d.py --path="data/smiles.sdf" --use_gpu=False"

Testing Tutorial notebook, very slow

Hello!

I'm working on an M2 Mac and had no issues with installation.
I'm currently testing your tutorial notebooks by optimising the 4 SMILES from the provided smiles.smi file.

Upon running these lines:

if __name__ == "__main__":
    path = os.path.join(root, "example/files/smiles.smi")
    args = options(path, k=1, use_gpu=False)  # specify the parameters for Auto3D
    out = main(args)  # main accepts the parameters and runs Auto3D
    print(out)

The notebook says this should take less than 1 minute. In my case it is quite slow and seems to take multiple minutes. Am I the only one with this issue?

Many thanks

Prune multiple conformers with a rmsd cutoff

Does Auto3D support pruning conformers so that only those whose RMSD from one another is above a given cutoff are kept?
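
To make the question concrete, the kind of pruning being asked about can be sketched as a greedy filter. This is a generic illustration, not Auto3D's implementation: the function names are hypothetical, and the RMSD here is computed on raw coordinates without alignment or symmetry handling, which a real tool would need. The `threshold: 0.3` entry in the parameter logs quoted in other issues may be the relevant Auto3D setting, but that is an assumption.

```python
import math


def rmsd(a, b):
    """RMSD between two conformers given as equal-length lists of (x, y, z).

    Simplification: no alignment and no symmetry handling.
    """
    if len(a) != len(b):
        raise ValueError("conformers must have the same number of atoms")
    squared = sum(
        (p - q) ** 2
        for atom_a, atom_b in zip(a, b)
        for p, q in zip(atom_a, atom_b)
    )
    return math.sqrt(squared / len(a))


def prune_by_rmsd(conformers, cutoff):
    """Keep a conformer only if its RMSD to every kept conformer exceeds `cutoff`.

    Pass conformers sorted by energy (lowest first) so that the lowest-energy
    member of each group of near-duplicates is the one that survives.
    """
    kept = []
    for conf in conformers:
        if all(rmsd(conf, other) > cutoff for other in kept):
            kept.append(conf)
    return kept
```

For three two-atom conformers where the second differs from the first by 0.1 Å in one coordinate and the third by 2 Å, a cutoff of 0.3 keeps the first and third and drops the second as a near-duplicate.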

tauto_k in get_stable_tautomers

Describe the bug
The tauto_k parameter is documented as outputting the top-k tautomers for each SMILES. When testing on the tutorial example sildnafil.smi (and other molecules), changing tauto_k in the get_stable_tautomers function doesn't seem to change the number of output tautomers.

To Reproduce
Steps to reproduce the behavior:

Go to the tautomer tutorial (example/tautomer.ipynb)
Set tauto_k=3 and observe that it generates 3 tautomers
Set tauto_k=2 and observe that it still generates 3 tautomers
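
The behaviour the report expects can be stated as a small selection rule: group results by input molecule, sort by energy, and keep the first k. The sketch below only illustrates that expectation and is not the code in `select_tautomers`; the record layout and the function name are hypothetical, and it ignores any energy-window option.

```python
from collections import defaultdict


def top_k_per_molecule(records, k):
    """Return the names of the k lowest-energy tautomers for each molecule ID.

    `records` is an iterable of (mol_id, tautomer_name, energy) tuples;
    this layout is an assumption made for the example.
    """
    grouped = defaultdict(list)
    for mol_id, name, energy in records:
        grouped[mol_id].append((energy, name))
    # sorted() orders by energy first, so slicing keeps the most stable entries.
    return {
        mol_id: [name for _, name in sorted(entries)[:k]]
        for mol_id, entries in grouped.items()
    }
```

Under this rule, a molecule with three candidate tautomers yields three names for k=3 and two names for k=2, which is the difference the reporter did not observe.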

System information:

Operating System:
Auto3D version: 2.2.9
Python version: 3.9.7
RDKit version: 2023.09.4
PyTorch version: 2.1.0


Tag and Release GitHub package?

If there's a new 2.3.0 release on PyPI, can someone please create a release tag on GitHub? This is important for packaging (conda-forge, for example, cannot find the new release).

It's also useful because the latest release listed at https://github.com/isayevlab/Auto3D_pkg/releases is 2.2.7.

TorchANI with repulsion calculator is not installed.

Hi,
I was trying to reproduce tautomer generation using the following command:
python auto3D.py test.smi --k=1 --enumerate_tautomer=True --verbose=True --optimizing_engine=ANI2xt --job_name=rdkit-ani3
I end up getting the following error:
ANI2xt is used as optimizing engine, but TorchANI with repulsion calculator is not installed.

The following command works fine for me
python auto3D.py test.smi --k=1 --enumerate_tautomer=True --verbose=True --optimizing_engine=AIMNET --job_name=rdkit-aimnet

Mrinal

Question about error "optimization engine did not run, or no 3D structure converged."

Hi, I am looking for advice. I am trying to optimize some molecules and some of them are resulting in errors, where the optimization engine does not run. This is the error I received:

The optimization engine did not run, or no 3D structure converged.
The reason might be one of the following:
1. Allocated memory is not enough;
2. The input SMILES encodes valid chemical structures;
3. Patience is too small

I tried increasing the patience to 5000 but it did not help. The log files show I have 40 GB of memory available so I don't think memory is the issue. I don't know what suggestion 2 means: "The input SMILES encodes valid chemical structures".

This is the command I use: python auto3D.py AOIC.smi --k=1 --patience=5000
Below is an example of one of the SMILES I am working with that is causing this error. Any advice would be appreciated.

CCc1ccc(C2(c3ccc(CC)cc3)c3cc4c(cc3-c3sc(/C=C5\C(=O)c6cc(F)c(F)cc6C5=C(C#N)C#N)cc32)C(c2ccc(CC)cc2)(c2ccc(CC)cc2)c2c-4sc3c2C(c2ccc(CC)cc2)(c2ccc(CC)cc2)c2c-3sc3cc(/C=C4\C(=O)c5cc(F)c(F)cc5C4=C(C#N)C#N)sc23)cc1 AOIC

Dataset Generation

Thank you for open-sourcing Auto3D; it's a useful tool for drug discovery.

I'm new to neural network potential development and have some questions regarding dataset generation. You described the workflow for nonequilibrium conformation sampling with GFN2-xTB and energy calculation with B97-3c.

"We carried out the nonequilibrium conformation generation process using GFN2-XTB molecular dynamics at 400 K for 20 ps. The optimized structures were selected for energy calculations with the B97-3c composite scheme in ORCA."

Do the non-equilibrium conformations sampled from the simulation trajectory need to be geometry-optimized with B97-3c before the single-point energy is calculated? Or is the single-point energy calculated directly on the non-equilibrium conformations sampled from the trajectories?

If geometry optimization is applied before the single-point energy calculation, do constraints need to be added during the optimization? For example, dihedral constraints.

Looking forward to your reply. Thank you very much.


Range Error idx2 Violation occurred on line 352 in file /home/conda/feedstock_root/build_artifacts/rdkit-meta_1722095823005/work/Code/GraphMol/ROMol.cpp Failed Expression: 4294967295 < 10


Range Error
idx2
Violation occurred on line 352 in file /home/conda/feedstock_root/build_artifacts/rdkit-meta_1722095823005/work/Code/GraphMol/ROMol.cpp
Failed Expression: 4294967295 < 10
