
teachopencadd's Issues

PLIP installation error for T016 Protein Ligand Interactions

I have openbabel installed, but I still get this error message on my cmd prompt:

Collecting plip
Using cached plip-2.2.2-py3-none-any.whl (93 kB)
Requirement already satisfied: numpy in c:\users\que\anaconda3\envs\introductiontopython\lib\site-packages (from plip) (1.21.0)
Collecting lxml
Using cached lxml-4.6.3-cp38-cp38-win_amd64.whl (3.5 MB)
Collecting openbabel
Using cached openbabel-3.1.1.1.tar.gz (82 kB)
Building wheels for collected packages: openbabel
Building wheel for openbabel (setup.py) ... error
ERROR: Command errored out with exit status 1:
command: 'C:\Users\QUE\anaconda3\envs\introductiontopython\python.exe' -u -c 'import io, os, sys, setuptools, tokenize; sys.argv[0] = '"'"'C:\Users\QUE\AppData\Local\Temp\pip-install-eovmndz7\openbabel_265f2b5a89f44697b473157c7eb430db\setup.py'"'"'; file='"'"'C:\Users\QUE\AppData\Local\Temp\pip-install-eovmndz7\openbabel_265f2b5a89f44697b473157c7eb430db\setup.py'"'"';f = getattr(tokenize, '"'"'open'"'"', open)(file) if os.path.exists(file) else io.StringIO('"'"'from setuptools import setup; setup()'"'"');code = f.read().replace('"'"'\r\n'"'"', '"'"'\n'"'"');f.close();exec(compile(code, file, '"'"'exec'"'"'))' bdist_wheel -d 'C:\Users\QUE\AppData\Local\Temp\pip-wheel-qwam7u83'
cwd: C:\Users\QUE\AppData\Local\Temp\pip-install-eovmndz7\openbabel_265f2b5a89f44697b473157c7eb430db
Complete output (15 lines):
running bdist_wheel
running build
running build_ext
Warning: invalid version number '3.1.1.1'.
Guessing Open Babel location:

  • include_dirs: ['C:\Users\QUE\anaconda3\envs\introductiontopython\include', 'C:\Users\QUE\anaconda3\envs\introductiontopython\include', '/usr/local/include/openbabel3']
  • library_dirs: ['C:\Users\QUE\anaconda3\envs\introductiontopython\libs', 'C:\Users\QUE\anaconda3\envs\introductiontopython\PCbuild\amd64', '/usr/local/lib']
    building 'openbabel._openbabel' extension
    swigging openbabel\openbabel-python.i to openbabel\openbabel-python_wrap.cpp
    swig.exe -python -c++ -small -O -templatereduce -naturalvar -IC:\Users\QUE\anaconda3\envs\introductiontopython\include -IC:\Users\QUE\anaconda3\envs\introductiontopython\include -I/usr/local/include/openbabel3 -o openbabel\openbabel-python_wrap.cpp openbabel\openbabel-python.i

Error: SWIG failed. Is Open Babel installed?
You may need to manually specify the location of Open Babel include and library directories. For example:
python setup.py build_ext -I/usr/local/include/openbabel3 -L/usr/local/lib
python setup.py install

I tried to git clone plip and then open it in a Jupyter notebook. I ended up with this error :(

ModuleNotFoundError Traceback (most recent call last)
~\AppData\Local\Temp/ipykernel_5344/3038562561.py in
9 import matplotlib.pyplot as plt
10 from matplotlib import colors
---> 11 from plip.structure.preparation import PDBComplex
12 from plip.exchange.report import BindingSiteReport
13

ModuleNotFoundError: No module named 'plip.structure'

I have tried searching for a fix online, but I couldn't really find much.
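A hedged suggestion (an assumption on my part, not an official fix): the pip install fails because pip tries to compile Open Babel's Python bindings from source on Windows, which needs SWIG and the Open Babel headers. Installing plip and openbabel as pre-built packages from conda-forge avoids that compile step:

# run inside the activated conda environment
conda install -c conda-forge plip openbabel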

T019: No openmmforcefields under Windows

T019 does not run under Windows.

Our CI fails with: ModuleNotFoundError: No module named 'openff'

Why?

- openmm
# depends on openff-toolkit->ambertools -> not available on Windows yet!
# - openmmforcefields

See also here: #74 (comment)

openmmforcefields may be causing the problem; check the progress of this issue: openmm/openmmforcefields#163

Do we notify users?

Yes, the talktorial contains a disclaimer linking back to this issue:

Also, note that this talktorial will not run on Windows for the time being (check progress in this issue).

What do we do regarding our CI?

Remove T019 from the Windows CI.
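A hedged sketch of one way to do this (the conftest.py approach and the notebook path are assumptions, not necessarily how our CI is actually wired): pytest can be told at collection time to ignore the notebook on Windows.

# conftest.py -- skip the T019 notebook when tests run on Windows
import sys

collect_ignore = []
if sys.platform.startswith("win"):
    # hypothetical path; adjust to wherever the T019 notebook lives in the repo
    collect_ignore.append("teachopencadd/talktorials/T019_md_simulation/talktorial.ipynb")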

Add branding header to each notebook

We might benefit from adding a standardized, custom header to each notebook to reflect the origin of the project, how to contribute back, how to help us develop it further, and so on. I don't have a specific idea for the looks, but Markdown cells allow arbitrary HTML, so I think we have plenty of creative freedom there.

Some ideas:

  • Star the project widget
  • Add the logo
  • Links to repo / documentation
  • Citation instructions
  • Links to other interesting software projects (ours, mainly)

Is there a way to get a common cut-off value?

"T007 · Ligand-based screening: machine learning" was very helpful for us. Thanks for sharing this article.

I just have one question.

We are using ChEMBL IC50 data for other targets (cells), but our data does not have comments that divide the compounds into "active" and "inactive".

Therefore, we also have to split the data based on the pIC50 value, and I would like to know how you arrived at the cut-off of 6.3 used in the talktorial you shared above.

I will wait for your reply.

Thank you again.

  • Kim Hyeon Ki
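A hedged answer sketch (the 500 nM threshold below is an illustrative assumption, not an official recommendation): a pIC50 cut-off is usually just an IC50 activity threshold converted to the negative decadic logarithm of the molar concentration, and 6.3 corresponds to an IC50 of roughly 500 nM.

import math

ic50_cutoff_nM = 500                                   # assumed activity threshold
pic50_cutoff = -math.log10(ic50_cutoff_nM * 1e-9)      # pIC50 = -log10(IC50 in mol/l)
print(round(pic50_cutoff, 2))                          # 6.3

# label the dataset accordingly (df and "pIC50" are hypothetical names):
# df["active"] = df["pIC50"] >= pic50_cutoff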

RDKit and pypdb are pinned to old releases - update relevant notebooks

Last nightly contained a change in the Python 3.9 branch: T002 is reporting a different subset of compounds (just one entry, though).

https://github.com/volkamerlab/teachopencadd/runs/1461859605?check_suite_focus=true#step:8:47

Might be temporary (if the database was changing, it should affect all branches), so we'll wait until tomorrow to see if that's the case.

This nightly also reported a bug in RDKit (py39), in the cairo code to export images. Might be related to the same error as above (Image has changed API?). We'll wait too.

T5: clusters are not sorted by size by default

Cell 17:

print ('Ten molecules from second largest cluster:')
# Draw molecules
Draw.MolsToGridImage([mols[i][0] for i in clusters[1][:10]], 
                     legends=[mols[i][1] for i in clusters[1][:10]], 
                     molsPerRow=5)

However, the clusters returned by Butina.ClusterData(distance_matr,len(fps),cutoff,isDistData=True) are not sorted by default, i.e. we cannot guarantee that clusters[1] is indeed the second largest cluster.

In the talktorial it does happen that (at least) the first two clusters are correctly ordered, but when I used a different target, the second cluster only had one element while others had more. Anyway, this is easily checked by just listing the lengths of the clusters. Moreover, the docs do not claim that they are ordered.
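A minimal sketch of the fix, reusing the notebook's variables (distance_matr, fps, cutoff): sort the Butina output by cluster size before indexing it, so that clusters[1] is guaranteed to be the second largest cluster.

from rdkit.ML.Cluster import Butina

clusters = Butina.ClusterData(distance_matr, len(fps), cutoff, isDistData=True)
clusters = sorted(clusters, key=len, reverse=True)    # largest cluster first
print([len(cluster) for cluster in clusters[:5]])     # quick sanity check of the ordering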

T008: PDB API for chemicals changed

The PDB API changed again, and the change may not be reflected in biotite yet.

Talktorial T008 throws the following error (cell 9):

RequestError: Error 400: Invalid request to the [ text ] service: search is not enabled on [ chem_comp.formula_weight ] attribute

As far as I understand, they split the search service text into text and text_chem:
https://search.rcsb.org/#search-services

They also split the search-attribute web pages (the links need updating in the talktorial as well).

The old API (I think):

{
  "query": {
    "type": "terminal",
    "service": "text",
    "parameters": {
      "attribute": "chem_comp.formula_weight",
      "operator": "greater",
      "value": 100
    }
  },
  "return_type": "entry"
}

The new API:

{
  "query": {
    "type": "terminal",
    "service": "text_chem",
    "parameters": {
      "attribute": "chem_comp.formula_weight",
      "operator": "greater",
      "value": 100
    }
  },
  "return_type": "entry"
}
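A hedged way to verify the updated query outside biotite (the endpoint URL is an assumption based on the search-services page linked above): post the new JSON directly to the RCSB search API with requests and check that the 400 error is gone.

import requests

query = {
    "query": {
        "type": "terminal",
        "service": "text_chem",
        "parameters": {
            "attribute": "chem_comp.formula_weight",
            "operator": "greater",
            "value": 100,
        },
    },
    "return_type": "entry",
}

response = requests.post("https://search.rcsb.org/rcsbsearch/v2/query", json=query)
print(response.status_code)                    # expect 200 instead of 400
print(response.json().get("total_count"))      # number of matching entries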

Talktorial template

How to start

Replace example names as needed:

git clone https://github.com/volkamerlab/teachopencadd.git
cd teachopencadd/
# or... git pull on master
git checkout -b ab-099-title  # ab = your initials; 099 = your talktorial index; title = short talktorial title 
git commit --allow-empty -m "Start branch"
git push --set-upstream origin ab-099-title

# Now go to the suggested URL and create the PR using the suggested template
# If the template is not there, copy it from this issue: https://github.com/volkamerlab/TeachOpenCADD/issues/41

# Back to the CLI to set up the environment
conda env create -f devtools/conda-envs/test_env.yml
conda activate teachopencadd
jupyter labextension install @ijmbarr/jupyterlab_spellchecker
jupyter lab

Now you can start working on your talktorial!

Render the website locally

Once your talktorial is ready, check how it renders in HTML.

cd docs
make html
cp -r talktorials/images/ _build/html/talktorials/images
cd _build/html
# open index.html on your browser:
xdg-open index.html
# or under Windows: explorer.exe index.html
# or under macOS: open index.html

If your notebook does not appear, you need to add an nblink forwarder in docs/notebooks. Copy an existing one and update the paths accordingly!

PR template

Create a PR using this template. Ping us as reviewer once you'd like feedback on the talktorial.

https://github.com/volkamerlab/teachopencadd/blob/master/.github/PULL_REQUEST_TEMPLATE/talktorial_review.md

Jupyter Lab 3

This is going to be released soon, which enables dynamic extensions. This will allow us to provide a package that works right away without having to worry about jupyter labextension install bla bla bla. Leaving this here so I don't forget to update the environment and recipe :)

T1: Fetch data by ChEMBL version?

Whenever a new ChEMBL version is released, the compound/bioactivity dataset fetched in T1 will change and affect all downstream notebooks.

When doing our packaging/refactoring of all notebooks, it would be great to find out how to fetch data by ChEMBL version (e.g. frozen to the ChEMBL version/dataset shown in our TeachOpenCADD publication).

Get help:
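A minimal related sketch (my assumption, not the talktorial's code): the ChEMBL web services always serve the latest release, but the status endpoint reports which release that is, so a notebook could at least fail loudly when the data no longer comes from the release it was written against.

import requests

status = requests.get("https://www.ebi.ac.uk/chembl/api/data/status.json").json()
print(status["chembl_db_version"])      # e.g. "ChEMBL_27"
# assert status["chembl_db_version"] == "ChEMBL_27", "Data is not from the frozen ChEMBL release"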

Anaconda versions?

The main README.md refers to testing with anaconda2 (python 2.x?) but the environment.yml requires python > 3.6 if I'm reading it correctly. Maybe the README is out of date?

In this day and age I'd highly recommend making sure everything uses python 3.

T008: Align Complexes doesn't work

I am also having issues with this line; could this be looked into? I tried using Jupyter notebooks and Google Colab but still couldn't get around it.

results = align(complexes, method=METHODS["mda"])

This is the error I keep getting when I run the code:

AttributeError Traceback (most recent call last)
~\AppData\Local\Temp/ipykernel_2872/2813988720.py in
----> 1 results = align(complexes, method=METHODS["mda"])

~\anaconda3\envs\introductiontopython\lib\site-packages\opencadd\structure\superposition\api.py in align(structures, method, **kwargs)
40 results = []
41 for mobile in mobiles:
---> 42 result = aligner.calculate([reference, mobile])
43 results.append(result)
44

~\anaconda3\envs\introductiontopython\lib\site-packages\opencadd\structure\superposition\engines\base.py in calculate(self, structures, *args, **kwargs)
29 """
30 assert len(structures) == 2
---> 31 return self._calculate(structures, *args, **kwargs)
32
33 def _calculate(self, structures, *args, **kwargs):

~\anaconda3\envs\introductiontopython\lib\site-packages\opencadd\structure\superposition\engines\mda.py in _calculate(self, structures, *args, **kwargs)
117
118 # Get matching atoms
--> 119 selection, alignment = self.matching_selection(*structures)
120 ref_atoms = ref_universe.select_atoms(selection["reference"])
121 mobile_atoms = mob_universe.select_atoms(selection["mobile"])

~\anaconda3\envs\introductiontopython\lib\site-packages\opencadd\structure\superposition\engines\mda.py in matching_selection(self, reference, mobile)
176 fasta["ref"], fasta["mob"], *_empty = alignment.get_gapped_sequences()
177 fasta.write("temp.fasta")
--> 178 selection = fasta2select(
179 "temp.fasta",
180 ref_resids=ref_resids,

~\anaconda3\envs\introductiontopython\lib\site-packages\opencadd\structure\superposition\sequences.py in fasta2select(fastafilename, ref_resids, ref_segids, target_resids, target_segids, backbone_selection)
115
116 """
--> 117 protein_gapped = Bio.Alphabet.Gapped(Bio.Alphabet.IUPAC.protein)
118 with open(fastafilename) as fasta:
119 alignment = Bio.AlignIO.read(fasta, "fasta", alphabet=protein_gapped)

AttributeError: module 'Bio' has no attribute 'Alphabet'
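A hedged workaround (an assumption on my part, not an official fix): Bio.Alphabet was removed in Biopython 1.78, and opencadd's sequence-alignment helper still imports it, so pinning an older Biopython should make the alignment run again until opencadd is updated.

# run inside the activated conda environment
conda install -c conda-forge "biopython<1.78"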

Make repo less heavy?

The repo is already quite big (0.5 GB). When someone can allocate time, we should look into options for making it lighter.

Issue on T009 · Ligand-based pharmacophores

The command feature_factory = AllChem.BuildFeatureFactory(str(Path(RDConfig.RDDataDir) / "BaseFeatures.fdef")) is not working. It gives the error message:

OSError                                   Traceback (most recent call last)
<ipython-input-17-9f1fb722f467> in <module>()
----> 1 feature_factory = AllChem.BuildFeatureFactory(str(Path(RDConfig.RDDataDir) / "BaseFeatures.fdef"))

OSError: File: /opt/anaconda1anaconda2anaconda3/share/RDKit/Data/BaseFeatures.fdef could not be opened.

I worked around it with the command feature_factory = os.path.join(RDConfig.RDDataDir, 'BaseFeatures.fdef'),
but the following command then also gives an error message:

AttributeError                            Traceback (most recent call last)
<ipython-input-21-3c1dd36efe52> in <module>()
----> 1 features = feature_factory.GetFeaturesForMol(str(example_molecule))
      2 print(f"Number of features found: {len(features)}")

AttributeError: 'str' object has no attribute 'GetFeaturesForMol'

The command list(feature_factory.GetFeatureDefs().keys()) also gives the error message:

AttributeError                            Traceback (most recent call last)
<ipython-input-18-d24aee5309ac> in <module>()
----> 1 list(feature_factory.GetFeatureDefs().keys())

AttributeError: 'str' object has no attribute 'GetFeatureDefs'

Please help me solve these issues. Thanks.
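A hedged answer sketch: the original OSError points at a broken conda placeholder path inside the installed RDKit data directory, and the later AttributeErrors appear because the workaround assigns the .fdef path string to feature_factory instead of an actual factory object. Building the factory from the path and then calling its methods on an RDKit Mol (not on str(...)) should work:

import os
from rdkit import RDConfig
from rdkit.Chem import ChemicalFeatures

fdef_path = os.path.join(RDConfig.RDDataDir, "BaseFeatures.fdef")
feature_factory = ChemicalFeatures.BuildFeatureFactory(fdef_path)

print(list(feature_factory.GetFeatureDefs().keys())[:5])
# features = feature_factory.GetFeaturesForMol(example_molecule)   # pass the Mol object, not str(example_molecule)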

Update `pypdb`-related notebooks T008 and T010

The pypdb-related notebooks must be updated since the old API does not work anymore due to changes in the RCSB API.

From pypdb's GH page:

As of November 2020, pypdb is undergoing significant refactoring in order to accommodate changes to the RCSB PDB API and extend functionality. We regret any breaking changes that occur along the way. The previous version of pypdb is available here; however, it will no longer function due to the RCSB API being changed.

Check out the new API:
https://github.com/williamgilpin/pypdb/blob/master/demos/demos.ipynb


Example issue in T010:

Input:

# Set up query dictionary
search_dict = pypdb.make_query("STI")
search_dict

Output:

---------------------------------------------------------------------------
AttributeError                            Traceback (most recent call last)
<ipython-input-13-521e3a31d168> in <module>
      1 # Set up query dictionary
----> 2 search_dict = pypdb.make_query("STI")
      3 search_dict
AttributeError: module 'pypdb' has no attribute 'make_query'
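A hedged sketch of the replacement call (treat the class and method names as assumptions and double-check them against the pypdb demos notebook linked above): the refactored pypdb exposes a Query object instead of make_query.

from pypdb import Query

# replaces the removed pypdb.make_query("STI") pattern
results = Query("STI").search()
print(results[:5])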

T015_protein_ligand_docking using OPAL: AttributeError: 'NoneType' object has no attribute 'attrs'

When I run the code below:

%time result = step_03_opal(PROTEIN, smiles[:1], COMPLEX)

I get the following error:
AttributeError Traceback (most recent call last)
in

~\AppData\Local\Temp/ipykernel_6344/2733412323.py in step_03_opal(protein, smiles, pdbcomplex)
14 """
15 prepared_protein = opal_prepare_protein(protein)
---> 16 center, radius = dogsite_scorer_guess_binding_site(pdbcomplex)
17 size = [radius] * 3 # Vina supports non-cubic boxes, but we will use a cube for simplicity
18 for i, smile in enumerate(smiles):

~\AppData\Local\Temp/ipykernel_6344/3358257068.py in dogsite_scorer_guess_binding_site(protein)
131 job_location = dogsite_scorer_submit_with_pdbid(protein)
132 elif protein.endswith(".pdb"):
--> 133 job_location = dogsite_scorer_submit_with_custom_pdb(protein)
134 else:
135 raise ValueError("protein must be a PDB ID or a path to a .pdb file!")

~\AppData\Local\Temp/ipykernel_6344/3358257068.py in dogsite_scorer_submit_with_custom_pdb(pdbfile)
78 # 2. Get internal location id
79 html = BeautifulSoup(r.text)
---> 80 pdb_id = html.find("input", {"name": "dogsite[pdbCode]"}).attrs["value"]
81
82 # 3. Get the internal job ID

AttributeError: 'NoneType' object has no attribute 'attrs'

I hope you don't mind my many issues. I'm really trying to understand and see how I can make maximum use of these talktorials in my current research work.
Thanks and cheers for the good work! @dominiquesydow and @jaimergp and the whole Volkamerlab team
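A hedged debugging sketch (an assumption, not the talktorial's code; r stands for the requests response returned by the DoGSiteScorer upload step in the helper above): the AttributeError means BeautifulSoup could not find the expected <input name="dogsite[pdbCode]"> field, usually because the PDB upload failed or the web page layout changed. Inspecting the response before dereferencing the tag shows which of the two it is.

from bs4 import BeautifulSoup

html = BeautifulSoup(r.text, "html.parser")
tag = html.find("input", {"name": "dogsite[pdbCode]"})
if tag is None:
    print(r.status_code)      # did the upload succeed at all?
    print(r.text[:500])       # what did the server actually return?
else:
    pdb_id = tag.attrs["value"]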

Notebook Ligand_based_pharmacophores

The notebook T009_Ligand_based_pharmacophores is not working. Cell [5] has the list molecules, but it should be the list molecule.
However, if I correct this, the mol_file cannot be read because its type is None.

Do you know if there is another issue in this notebook?

T016: Nitrogen is recognized differently on Windows vs. Unix

CI error message

[gw0] win32 -- Python 3.6.13 C:\Miniconda\envs\teachopencadd\python.exe
Notebook cell execution failed
Cell 15: Cell outputs differ

Input:
create_df_from_binding_site(interactions_by_site[selected_site], interaction_type="hbond")
# NBVAL_CHECK_OUTPUT

Traceback:
 mismatch 'text/html'

 assert reference_output == test_output failed:

  '<div>\n<styl...able>\n</div>' == '<div>\n<styl...able>\n</div>'
  Skipping 1084 identical leading characters in diff, use -v to show
  -      <td>N2</td>
  ?           ^
  +      <td>Nar</td>
  ?           ^^
          <td>(13.371, 34.064, 15.005)</td>
          <td>(10.667, 33.654, 16.145)</td>
        </tr>
      </tbody>
    </table>
    </div>

Why?

T016 fails due to an output diff: a nitrogen atom is recognized differently on Windows than on Unix (?). This might have to do with different package versions being installed.

From #74 (comment)

OpenBabel is used to identify hydrogen bond donor and acceptor atoms. Halogen atoms are excluded from this group and treated separately (see below).

From https://plip-tool.biotec.tu-dresden.de/plip-web/plip/help

Atom types in OpenBabel:

EXTTYP  [n]         Nar
EXTTYP  [$(N=*)]        N2

From https://github.com/openbabel/openbabel/blob/master/data/atomtyp.txt
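A hedged workaround sketch: if the Nar/N2 typing difference turns out to be platform noise rather than a real bug, the strict output comparison for that cell can be relaxed by swapping nbval's marker comment (NBVAL_IGNORE_OUTPUT still executes the cell but skips the output diff).

create_df_from_binding_site(interactions_by_site[selected_site], interaction_type="hbond")
# NBVAL_IGNORE_OUTPUT   (instead of # NBVAL_CHECK_OUTPUT)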

T001: Frozen bioactivity dataset is not frozen

In T001, we thought we had frozen the bioactivity dataset (by checking for activity IDs in ChEMBL 27), but it does not seem to work.

Version on master branch:
Number of bioactivities queried for EGFR in this notebook: 7178
Number of bioactivities after ChEMBL 27 intersection: 7178

Running this notebook today:
Number of bioactivities queried for EGFR in this notebook: 8817
Number of bioactivities after ChEMBL 27 intersection: 8031 (I would expect 7178)

@jaimergp, I am sorry to bother you with this.
Do you understand why our intersection with the chembl27_activities.npz.zip does not produce stable results?

If not: since I do not have the time to debug this (and you probably don't either), my suggestion is to remove the chembl27_activities.npz.zip freezing step and instead freeze the final output dataset output_df from this notebook, to ensure stable outputs in all downstream talktorials (T002-T007).
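A minimal sketch of that suggestion (the file name is an assumption): write the final dataframe out once, commit the file, and have the downstream talktorials read the frozen copy instead of re-querying ChEMBL.

import pandas as pd

# run once in T001 and commit the resulting file
output_df.to_csv("EGFR_compounds_frozen.csv", index=False)

# downstream talktorials (T002-T007) then load the frozen dataset
output_df = pd.read_csv("EGFR_compounds_frozen.csv")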

Talktorial 3 Unwanted Substructure Brenk Error

Hi @dominiquesydow
Cell In [10] comes up with the following error when I run it. It worked fine about two months ago, when I had actually just been introduced to TeachOpenCADD.

ValueError                                Traceback (most recent call last)
<ipython-input-10-3809d0362bf6> in <module>
      1 Chem.Draw.MolsToGridImage(
      2     list(substructures.head(3).rdkit_molecule),
----> 3     legends=list(substructures.head(3).name),
      4 )

/srv/conda/envs/notebook/lib/python3.7/site-packages/rdkit/Chem/Draw/IPythonConsole.py in ShowMols(mols, maxMols, **kwargs)
    197   if not "drawOptions" in kwargs:
    198     kwargs["drawOptions"] = drawOptions
--> 199   res = fn(mols, **kwargs)
    200   if kwargs['useSVG']:
    201     return SVG(res)

/srv/conda/envs/notebook/lib/python3.7/site-packages/rdkit/Chem/Draw/__init__.py in MolsToGridImage(mols, molsPerRow, subImgSize, legends, highlightAtomLists, highlightBondLists, useSVG, returnPNG, **kwargs)
    611     return _MolsToGridImage(mols, molsPerRow=molsPerRow, subImgSize=subImgSize, legends=legends,
    612                             highlightAtomLists=highlightAtomLists,
--> 613                             highlightBondLists=highlightBondLists, returnPNG=returnPNG, **kwargs)
    614 
    615 

/srv/conda/envs/notebook/lib/python3.7/site-packages/rdkit/Chem/Draw/__init__.py in _MolsToGridImage(mols, molsPerRow, subImgSize, legends, highlightAtomLists, highlightBondLists, drawOptions, returnPNG, **kwargs)
    553           del kwargs[k]
    554     d2d.DrawMolecules(list(mols), legends=legends or None, highlightAtoms=highlightAtomLists,
--> 555                       highlightBonds=highlightBondLists, **kwargs)
    556     d2d.FinishDrawing()
    557     if not returnPNG:

ValueError: bad query type1

Talktorial 10 - Input Cell #10

Should be,
pdb_ids = list(set(found_pbd_ids + found_pbd_ids2))

and not,
pdb_ids = list(set(found_pbd_ids + found_pbd_ids))

Talktorial 4 - mistake in calculating experimental EF

In the function print_data_ef in cell 46, we need to convert the input percentage perc_ranked_dataset into a fraction to successfully compare it with the values in enrich_df.

Also, in cell 47, when using the concatenated dataframe enrich_df, the function print_data_ef takes the last line to find the relevant fraction, so it will always choose whichever similarity measure was second in the concatenation (in cell 43). Instead, I think we should just compute the EF for each similarity measure separately, as sketched below.
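A hedged illustrative sketch of both points (all column names here are hypothetical, not the talktorial's actual code): convert the percentage before comparing, and compute the experimental EF per similarity measure rather than from the last line of the concatenated dataframe.

perc_ranked_dataset = 5                       # user input, in percent
fraction = perc_ranked_dataset / 100          # compare on the same scale as enrich_df

# hypothetical column names, for illustration only
for measure, subset in enrich_df.groupby("similarity_measure"):
    row = subset[subset["ranked_fraction"] <= fraction].iloc[-1]
    print(measure, row["enrichment_factor"])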

TalkTorial 1 - T1_ChEMBL - Step 24

Dear all,

Being a new user, I am struggling with some of the steps of this amazing talktorial.
At step 24, I do not manage to transform my SMILES structures into molecules within the template:
PandasTools.AddMoleculeColumnToFrame(output_df, smilesCol='smiles')
Any suggestions?
Thanks
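A hedged checklist sketch (an assumption, not an official answer): AddMoleculeColumnToFrame needs the exact name of the column holding the SMILES strings, and it silently leaves empty entries for SMILES that RDKit cannot parse, so checking both usually narrows the problem down.

from rdkit.Chem import PandasTools

print(output_df.columns)                               # is the SMILES column really called "smiles"?
PandasTools.AddMoleculeColumnToFrame(output_df, smilesCol="smiles", molCol="ROMol")
print(output_df["ROMol"].isna().sum())                 # how many SMILES could not be parsed?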

Include new talktorials to binder setup

Upon merging #74, make sure the new talktorials run with the Binder setup; add notes to those that may not run there (e.g. the MD simulation).


This issue is prompted by @Carlbullish's question in #124 (comment):

And if you don't mind me asking: running the Binder for talktorial 10 and the talktorials after it is not yet implemented for successful execution, as compared to talktorials 1-10?

  • [Resolved, see comments below] Talktorial 010 seems to run fine on Binder; @Carlbullish, did you run into trouble?
  • Talktorials >010 will be included in Binder once we have fixed some final technical issues in #74.

Issue on tutorial T010_binding_site_comparison

I found it difficult to run tutorial T010 because it seems the MDAnalysis package is not installed properly, or the numpy package has an issue. Please see the error message below:

ValueError                                Traceback (most recent call last)
<ipython-input-5-359cda61cea2> in <module>()
      1 import opencadd
----> 2 from opencadd.structure.core import Structure
      3 from opencadd.structure.superposition.api import align, METHODS
      4 from opencadd.structure.superposition.engines.mda import MDAnalysisAligner
      5 from teachopencadd.utils import seed_everything

5 frames
/usr/local/lib/python3.7/site-packages/MDAnalysis/lib/util.py in <module>()
    215 from ..exceptions import StreamWarning, DuplicateWarning
    216 try:
--> 217     from ._cutil import unique_int_1d
    218 except ImportError:
    219     raise ImportError("MDAnalysis not installed properly. "

MDAnalysis/lib/_cutil.pyx in init MDAnalysis.lib._cutil()

ValueError: numpy.ndarray size changed, may indicate binary incompatibility. Expected 88 from C header, got 80 from PyObject
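A hedged workaround (an assumption on my part, not an official fix): this ValueError usually means MDAnalysis was compiled against a different NumPy ABI than the NumPy version that ends up installed. Reinstalling the pair so their binaries match often resolves it:

pip install --upgrade --force-reinstall numpy MDAnalysis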

CI testing

Implement GH Actions to run all notebooks regularly with nbval or similar.

Talktorial index

This index gives an overview of the available TeachOpenCADD talktorials in our upcoming release (with comments on their previous index in TeachOpenCADD v1).

  • 000_template # new
  • 001_query_chembl # T1
  • 002_compound_adme # T2
  • 003_compound_unwanted_substructures # T3
  • 004_compound_similarity # T4
  • 005_compound_clustering # T5
  • 006_compound_maximum_common_substructures # T6
  • 007_compound_activity_machine_learning # T7
  • 008_query_pdb # T8
  • 009_compound_ensemble_pharmacophores # T9
  • 010_binding_site_comparison # T10, fix issue
  • 011_query_online_api_webservices # T11
  • 012_query_pubchem # T11a (second part)
  • 013_query_klifs # T11a (first part); could mention at the end opencadd.databases.klifs/klifs_utils
  • 014_binding_site_detection # T12 (not on TeachOpenCADD yet, but here), we could extract binding site stuff from T11b and merge it with T12 as stand-alone notebook prior to docking
  • 015_protein_ligand_docking # T11b
  • 016_protein_ligand_interactions # T11c
  • 017_python_jupyter_introduction # Revamp AI in Medicine

Timestamp DB queries for input data consistency

If we query the databases often, the input data will change each time (especially with T1), so all the downstream notebooks will be slightly modified. We could filter out data after an arbitrary date (let's say 1/1/2020) to reduce this noise.

When this is addressed, make sure to review the text parts where some output is mentioned (as in "the first result shows a molecule named X").

Talktorial 11b: Problems connecting to OPAL web services

Hi.
Thank you very much for putting together these talktorials. It's awesome!
I'm having issues connecting to the OPAL web services. The line client = Client("http://nbcr-222.ucsd.edu/opal2/services/vina_1.1.2?wsdl") in the function opal_run_docking(protein, ligand, center, size, stream_output=True) is giving me an ExpatError and a SAXParseException.
Any idea on how I could fix that?
Thanks again.

ExpatError Traceback (most recent call last)
~/anaconda3/envs/teachopencadd/lib/python3.6/xml/sax/expatreader.py in feed(self, data, isFinal)
216 # except when invoked from close.
--> 217 self._parser.Parse(data, isFinal)
218 except expat.error as e:

ExpatError: syntax error: line 1, column 0

During handling of the above exception, another exception occurred:

SAXParseException Traceback (most recent call last)
in
1 from suds.client import Client
----> 2 client = Client("http://nbcr-222.ucsd.edu/opal2/services/vina_1.1.2?wsdl")
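A hedged debugging sketch (an assumption, not an official fix): an ExpatError from suds usually means the URL returned something that is not XML, for example an HTML error page or an empty reply because the OPAL server is down. Fetching the WSDL directly first shows what the server is actually sending back.

import requests

url = "http://nbcr-222.ucsd.edu/opal2/services/vina_1.1.2?wsdl"
response = requests.get(url, timeout=30)
print(response.status_code)
print(response.text[:200])      # a working service should return an XML/WSDL document here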

T015: Docking with python API

Currently our T015 docking talktorial uses smina for docking, which works very well and is installable via conda-forge, but it does not have a Python API. Recently, AutoDock Vina released a new version that has a Python API (installation instructions). We could think about moving to this package in the future.
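A hedged sketch of what docking with the AutoDock Vina Python API looks like (method names follow the Vina 1.2 documentation; the file names, box centre, and box size are placeholder assumptions):

from vina import Vina

v = Vina(sf_name="vina")
v.set_receptor("protein.pdbqt")            # prepared receptor (placeholder file name)
v.set_ligand_from_file("ligand.pdbqt")     # prepared ligand (placeholder file name)
v.compute_vina_maps(center=[15.0, 53.0, 16.0], box_size=[20, 20, 20])
v.dock(exhaustiveness=8, n_poses=5)
v.write_poses("docked_ligand.pdbqt", n_poses=5)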

Add support for Google Colab

We currently offer running the notebooks

  • locally via Jupyter Lab
  • remotely via Binder

In the future, add support for Google Colab.

For each notebook, add cells at the beginning to install the dependencies from our teachopencadd environment, e.g. like this (see the discussion in #129 (comment)):

!pip install condacolab
import condacolab
condacolab.install()
!wget https://raw.githubusercontent.com/volkamerlab/TeachOpenCADD/master/devtools/other-conda-envs/users_env.yml
!mamba env update -n base -f users_env.yml 

However, some dependencies seem to cause problems:

  • T010: numpy version does not match with mdanalysis: #148

Installing TeachOpenCADD on google colab

I am trying to install TeachOpenCADD on Google Colab using the instructions on the web page:

https://projects.volkamerlab.org/teachopencadd/installing.html

After installing conda in the notebook, the command below gives the following error:

!conda env create -f https://raw.githubusercontent.com/volkamerlab/TeachOpenCADD/master/environment.yml

CondaHTTPError: HTTP 404 NOT FOUND for url https://raw.githubusercontent.com/volkamerlab/TeachOpenCADD/master/environment.yml
Elapsed: 00:00.133251

An HTTP error occurred when trying to retrieve this URL.
The URL does not exist.

Could you give the right URL, please?

I cannot import the packages from teachopencadd. Thanks.

CI: don't forget adding more notebooks to `treebeard.yaml`

As we open PRs, these need to add their new notebooks to treebeard.yaml. This won't be needed once the existing notebooks in the tree have been processed; it involves minor changes to the CI pipelines so that we just ls the directory instead of cherry-picking which notebooks have to undergo testing.

Broken links
