brsynth / retropathrl

Reinforcement Learning based bioretrosynthesis tool

License: MIT License

Python 100.00%

synthetic-biology synbio metabolic-engineering retropath

retropathrl's Introduction

Monte Carlo Tree Search presentation

The aim of this project is to run a Monte Carlo Tree Search to perform bio-retrosynthesis, compatible with mono-component reaction rules from RetroRules (https://retrorules.org). The role of each script is detailed below. Scripts can generally be run from the command line, and have detailed comments for each function.

Reaction rules are available at https://retrorules.org/dl.

Detailed docs are in document_all_options.md.

Chemoinformatics choices are detailed in chemistry_choices.md.

Installation

Compatibility notice: RetroPathRL has been developed and tested on Linux and macOS platforms. It is not expected to work properly on Windows.

Setting conda environment

conda create --name MCTS python=3.6
source activate MCTS
conda install --channel rdkit rdkit=2019.03.1.0
conda install pytest
conda install pyyaml

After cloning this Git repo, please run

pip install -e .

at the root of the package.

Visualization of results

Results can be visualised using the stand-alone Scope Viewer available on GitHub at:

git clone https://github.com/brsynth/scope-viewer.git

Toxicity calculator

For using the toxicity calculator:

conda install scikit-learn=0.19.1

DB cache

For using a database to cache results, you can find it on GitHub:

conda install pymongo
git clone https://github.com/brsynth/rp3_dcache.git

Then run pip install -e . at the root of the downloaded package. Check detailed instructions in the DB cache repository for instructions on how to set up and run the cache database.

Set-up data files

You need to specify in the config.py file where you want to store the data (data_path).
Unless specified otherwise, data will be stored in the data folder of the package.
Organisms will be stored in [data_path]/organisms (data_path should not end with a /), while rules in JSON format, with ECFPs (optional), will be stored in [data_path]/rules.
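
The folder layout described above can be sketched as follows. This is a minimal illustration, assuming data_path is a plain string as in config.py; the helper name `data_subfolders` is hypothetical and not part of the package.

```python
from pathlib import Path


def data_subfolders(data_path):
    """Return the organisms and rules folders derived from data_path.

    Hypothetical helper: it only mirrors the layout described in the README.
    """
    # data_path should not end with a "/"; strip one defensively.
    root = Path(data_path.rstrip("/"))
    return {"organisms": root / "organisms", "rules": root / "rules"}


folders = data_subfolders("/tmp/retropath_data")
print(folders["organisms"].as_posix())
print(folders["rules"].as_posix())
```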

Run the following commands:

python calculate_rule_sets_similarity.py --rule_address_with_H your_rule_address --rule_address_without_H your_rule_address
python calculate_organisms.py

Testing

Important: Tests have to be executed in the root folder of the program, which contains the tests folder.

python change_config.py --use_cache True --add_Hs True
pytest -v

Command line examples

python change_config.py  \
    --use_cache True

python Tree.py \
    --log_file tree.log \
    --itermax 1000 \
    --expansion_width 10 \
    --time_budget 7200 \
    --max_depth 7 \
    --UCT_policy Biochemical_UCT_1 \
    --UCTK 20 \
    --bias_k 0 \
    --k_rave 0 \
    --Rollout_policy Rollout_policy_random_uniform_on_biochemical_multiplication_score \
    --max_rollout 3 \
    --chemical_scoring SubandprodChemicalScorer \
    --virtual_visits 0 \
    --progressive_bias_strategy 0 \
    --diameter 10 12 14 16 \
    --c_name deoxiviolacein \
    --c_inchi "InChI=1S/C20H13N3O2/c24-19-13(18-12-6-2-4-8-16(12)22-20(18)25)9-17(23-19)14-10-21-15-7-3-1-5-11(14)15/h1-10,21H,(H,22,25)(H,23,24)/b18-13+" \
    --folder_to_save deoxi_07_no_H \
    --biological_score_cut_off 0.1 \
    --substrate_only_score_cut_off 0.7 \
    --chemical_score_cut_off 0.7 \
    --minimal_visit_counts 1

Expected result from this command is a folder named 'expected_results' containing:

a text file called 4_results. This allows easy parsing of the fact that 4 results were generated when running the search.
JSON files called deoxiviolacein_N.json, with N from 1 to 4: files containing pathways in JSON format. There is no ordering of those files.
deoxiviolacein_best.json: contains the best pathway (i.e. the one with the most visits).
deoxiviolacein_full_scope.json: contains all pathways in the same JSON file.
deoxiviolacein_full_tree_for_MCTS.json: contains the Tree in JSON format. This allows visualisation node by node using our tree viewer available at https://github.com/brsynth/scope-viewer.
deoxiviolacein_iteration_N.json: contains the pathway found at iteration N. The folder should contain 4 such files.
pickles: a folder containing tree_end_search.pkl, the full pickled tree. This can be used for tree extension or analysis.
results.csv: contains the configuration and the results from this configuration (STOP_REASON should be iteration here).
tree.log: contains execution logs.

Except for the JSON files containing the word 'tree', all JSON files can be visualised using our scope viewer available at https://github.com/brsynth/scope-viewer.
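
The result folder described above lends itself to simple parsing. The sketch below is an illustration only: it assumes the marker file is named `<N>_results` (optionally with an extension) and that pathway files follow the `<c_name>_<N>.json` pattern; the function names are hypothetical and the content of each JSON file is loaded without interpreting it.

```python
import json
import re
from pathlib import Path


def count_results(folder):
    """Read the number of pathways from the '<N>_results' marker file."""
    for path in Path(folder).iterdir():
        match = re.fullmatch(r"(\d+)_results(\.\w+)?", path.name)
        if match:
            return int(match.group(1))
    return 0  # no marker file found


def load_pathways(folder, compound_name):
    """Load every '<compound_name>_<N>.json' pathway file, keyed by N."""
    pattern = re.compile(re.escape(compound_name) + r"_(\d+)\.json")
    pathways = {}
    for path in Path(folder).iterdir():
        match = pattern.fullmatch(path.name)
        if match:  # skips *_best, *_full_scope, *_iteration_N, etc.
            with open(str(path)) as handle:
                pathways[int(match.group(1))] = json.load(handle)
    return pathways
```

For the first example, calling `count_results("deoxi_07_no_H")` and `load_pathways("deoxi_07_no_H", "deoxiviolacein")` after the search has finished would be the intended usage.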

Example for extension and 'normal' search

We expect no result from this search:

python change_config.py --DB_CACHE True --DB_time 0  --use_cache True

python Tree.py \
    --log_file tree.log \
    --itermax 1000 \
    --expansion_width 10 \
    --time_budget 7200 \
    --max_depth 7 \
    --UCT_policy Biochemical_UCT_1 \
    --UCTK 20 \
    --bias_k 0 \
    --k_rave 0 \
    --Rollout_policy Rollout_policy_random_uniform_on_biochemical_multiplication_score \
    --max_rollout 3 \
    --chemical_scoring SubandprodChemicalScorer \
    --virtual_visits 0 --progressive_bias_strategy 0 \
    --diameter 10 12 14 16 \
    --c_name deoxiviolacein \
    --c_inchi "InChI=1S/C20H13N3O2/c24-19-13(18-12-6-2-4-8-16(12)22-20(18)25)9-17(23-19)14-10-21-15-7-3-1-5-11(14)15/h1-10,21H,(H,22,25)(H,23,24)/b18-13+" \
    --folder_to_save test_tree_extension/deoxi_09 \
    --biological_score_cut_off 0.9  \
    --substrate_only_score_cut_off 0.9 \
    --chemical_score_cut_off 0.9 \
    --minimal_visit_counts 1

To rerun from the same Tree with a more tolerant score

The following command extends the tree by 10 children: a node that already had 10 children can have up to 10 more children added, while a node that had only 5 can have up to 15 children added (original: 10 plus extension: 10). Moreover, all node scores (visits and values) are reinitialised, as they can change drastically when new rules are allowed. Only the structure is conserved, which allows for a much faster descent through already expanded nodes. We expect 1 pathway from this search.

python change_config.py --DB_CACHE True --DB_time 0  --use_cache True

python Tree.py  \
    --log_file tree.log  \
    --itermax 1000  \
    --expansion_width 10 \
    --time_budget 7200 \
    --max_depth 7 \
    --UCT_policy Biochemical_UCT_1 \
    --UCTK 20 \
    --bias_k 0 \
    --k_rave 0 \
    --Rollout_policy Rollout_policy_random_uniform_on_biochemical_multiplication_score \
    --max_rollout 3 \
    --chemical_scoring SubandprodChemicalScorer \
    --virtual_visits 0 \
    --progressive_bias_strategy 0 \
    --diameter 10 12 14 16 \
    --folder_to_save test_tree_extension/deoxi_05 \
    --tree_to_complete end_search \
    --folder_tree_to_complete test_tree_extension/deoxi_09 \
    --biological_score_cut_off 0.1  \
    --substrate_only_score_cut_off 0.5 \
    --chemical_score_cut_off 0.5 \
    --minimal_visit_counts 1
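
The child budget and score reset described for the tree extension can be sketched as below. This is a toy model under stated assumptions: the `Node` class and both function names are hypothetical, and the cap is taken to be simply the original expansion width plus the extension.

```python
class Node:
    """Toy tree node; the real MCTS node class holds much more state."""

    def __init__(self, visits=0, value=0.0, children=None):
        self.visits = visits
        self.value = value
        self.children = children if children is not None else []


def extra_children_allowed(current_children, expansion_width, extension):
    """Children that can still be added once the tree is extended."""
    cap = expansion_width + extension  # assumed cap: original plus extension
    return max(0, cap - current_children)


def reset_scores(node):
    """Reinitialise visits and values while keeping the tree structure."""
    node.visits = 0
    node.value = 0.0
    for child in node.children:
        reset_scores(child)


print(extra_children_allowed(10, 10, 10))  # 10
print(extra_children_allowed(5, 10, 10))   # 15
```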

Exploiting the DB

The DB is used as a cache: each time the application of a rule on a compound is run and takes more than DB_time, it is stored in that database.

Visualise: http://localhost:8081
Config: in config.py file, imported as a module in all scripts
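
The caching behaviour described in this section can be illustrated with an in-memory stand-in. This is a sketch only: the real cache is backed by MongoDB, the class and method names are hypothetical, and the assumption that DB_time is expressed in seconds is ours.

```python
import time


class TimedCache:
    """In-memory stand-in for the DB cache (the real one uses MongoDB)."""

    def __init__(self, db_time):
        self.db_time = db_time  # threshold, assumed to be in seconds
        self.store = {}

    def apply(self, rule_id, compound_id, fire_rule):
        """Return the cached result, or fire the rule and maybe cache it."""
        key = (rule_id, compound_id)
        if key in self.store:
            return self.store[key]
        start = time.perf_counter()
        result = fire_rule(rule_id, compound_id)
        elapsed = time.perf_counter() - start
        # Only applications that took longer than the threshold are stored.
        if elapsed > self.db_time:
            self.store[key] = result
        return result
```

With a threshold of 0, as in the examples above (--DB_time 0), essentially every rule application ends up in the cache.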

Supplement finder

The aim of the supplement_finder script is to find potential media supplements that would enable additional pathways through simple media supplementation. It is currently limited to 1 supplement to avoid combinatorial explosion. It can verify the presence of a supplement in a database of interest (here: MetaNetX), previously standardised under the same conditions as the Tree (with or without hydrogens/stereo).

Please unzip the databases in data/supplement_finder/data before running this script, as well as the search tree in data/supplement_finder/tree_for_testing/TPA/pickles.

Usage:

python supplement_finder.py --folder_tree_to_complete data/supplement_finder/tree_for_testing/TPA \
--database_address data/supplement_finder/data/metanetx_extracted_inchikeys.json \
--folder_to_save testing_supplement_finder/TPA

Remarks on the config file

config.py template is located at clean_data/base_config.py and is used to generate a config.py instance used by all jobs.
config.py is read by all job instances
editing config.py while jobs are still running will impact not only new jobs but also jobs that are still running

Files organisation

Each file will contain its own class.
Firing routines are taken from T.D.'s reactor package.
Tests are in a separate folder and have filenames starting with the test prefix, as well as class names starting with Test.
Data for the tests is also in the tests folder.
DB is taken from T.D.'s DBcache.
Config file contains a number of global parameters needed in various scripts, mostly to decide which features to use (score cut-offs, DB, caching, progressive widening etc)

Important:

a pickled_data folder needs to be present in the root of the MCTS (for running tests).
organisms generation (bottom of chemical_compound_state file) should be run once.

Description of object classes

Biological scoring: contains objects that score a biological rule - a move in MCTS (currently, either randomly or from a dictionary of scores)
calculate_organisms: run once at set up to extract organisms under correct format.
calculate_rule_sets_similarity: run once to set up the rules with original substrates and products for similarity calculations. Pickled rules.
change config: change the config file from command line.
Chemical Compound state: contains a wrapper around the compound class. It selects the best available moves of a state, checks terminality etc. The basic MCTS object. It also creates the organisms which are states.
Chemical scoring: utilities for chemical scoring.
Chemistry choices: documents important chemistry choices made during the project.
Compound: wrapper around rdkit mol. Basic object containing all transformations, sorting of rules. That is where the chemistry happens. Currently contains archive functions for parallelisation that will be simplified one day.
compound scoring: at the moment, contains scoring for toxicity bias.
Config: Allows for configuration of MCTS parameters used throughout the tree running. Also generates the logs for this.
convert to SBML: converts a JSON file for a pathway to an SBML file.
data: contains the data necessary to run RetroPath3.0
expected_results: contains expected results from running the first example under the Testing section.
MCTS_node: has a state as attribute, but also keeps information of father, son, number of visits, rollout results and so on. This is where most algorithmic improvements are encoded.
Move: contains the move - i.e. mostly a wrapper around a chemical rule, with pointing compound and products, as well as scores and RAVE results.
organisms: wrapper around organisms used as sinks for the retrosynthetic search.
pathway: can store pathways, with utils to add compounds and reactions and export as a JSON. It is also compatible with scorers.
pathway_scoring: used to score pathways.
representation: a class defining how to represent nodes, states and compounds (text vs colors in the Terminal).
rewarding: a class containing rewards after a rollout.
Rollout_policies: contains different ways to sample the rollout (random, proportional to the chemical score, the biological score, combination thereof etc)
rule_sets_examples: imports rules from csv files. Used for basic examples, does not allow for similarity scoring.
rule_sets_similarity: imports rules under a similarity compatible format.
setup: for compatibility with pip.
supplement_finder: allows finding of supplements in Trees.
tests: contains tests for installation.
Tree: performs the tree search: mostly iterating through the MCTS node objects a defined number of times. Utilities for pathways, saving etc. This is the script to use to run the MCTS search.
tree_viewer: save tree under a viewer compatible format.
UCT_policies: contains different ways to select the best child (classical, proportional to the chemical score, the biological score, combination thereof etc)
utilities: contains utilities to apply reaction rules and standardise compounds.

Rule input formatting

Rules are imported from rule_sets_similarity after calculation with calculate_rule_sets_similarity.py. Users can define their own import to replace the default import method.

A rule set as input with calculate_rule_sets_similarity.py needs to have the following characteristics...

be a csv file that is tab delimited.

... and possess the following keys:

Rule_ID
Reaction_ID: ID of the reaction the rule was learned on. Field cannot be empty.
Rule_SMARTS: mono-component reaction SMARTS as described in Duigou et al., Nucleic Acids Research, 2019.

Optional keys are:

Diameter: diameter around the reaction center. If absent, set to 0.
Substrate_ID and Substrate_SMILES: used for chemical scoring. If smiles is given, ID also needs to be provided. If absent, score will be 1.
Product_IDs and Product_SMILES: used for chemical scoring. If smiles is given, ID also needs to be provided. If absent, score will be 1. Remark: chemical score is disabled if either substrate or product is missing.
Rule_SMILES: A SMILES depiction of the reaction rule. If missing, set to empty string.
Score_normalized: biological score of the reaction. Set to 1 if absent.
Reaction_EC_number: specifies the EC number of the reaction used as template. Set to unspecified if absent.
Rule_usage: possible values are retro, forward or both. Should the reaction be considered for retrosynthesis, forward usage or both. Set to both if unspecified.
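
The required and optional keys listed above can be checked with a short reader. This is a sketch under assumptions, not the import code used by calculate_rule_sets_similarity.py: the function name is hypothetical, values are kept as strings, and an empty cell is treated the same way as an absent column.

```python
import csv
import io

REQUIRED_KEYS = ("Rule_ID", "Reaction_ID", "Rule_SMARTS")
# Defaults stated in the list above for absent optional keys.
OPTIONAL_DEFAULTS = {
    "Diameter": "0",
    "Rule_SMILES": "",
    "Score_normalized": "1",
    "Reaction_EC_number": "unspecified",
    "Rule_usage": "both",
}


def read_rules(handle):
    """Read a tab-delimited rule file and fill in optional defaults."""
    reader = csv.DictReader(handle, delimiter="\t")
    fieldnames = reader.fieldnames or []
    missing = [key for key in REQUIRED_KEYS if key not in fieldnames]
    if missing:
        raise ValueError("missing required columns: " + ", ".join(missing))
    rules = []
    for row in reader:
        if not row["Reaction_ID"]:
            raise ValueError("Reaction_ID cannot be empty: " + row["Rule_ID"])
        for key, default in OPTIONAL_DEFAULTS.items():
            if not row.get(key):  # absent column or empty cell
                row[key] = default
        rules.append(dict(row))
    return rules


# Dummy one-rule file; the SMARTS is a placeholder, not a real RetroRules rule.
sample = "Rule_ID\tReaction_ID\tRule_SMARTS\nRR1\tMNXR1\t[C:1]>>[C:1]\n"
print(read_rules(io.StringIO(sample))[0]["Rule_usage"])
```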

MCTS improvements currently implemented.

minimal number of visits per child
using transposition tables (uses too much memory for some reason).
use RAVE
progressive bias (biasing exploration by giving initial values to nodes)
progressive widening: allow number of children roughly proportional to number of visits of the node
virtual visits: give a number of virtual visits to avoid too much stochasticity in initial evaluations
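
Two items from the list above, progressive widening and the minimal number of visits per child, can be illustrated with a minimal sketch. The function names, the `ratio` coefficient and the dictionary-based children are assumptions made for this example; the actual implementation lives in MCTS_node and may differ.

```python
import math


def allowed_children(visits, ratio=0.5, minimum=1):
    """Progressive widening: the child budget grows with the node's visits.

    `ratio` and `minimum` are hypothetical parameters for this sketch.
    """
    return max(minimum, math.ceil(ratio * visits))


def pick_under_visited(children, minimal_visit_counts):
    """Return the first child with fewer visits than the minimum, else None."""
    for child in children:
        if child["visits"] < minimal_visit_counts:
            return child
    return None
```

In a selection step, one could first call `pick_under_visited` and only fall back to the UCT policy when it returns None.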

Biosensor working example

We expect one result from this search.

python change_config.py --DB_CACHE True --DB_time 0  --use_cache True --add_Hs True --biosensor True

python Tree.py  \
    --log_file tree.log \
    --itermax 1000  \
    --expansion_width 20 \
    --time_budget 7200 \
    --max_depth 2 \
    --UCT_policy Biochemical_UCT_1 \
    --UCTK 20 \
    --bias_k 0 \
    --k_rave 50 \
    --Rollout_policy Rollout_policy_random_uniform_on_biochemical_multiplication_score \
    --max_rollout 3 \
    --chemical_scoring SubandprodChemicalScorer \
    --virtual_visits 0 \
    --progressive_bias_strategy max_reward  \
    --diameter 10 12 14 16 \
    --c_name pipecolate \
    --c_inchi "InChI=1S/C6H11NO2/c8-6(9)5-3-1-2-4-7-5/h5,7H,1-4H2,(H,8,9)" \
    --folder_to_save pipecolate \
    --EC_filter 1.5.3.7 1.5.3 \
    --biological_score_cut_off 0.1  \
    --substrate_only_score_cut_off 0.9 \
    --chemical_score_cut_off 0.9 \
    --minimal_visit_counts 1

Various remarks

Best move selection:

using a multiplication of biological score
and chemical score based on similarity:
- similarity towards initial compound to order the rules that will be tested
- similarity also to products for real move ordering after rule has been applied.
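
The ranking by multiplication of biological and chemical scores can be sketched as follows. This is an illustration under assumptions: moves are represented as plain dictionaries, the function name is hypothetical, and the cut-offs are applied as "greater than or equal", which may not match the exact comparison used in the code.

```python
def rank_moves(moves, biological_cut_off=0.1, chemical_cut_off=0.7):
    """Filter moves by cut-offs, then sort by biological * chemical score."""
    kept = [
        move for move in moves
        if move["biological_score"] >= biological_cut_off
        and move["chemical_score"] >= chemical_cut_off
    ]
    return sorted(
        kept,
        key=lambda move: move["biological_score"] * move["chemical_score"],
        reverse=True,
    )


moves = [
    {"id": "a", "biological_score": 0.9, "chemical_score": 0.8},
    {"id": "b", "biological_score": 0.5, "chemical_score": 1.0},
    {"id": "c", "biological_score": 0.05, "chemical_score": 1.0},
]
print([move["id"] for move in rank_moves(moves)])  # ['a', 'b']
```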

Standardisation:

When loading sink and source compounds, will go through 'heavy' standardisation
The rest (within the tree search) will go through normal standardisation.

retropathrl's Issues

FileNotFoundError:full_rules_forward_H.pkl

When I run the Testing and Command line examples I find that a file called full_rules_forward_H.pkl is always missing.
FileNotFoundError: [Errno 2] No such file or directory: '/home/madeline/MCTS/RetroPathRL/data/rules/full_rules_forward_H.pkl'

Sculpting the available pathway

This is not an issue, but a feature request/question:

I want to block certain pathways, so that I can remove certain intermediates from the sinks. Say I want to avoid glycerol,glucose,NADH,etc. from contributing to the sinks, but I don't want to block them from acting as an intermediate. How can I do that?

What I've done so far was to edit one of the sink files and remove the offending sink chemicals. Then edit the calculate_organisms.py, Tree.py and organisms.py files slightly and add this new set of sinks as a new sink/organism so it can generate its pickles (What IS that??) and go about its business. Is this the right way to do it? If something is not in the sinks, can it still be an intermediate of a pathway? Is there a better way to do this?

Bonus questions: How can I bias the moves? I can't find much about the toxicity bias and how it's applied? Am I missing something in the papers/supplements?

Again, this has a lot of potential owing to its flexibility. Thank you!

Products in reaction nodes

Hi,

I'm looking at some of the pathways generated from a RetroPath RL run. I can see all substrates and their stoichiometry in reaction nodes. The product side isn't as clear. Only one product is linked to the node, and that's usually not enough to have a balanced reaction for use in downstream analysis. Is there a way to get complete reaction definitions, or am I missing a point here?

Thanks

Is it possible to use your tools in reverse?

Hello!
Thanks for developing such an amazing tool!
As I understand it, your software is usually used to screen for pathways that synthesize a given product.
I have been trying to use it to find the pathways by which E. coli consumes estrogens.
I did the following:

added the estrogen information to the sink file (iJO1366.csv)
used both the rule file from the RetroPath tutorial_data and retrorules_rr02_rp2_flat_all.csv
modified source.csv by adding succinate to it (assuming the estrogen will eventually enter the TCA cycle)

I think there is a big problem with this approach, because the sink is likely to generate succinate from other compounds it contains.
Is it possible to do this with RetroPath? I would be grateful for any suggestions. Thank you!

Errors in Set-up data files

Hello,

First of all, thanks for this amazing tool. I'm getting errors when I run python calculate_organisms.py. Previously, I downloaded the rules from https://retrorules.org/dl and did not have any problem. The errors look like this:

Then, when I run pytest -v I get 44 failed and 67 passed tests. I suppose this is caused by the errors in the image above. Something strange is that I do not have the folder 'organisms' in 'data'.

Thanks in advance.

Missing documentation file

document_all_options.md is mentioned in README.md but not available in the repository. Would be great to have it as there are a lot of parameters that I'm not familiar with.

rules_rall.tsv

Hi,

I was going through the instructions and came across two absolute paths in "Set-up data files" part:

/mnt/hdd/mkoch/data/retrorules_11_06_19/mnx_20190524/nostereo_hs/rule/aroaam_final/rules_rall.tsv
/mnt/hdd/mkoch/data/retrorules_11_06_19/mnx_20190524/nostereo_nohs/rule/aroaam_final/rules_rall.tsv

I assume the "RetroPath RL" versions from https://retrorules.org/dl work for this step.

some errors happened when run the command 'pytest -v'

Hi, @tduigou
When I tried to run the command 'pytest -v', I got the following failures:
=============================================== short test summary info ===============================================
FAILED tests/test_MCTS_node.py::TestMCTSNode::test_proper_root_initialisation - AttributeError: module 'signal' has n...
FAILED tests/test_MCTS_node.py::TestMCTSNode::test_terminal_state_false - AttributeError: module 'signal' has no attr...
FAILED tests/test_MCTS_node.py::TestMCTSNode::test_terminal_state_true_unavailable - AttributeError: module 'signal' ...
FAILED tests/test_MCTS_node.py::TestMCTSNode::test_proper_child_addition_821_rule_94682_moves - AssertionError: asser...
FAILED tests/test_MCTS_node.py::TestMCTSNode::test_proper_child_addition_821_rule_94682_moves_state - UnboundLocalErr...
FAILED tests/test_MCTS_node.py::TestMCTSNode::test_proper_child_addition_821_rule_117465_moves - UnboundLocalError: l...
FAILED tests/test_MCTS_node.py::TestMCTSNode::test_expand_821 - UnboundLocalError: local variable 'i_94682' reference...
FAILED tests/test_MCTS_node.py::TestMCTSNode::test_expand_90191 - IndexError: list index out of range
FAILED tests/test_MCTS_node.py::TestMCTSNode::test_node_expanded - assert [] != []
FAILED tests/test_MCTS_node.py::TestMCTSNode::test_node_rollout_possible - assert |NBBJYMSMWIIQGU-UHFFFAOYSA-N| != |N...
FAILED tests/test_cli.py::test_run_with_timeout - AttributeError: module 'signal' has no attribute 'SIGKILL'
FAILED tests/test_compound.py::TestCompound::test_apply_transformation_sets - assert (False)
FAILED tests/test_compound.py::TestCompound::test_apply_transformation_sets_parallel - assert (False)
FAILED tests/test_compound.py::TestCompound::test_apply_transformation_sets_different_products - IndexError: list ind...
FAILED tests/test_compound.py::TestCompound::test_apply_transformation_sets_different_products_parallel - IndexError:...
FAILED tests/test_compound.py::TestCompound::test_similarity_substrate_and_product_equal_1 - IndexError: list index o...
FAILED tests/test_compound.py::TestCompound::test_similarity_substrate_and_product_different_1 - IndexError: list ind...
FAILED tests/test_compound.py::TestCompound::test_stoechiometry_3 - AttributeError: module 'signal' has no attribute ...
FAILED tests/test_compound.py::TestCompound::test_rule_with_timeout - AttributeError: module 'signal' has no attribut...
FAILED tests/test_state.py::TestState::test_GetMoves - AssertionError: assert set() == {'MNXR103108_...13_MNXM90191'}
FAILED tests/test_state.py::TestState::test_ApplyMoves - UnboundLocalError: local variable 'i' referenced before assi...
====================================== 21 failed, 98 passed in 134.91s (0:02:14) ======================================

Before this, I had modified the code as you committed in response to the issues. I have also installed the MongoDB cache from the brsynth/rp3_cache depot.

Did I miss a step when following README.md?
Thanks so much!

ECFP of the first substrate only taken into account in calculate_rule_sets_similarity.py

Bug spotted thanks to Esther Heid: the merge_rule_characteristics function of calculate_rule_sets_similarity.py merges only identifiers and not the ECFPs, which in the end results in only the first substrate ECFP being considered.

This should not be the case: ECFPs of all compounds involved in similar rules should be merged, so that all can be investigated when computing the chemical score -- by comparing the actual transformation with the template reactions used to generate the rules.

Reminder: the chemical score is designed to return the maximum score of all those comparisons.

Unable to set up repo because of missing rule files

Good afternoon.

I have gone through the installation guide. Namely, I have run:

conda create --name MCTS python=3.6
conda activate MCTS
conda install --channel rdkit rdkit=2019.03.1.0
conda install pytest
conda install pyyaml

pip install -e .

But when I want to run:

python calculate_rule_sets_similarity.py --rule_address_with_H your_rule_address --rule_address_without_H your_rule_address

It gives me an error because your_rule_address, as the help manual for the function indicates, is the rule file with(out) hydrogen that will be used. What is this file? Are these the chemical rules that RetroPathRL uses? If so, am I supposed to get them myself? It is not very clear how to proceed. I would appreciate it if somebody could lend me a hand here.

captain@admin?

This looks like a cool and flexible package. Thank you. I can get it to run without the DB fine, but when I turn it on, I get an "authentication failed" error, and the MongoDB service is telling me that a user "captain@admin" is trying to connect. Does this ring any bells? Is it my problem, or is there maybe a hard-coded user/domain pair in the code?

ERROR rp3_dcache/test/test_Manager.py

Hi,
I tried to install the package today and run pytest -v. The tests failed to import the modules 'rp3_dcache.Manager' and 'rp3_dcache.Utils', although I have cloned https://github.com/brsynth/rp3_dcache.git.
How can I solve this error?
============================= test session starts ==============================
platform linux -- Python 3.6.13, pytest-6.2.4, py-1.11.0, pluggy-0.13.1 -- /home/ncku2/anaconda3/envs/RetropathRL/bin/python
cachedir: .pytest_cache
rootdir: /home/ncku2/jjj/RetropathRL/RetroPathRL
collected 111 items / 2 errors / 109 selected

==================================== ERRORS ====================================
_______________ ERROR collecting rp3_dcache/test/test_Manager.py _______________
ImportError while importing test module '/mypath/RetroPathRL/rp3_dcache/test/test_Manager.py'.
Hint: make sure your test modules/packages have valid Python names.
Traceback:
../../../anaconda3/envs/RetropathRL/lib/python3.6/importlib/__init__.py:126: in import_module
return _bootstrap._gcd_import(name[level:], package, level)
rp3_dcache/test/test_Manager.py:2: in <module>
from rp3_dcache.Manager import Manager
E ModuleNotFoundError: No module named 'rp3_dcache.Manager'
________________ ERROR collecting rp3_dcache/test/test_Utils.py ________________
ImportError while importing test module '/mypath/RetroPathRL/rp3_dcache/test/test_Utils.py'.
Hint: make sure your test modules/packages have valid Python names.
Traceback:
../../../anaconda3/envs/RetropathRL/lib/python3.6/importlib/__init__.py:126: in import_module
return _bootstrap._gcd_import(name[level:], package, level)
rp3_dcache/test/test_Utils.py:4: in <module>
from rp3_dcache.Utils import default_config, make_document_id, as_document, rdmols_from_document
E ModuleNotFoundError: No module named 'rp3_dcache.Utils'
=========================== short test summary info ============================
ERROR rp3_dcache/test/test_Manager.py
ERROR rp3_dcache/test/test_Utils.py
!!!!!!!!!!!!!!!!!!! Interrupted: 2 errors during collection !!!!!!!!!!!!!!!!!!!!
============================== 2 errors in 2.82s ===============================

python calculate_rule_sets_similarity.py

Hello,

I am not sure what to do at this step:
python calculate_rule_sets_similarity.py --rule_address_with_H your_rule_address --rule_address_without_H your_rule_address

Error code:
By default, logs are saved in /Users/armk/Documents/Firmenich/retropathrl/RetroPathRL/data/rules/logs_rules_set_up.log. Please use --terminal to redirect to sys.stderr
Traceback (most recent call last):
File "calculate_rule_sets_similarity.py", line 632, in <module>
run(rule_address_with_H = args.rule_address_with_H, rule_address_without_H = args.rule_address_without_H, rm_stereo = not args.stereo)
File "calculate_rule_sets_similarity.py", line 153, in run
with open(rule_address_with_H, "r") as csv_file:
FileNotFoundError: [Errno 2] No such file or directory: 'your_rule_address'

Uncaught TypeError: list_stoechiometry is None instead of []

Hi @tduigou,

Got this uncaught exception while doing some tests:

Traceback (most recent call last):
  File "../RetroPathRL/Tree.py", line 1584, in <module>
    __cli()
  File "../RetroPathRL/Tree.py", line 1456, in __cli
    pathway_scoring=args.pathway_scoring)
  File "/home/retropath/Documents/RetroPathRL/Tree.py", line 288, in __init__
    use_RAVE=use_RAVE)
  File "/home/retropath/Documents/RetroPathRL/MCTS_node.py", line 126, in __init__
    self.moves = self.state.GetMoves(top_x = self.expansion_width, chemical_score = chemical_score)
  File "/home/retropath/Documents/RetroPathRL/chemical_compounds_state.py", line 413, in GetMoves
    extension = extension)
  File "/home/retropath/Documents/RetroPathRL/compound.py", line 817, in obtain_applicable_transformation_with_move
    list_stoechiometry = list_stoechiometry)
  File "/home/retropath/Documents/RetroPathRL/compound.py", line 144, in _moves_from_rdmols
    assert len(list_of_moves) == len(list_stoechiometry)
TypeError: object of type 'NoneType' has no len()

By playing around with prints, I saw that list_stoechiometry was set to None for whatever reason instead of [] as it is obviously expected from the code. I am not sure how bad it is though.

It seems that it was stored like that in the (MongoDB) database since the list_stoechiometry value is inherited from _ask_DB()... but I am not sure whether having a value for rdmols_from_DB and not for list_stoechiometry is expected behavior at line 73 in compound.py.

Important: I installed the MongoDB cache from brsynth/rp3_cache depot, but I had to modify RetroPathRL's config.py to use the package as it is expecting a "rp3_cache" and the actual name is "rp3_dcache" (which is another issue).

My ugly fix is to correct the type in _moves_from_rdmols() line 93:

...
    if clean_up:
        list_stoechiometry= []
    elif list_stoechiometry is None:  # for whatever reason
        list_stoechiometry = []
...

HTH, ++

"WRITE" placeholder in README

Hello @tduigou,

There are some "WRITE" strings in the README which are, I imagine, placeholders for links or something similar.

Also, what is the development status of RP3? Would you be interested in a list of the things I find counter-intuitive during installation and usage? Reply by private message if you prefer.

Thanks!

Pip install step

Hello,

I am stuck on this step (Error code below). Where do I find the root dir to run this?

Thank you!

After cloning this Git repo, please run

pip install -e .

ERROR: File "setup.py" or "setup.cfg" not found. Directory cannot be installed in editable mode:
