jamson-zhong / graph2edits
License: MIT License
In utils/generate_edits.py, line 115, (a1, a2) is sorted when generating leaving groups, and the attaching ligand may be detected only when a1 not in atoms_only_in_react and a2 in atoms_only_in_react. Is it possible that a2 not in atoms_only_in_react and a1 in atoms_only_in_react? A case:
>>> smi = '[C:2][C:1][C:3][C:4][C:5]>>[C:1][C:3][C:4][C:5]'
>>> generate_reaction_edits(smi)
ReactionData(rxn_smi='[C:1]([C:2])[C:3][C:4][C:5]>>[C:1][C:3][C:4][C:5]', edits=[('Attaching LG', '*[C]'), 'Terminate'], edits_atom=[1], rxn_class=None, rxn_id=None)
>>> smi = '[C:1][C:2][C:3][C:4][C:5]>>[C:2][C:3][C:4][C:5]'
>>> generate_reaction_edits(smi)
ReactionData(rxn_smi='[C:1][C:2][C:3][C:4][C:5]>>[C:2][C:3][C:4][C:5]', edits=['Terminate'], edits_atom=[], rxn_class=None, rxn_id=None)
I think these two reactions are the same, and the edits should not depend on the atom-map numbers.
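Checking both orderings would make the detection independent of how (a1, a2) happens to be sorted; a minimal sketch of the symmetric condition (the helper name and set argument are hypothetical, not the repo's actual API):

```python
def attaching_lg_atom(a1, a2, atoms_only_in_react):
    """Return the product-side atom of a bond to a leaving group,
    checking both orderings of (a1, a2); None if neither atom is a
    reactant-only atom. Hypothetical helper illustrating the fix."""
    if a1 not in atoms_only_in_react and a2 in atoms_only_in_react:
        return a1
    if a2 not in atoms_only_in_react and a1 in atoms_only_in_react:
        return a2
    return None
```

With this check, swapping the map numbers of the two atoms no longer changes whether the attaching leaving group is detected.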
Another problem concerns atoms that appear only in the product. An error is raised at line 90 or line 97. (Detaching a ligand?)
>>> smi = '[C:1][C:2][C:3][C:4][C:5]>>[C:6][C:2][C:3][C:4][C:5]'
>>> generate_reaction_edits(smi)
In the article, the author said: "The same procedure was used to build the edits vocabulary on USPTO-full dataset and the difference is that the edits Attach LG must appear at least 50 times in the training set of USPTO-full before it will be collected into the vocabulary. This edits vocabulary include 6 bond edits, 336 atom edits (8 Change Atom and 328 Attach LG), and a termination symbol."
Apart from preprocessing the training data, will test reactions whose leaving group is not in the vocabulary be deleted?
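For context, the 50-occurrence cutoff described in the quoted passage can be sketched as a simple frequency filter (the function name and the edit-tuple layout are assumptions, not the repo's code):

```python
from collections import Counter

def build_lg_vocab(train_edits, min_count=50):
    """Collect leaving groups from ('Attaching LG', smiles) edit tuples,
    keeping only those seen at least min_count times in the training set,
    as the paper describes for USPTO-full. Hypothetical sketch."""
    counts = Counter(lg for action, lg in train_edits
                     if action == 'Attaching LG')
    return {lg for lg, n in counts.items() if n >= min_count}
```

Any test-set reaction whose leaving group falls outside this set could not be predicted exactly, which is why the handling of such reactions matters.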
train_data = train_dataset.loader(
    batch_size=1, num_workers=args['num_workers'], shuffle=True)
I want to modify the batch size, but I encountered an error. Additionally, I've noticed that memory usage is quite low when using the new version of torch. Do you have any idea why this is happening? Thank you for your assistance.
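One common cause of errors when raising batch_size above 1 is PyTorch's default collate trying to stack variable-size graph tensors. If that is the failure here (an assumption, since the traceback isn't shown), a pass-through collate_fn is a minimal workaround:

```python
def graph_collate(batch):
    """Return the samples unchanged as a list, so variable-size graph
    samples are not stacked into one tensor (stacking raises a size
    mismatch under torch's default collate)."""
    return list(batch)

# Hypothetical usage with torch.utils.data.DataLoader:
# loader = DataLoader(train_dataset, batch_size=32,
#                     num_workers=args['num_workers'], shuffle=True,
#                     collate_fn=graph_collate)
```

The model's forward pass would then need to iterate over the list (or batch the graphs itself), which is a design choice the repo may already make differently.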
I appreciate the insightful work, but I tried to reproduce the top-k evaluation on the USPTO-50k dataset on an RTX 4090. The commands I used are:
python eval.py --experiments 27-06-2022--10-27-22
python eval.py --use_rxn_class --experiments 30-06-2022--00-19-29
the results performed by the trained model you've released are listed below.
| Top1(%) | Top3(%) | Top5(%) | Top10(%) | Comment |
| --- | --- | --- | --- | --- |
| 45.2 | 64.4 | 70.4 | 77.5 | RXN unknown |
| 55.0 | 72.8 | 77.4 | 80.4 | RXN known |
I found no difference in the hyper-parameter settings compared to your original paper. So I wonder if something is wrong in my reproduction procedure that causes the obvious gap between the results above and those in the paper?
Thanks for releasing your code and model! I'm interested in trying it out, but I couldn't find a LICENSE file. Could you add one? Ideally the MIT License, which is perhaps the most common choice for similar repositories nowadays.
I have some target compounds (SMILES), but I have no idea how to use your tool to break them down into reactants. Any help would be appreciated.