graph2edits's People

Contributors

jamson-zhong

graph2edits's Issues

Questions about generate edits

In utils/generate_edits.py, line 115, a1 and a2 are sorted when generating leaving groups, and the attaching ligand (the 'Attaching LG' edit) is detected only when a1 is not in atoms_only_in_react and a2 is in atoms_only_in_react. Is it possible that a2 is not in atoms_only_in_react while a1 is? An example:

>>> smi = '[C:2][C:1][C:3][C:4][C:5]>>[C:1][C:3][C:4][C:5]'
>>> generate_reaction_edits(smi)
ReactionData(rxn_smi='[C:1]([C:2])[C:3][C:4][C:5]>>[C:1][C:3][C:4][C:5]', edits=[('Attaching LG', '*[C]'), 'Terminate'], edits_atom=[1], rxn_class=None, rxn_id=None)

>>> smi = '[C:1][C:2][C:3][C:4][C:5]>>[C:2][C:3][C:4][C:5]'
>>> generate_reaction_edits(smi)
ReactionData(rxn_smi='[C:1][C:2][C:3][C:4][C:5]>>[C:2][C:3][C:4][C:5]', edits=['Terminate'], edits_atom=[], rxn_class=None, rxn_id=None)

I think these two reactions are the same, so the generated edits should not depend on the atom-map numbering.
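The asymmetry described above can be reproduced without the repository code. The sketch below is only an illustration: both function names are hypothetical, bonds are given as plain pairs of atom-map numbers, and atoms_only_in_react is passed in as a set. It assumes the original check looks at a single orientation after sorting, as described in this issue, and compares that with a check that accepts either orientation.

```python
def boundary_bonds_one_sided(bonds, atoms_only_in_react):
    """Hypothetical re-creation of the reported check: sort each bond,
    then accept it only if the lower-numbered atom is kept and the
    higher-numbered atom belongs to the leaving group."""
    found = []
    for bond in bonds:
        a1, a2 = sorted(bond)
        if a1 not in atoms_only_in_react and a2 in atoms_only_in_react:
            found.append((a1, a2))
    return found


def boundary_bonds_symmetric(bonds, atoms_only_in_react):
    """Accept a bond whenever exactly one end is reactant-only,
    returning it as (kept_atom, leaving_atom) regardless of numbering."""
    found = []
    for a1, a2 in bonds:
        in1 = a1 in atoms_only_in_react
        in2 = a2 in atoms_only_in_react
        if in1 != in2:
            found.append((a2, a1) if in1 else (a1, a2))
    return found


# Bonds of the two reactant chains from the examples above.
rxn_a_bonds = [(2, 1), (1, 3), (3, 4), (4, 5)]  # [C:2][C:1][C:3][C:4][C:5]
rxn_b_bonds = [(1, 2), (2, 3), (3, 4), (4, 5)]  # [C:1][C:2][C:3][C:4][C:5]

print(boundary_bonds_one_sided(rxn_a_bonds, {2}))   # leaving atom has the higher map number
print(boundary_bonds_one_sided(rxn_b_bonds, {1}))   # leaving atom has the lower map number
print(boundary_bonds_symmetric(rxn_a_bonds, {2}))
print(boundary_bonds_symmetric(rxn_b_bonds, {1}))
```

With the one-sided check, the second reaction yields no attachment point, which matches the missing 'Attaching LG' edit in the second output above. The symmetric variant finds one attachment point in both cases.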

Another problem concerns atoms that appear only in the products. An error is raised at line 90 or line 97. (Detaching ligand?)

>>> smi = '[C:1][C:2][C:3][C:4][C:5]>>[C:6][C:2][C:3][C:4][C:5]'
>>> generate_reaction_edits(smi)
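One way to catch such inputs before calling generate_reaction_edits is to compare the atom-map numbers on both sides of the reaction. The following is a minimal sketch, not part of the repository: unmatched_map_numbers is a hypothetical helper, and it extracts map numbers with a regular expression instead of a chemistry toolkit, which assumes every mapped atom is written in the bracketed `[X:n]` form.

```python
import re

# Matches the map number in bracket atoms such as [C:12].
MAP_NUM = re.compile(r":(\d+)\]")


def unmatched_map_numbers(rxn_smi):
    """Return (only_in_reactants, only_in_products) as sets of map numbers.

    Works for both 'reactants>>products' and 'reactants>reagents>products'.
    """
    parts = rxn_smi.split(">")
    reactants, products = parts[0], parts[-1]
    react_maps = {int(m) for m in MAP_NUM.findall(reactants)}
    prod_maps = {int(m) for m in MAP_NUM.findall(products)}
    return react_maps - prod_maps, prod_maps - react_maps


smi = '[C:1][C:2][C:3][C:4][C:5]>>[C:6][C:2][C:3][C:4][C:5]'
only_react, only_prod = unmatched_map_numbers(smi)
if only_prod:
    print(f"Map numbers present only in the products: {sorted(only_prod)}")
```

For the reaction above, map number 6 appears only in the product, so a caller could skip or log the reaction instead of letting the edit generation fail.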

Questions about USPTO-full preparation

In the article, the authors state: "The same procedure was used to build the edits vocabulary on USPTO-full dataset and the difference is that the edits Attach LG must appear at least 50 times in the training set of USPTO-full before it will be collected into the vocabulary. This edits vocabulary include 6 bond edits, 336 atom edits (8 Change Atom and 328 Attach LG), and a termination symbol."

Besides filtering the training data, will test reactions whose leaving group is not in the vocabulary also be removed?
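To make the question concrete, the frequency threshold described in the quoted passage could look roughly like the sketch below. This is an assumption-laden illustration, not the authors' preprocessing: build_edit_vocab and uses_only_known_edits are hypothetical names, and edits are represented the way generate_reaction_edits prints them (tuples such as ('Attaching LG', '*[C]') and the string 'Terminate').

```python
from collections import Counter


def build_edit_vocab(edit_sequences, min_lg_count=50):
    """Collect all edits, keeping an 'Attaching LG' edit only if it
    occurs at least min_lg_count times across the training sequences."""
    lg_counts = Counter()
    other_edits = []
    for seq in edit_sequences:
        for edit in seq:
            if isinstance(edit, tuple) and edit[0] == "Attaching LG":
                lg_counts[edit] += 1
            elif edit not in other_edits:
                other_edits.append(edit)
    frequent_lgs = [e for e, c in lg_counts.items() if c >= min_lg_count]
    return other_edits + frequent_lgs


def uses_only_known_edits(seq, vocab):
    """True if every edit of a reaction is contained in the vocabulary."""
    known = set(vocab)
    return all(edit in known for edit in seq)


# Toy data with a low threshold so the effect is visible.
train = [[("Attaching LG", "*[C]"), "Terminate"]] * 3
train.append([("Attaching LG", "*[O]"), "Terminate"])
vocab = build_edit_vocab(train, min_lg_count=2)

test_seq = [("Attaching LG", "*[O]"), "Terminate"]
print(uses_only_known_edits(test_seq, vocab))
```

Here the rare leaving group `*[O]` is excluded from the vocabulary, so the test sequence contains an edit the model can never predict. Whether such test reactions are dropped or kept and counted as failures is exactly what this issue asks.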

How to increase the batch size

train_data = train_dataset.loader(
    batch_size=1, num_workers=args['num_workers'], shuffle=True)

I want to modify the batch size, but I encountered an error. Additionally, I've noticed that the memory usage is quite low when using the new version of torch. Do you have any idea why this is happening? Thank you for your assistance.

Question about the reproduced results

I appreciate the insightful work. I tried to reproduce the top-k evaluation on the USPTO-50k dataset on an RTX 4090. The commands I used are:

python eval.py --experiments 27-06-2022--10-27-22 
python eval.py --use_rxn_class --experiments 30-06-2022--00-19-29 

The results obtained with the trained models you released are listed below.

| Top1(%) | Top3(%) | Top5(%) | Top10(%) | Comment |
| --- | --- | --- | --- | --- |
| 45.2 | 64.4 | 70.4 | 77.5 | RXN known |
| 55.0 | 72.8 | 77.4 | 80.4 | RXN known |

I found no difference in the hyperparameter settings compared to your original paper. Is there something wrong in my reproduction procedure that causes the obvious gap between the results above and those in the paper?
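For reference when comparing numbers, top-k exact-match accuracy is commonly computed along the lines of the sketch below. It is a generic illustration rather than the logic of eval.py: top_k_accuracy is a hypothetical helper, and it assumes predictions and targets are already canonicalized strings so that plain equality is a valid match test.

```python
def top_k_accuracy(predictions, targets, ks=(1, 3, 5, 10)):
    """Percentage of samples whose target appears among the first k
    ranked predictions, for each k in ks."""
    hits = {k: 0 for k in ks}
    for ranked, target in zip(predictions, targets):
        for k in ks:
            if target in ranked[:k]:
                hits[k] += 1
    n = len(targets)
    return {k: 100.0 * hits[k] / n for k in ks}


# Toy example: four samples, two ranked candidates each.
preds = [["A", "B"], ["C", "D"], ["E", "F"], ["G", "H"]]
targets = ["A", "D", "X", "G"]
print(top_k_accuracy(preds, targets, ks=(1, 2)))
```

Differences in canonicalization, in how invalid or duplicate predictions are handled, or in the beam size used can all shift these percentages, so they are worth checking when results do not match.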

Missing license

Thanks for releasing your code and model! I'm interested in trying it out, but I couldn't find a LICENSE file. Could you add one? Ideally the MIT License, which is perhaps the most common choice for similar repositories nowadays.
