caster's People

Contributors

kexinhuang12345

caster's Issues

How to generate unsup_datatest.csv

Hi, I think this project is great. I'm very interested in it and would like to ask you some questions.
(1) How is the unsup_dataset.csv file obtained? It is read directly in run_dde.py, but I don't know how it is generated, especially the index inside it. If I want to pretrain the model, are there any requirements on the amount of unlabeled data?
(2) If I want to get p(x) as shown in Fig. 4 of the paper, what code should I add to output it? Is it obtained in run_dde.py?
I am looking forward to your reply.

About 'training server setup' problem

Hi Kexin,

When I ran your code, I ran into some problems. A screenshot of the error is below:

[image: screenshot of the error]

My server has an Intel(R) Xeon(R) E5-2630 v4 2.20GHz CPU, 110 GB RAM and 2 TITAN Xp 12G GPUs. According to your paper, training used a server with 2 Intel Xeon E5-2670v2 2.5GHz CPUs, 128 GB RAM and 3 NVIDIA Tesla K80 GPUs. Is that configuration a hard requirement for running the experiments successfully? Can you give me some suggestions?

Thanks and regards.

Yujie

Label_Multi in dataset

Dear Kexin,
The DrugBank dataset you used has a column named Label_Multi. Could you please tell me the meaning and source of this column?

'unsup_dataset.csv' dataset problem

Hi Kexin,

'CASTER' is wonderful work. I am wondering if you could kindly provide the 'unsup_dataset.csv' dataset used in your code. It would help me re-implement your algorithm.

Thanks and regards.

Yujie

Some problems about the SMILES

Great work! We are also doing research on drug interactions.
First, in ESPF, the file chembl_seq.txt is not provided. I think it is a text file containing all the SMILES in your dataset, right?
Also, in the CASTER paper, the SMILES of the drug Melatonin is CC(=O)NCCC1=CNc2c1cc(OC)cc2, while in DrugBank it is COC1=CC=C2NC=C(CCNC(C)=O)C2=C1 (https://www.drugbank.ca/drugs/DB01065). What is the difference between these two SMILES? In fact, the DrugBank SMILES contain no lowercase atoms, so is it still useful to run the chemical sequential pattern mining algorithm on them?

How to get the scores as Fig.4 in the paper did?

Hi, Kexin.
I'm using CASTER for some interpretability work, and I want to know how you obtained the scores in Fig. 4.
My understanding is that the scores are code * 100, where code is the second output of model_nn:
recon, code, score, Z_f, z_D = model_nn(v_D)
To examine the scores for the interaction between Isosorbide Mononitrate and Sildenafil, I wrote the following code:

 
   import numpy as np
   import torch

   # smiles2vector, model_nn and device are defined elsewhere (not shown here).
   a = 'O=[N+]([O-])O[C@@H]1CO[C@@H]2[C@@H](O)CO[C@H]12'  # Isosorbide Mononitrate
   b = 'CCCc1nn(C)c2c(=O)[nH]c(-c3cc(S(=O)(=O)N4CCN(C)CC4)ccc3OCC)nc12'  # Sildenafil
   # Repeat the pair 32 times because I use 32 as the batch size.
   test = np.array([smiles2vector(a, b)] * 32)
   test = torch.from_numpy(test).to(device).float()
   test_recon, test_code, test_score, test_Z_f, test_z_D = model_nn(test)
   scores = test_code.detach().cpu().numpy() * 100

When I checked the scores, I found that the score for O=[N+]([O-]) is -2.7843778. (I loaded "model_train_checkpoint_SNAP_EarlyStopping_SemiSup_Full_Run1.pt".) It isn't the highest score among all the substructures, and it changes greatly as the model is trained (within a few iterations).
So how can I reproduce the result shown in Fig. 4 of the paper?
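
To check whether a given substructure really has the highest score, it can help to rank the coefficients explicitly. The following is only a minimal sketch and not the author's method: `top_k_substructures` is a hypothetical helper, and `vocabulary` is an assumed list of substructure strings aligned index by index with one row of the `scores` array above.

```python
def top_k_substructures(coefficients, vocabulary, k=5):
    """Return the k (substructure, coefficient) pairs with the largest coefficients.

    `coefficients` is one row of scores (a plain sequence of floats) and
    `vocabulary` is an assumed list of substructure strings in the same order.
    """
    if len(coefficients) != len(vocabulary):
        raise ValueError("coefficients and vocabulary must have the same length")
    # Sort pairs by coefficient, largest first, and keep the first k.
    ranked = sorted(zip(vocabulary, coefficients), key=lambda pair: pair[1], reverse=True)
    return ranked[:k]


# Toy values for illustration only; these are not real model outputs.
example = top_k_substructures([0.1, -2.78, 0.5], ["CC", "O=[N+]", "c1ccccc1"], k=2)
print(example)
```

Since every row in the batch is the same drug pair, passing a single row such as `scores[0].tolist()` would be enough.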

ask for help

Hello, I am very interested in this project and would like to ask you some questions. Should I preprocess the data with the ESPF project first and then use it for CASTER? Is model_pretrain_checkpoint_1.pt, which is loaded in run_dde.py, generated by that script, or is it generated by another script and only loaded there? If it is generated by another script, is the original script available? There is also an unsup_datatest.csv file in the data folder; is this the unlabeled dataset? If you see this question, please reply. Thank you very much!

Lack of Dataset

Dear Kexin,

The file "Run_Explainability_Models" reads the dataset "/data/deepDDI_small/fold2/df_ddi_train_val.csv", which is missing from the repository.

Could you upload the directory "/scratch/kh2383/DFI_data/data/deepDDI_small/"?

Thanks a lot.
Jam

Dataset request

Hi, I've read the CASTER paper and am interested in doing research in this field. I would like to ask whether the DrugBank dataset is available. Can I get your dataset? I look forward to your reply.

How to get the ROC-AUC on BIOSNAP?

Hi, Kexin.
I'm using CASTER for some research, and I want to know how you obtained the ROC-AUC on BIOSNAP. When I run the code, I get 0.964 rather than 0.910.
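
For comparing numbers like these, it may help to recompute ROC-AUC independently on the exact test split in question. The sketch below is not the repository's evaluation code; `roc_auc` is a hypothetical pure-Python helper that uses the pairwise definition (the fraction of positive/negative pairs ranked correctly, with ties counted as half), which is quadratic in the number of examples and meant only for sanity checks.

```python
def roc_auc(labels, scores):
    """Compute ROC-AUC as the fraction of (positive, negative) pairs ranked correctly."""
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("ROC-AUC needs at least one positive and one negative example")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0   # positive ranked above negative
            elif p == n:
                wins += 0.5   # ties count as half
    return wins / (len(positives) * len(negatives))


# Toy labels and scores for illustration only; not CASTER outputs.
print(roc_auc([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8]))  # → 0.75
```

If the value computed this way matches the one printed by the training script, any remaining gap to the paper's figure would have to come from something other than the metric itself, such as the data split or the checkpoint used.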
