
gram's People

Contributors

chadyuu, mp2893

gram's Issues

Empty level1.pk when working with the new version of MIMIC

Hi,
I have been trying to run your code following the instructions (though working with the ICD9 hierarchy rather than CCS, which I am sure works fine). However, it turns out that with the new version of MIMIC the generated level1.pk is empty, so I get errors from gram.py, since it assumes all levelX.pk files are non-empty when constructing the tree and the attention model. Could you please help me with this? It seems like it could be a common case in EHR data, where no higher-level ICD9 codes are assigned to the patients.
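
In case it helps, this is the small check I used to see which levelX.pk files come out empty (my own snippet, not from the repo; the prefix in the example is just a placeholder):

# Inspect the levelX.pk files produced by build_trees.py and report which ones
# are empty. Assumes each file is a pickled dict mapping a leaf code to its
# list of ancestor indices.
import pickle

def report_tree_levels(tree_prefix, max_level=5):
    for level in range(1, max_level + 1):
        path = '%s.level%d.pk' % (tree_prefix, level)
        try:
            with open(path, 'rb') as f:
                tree = pickle.load(f)
        except IOError:
            print('%s: missing' % path)
            continue
        print('%s: %d entries' % (path, len(tree)))

# Example (placeholder prefix): report_tree_levels('output/mimic.seqs')
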
Thanks in advance

some question about level2.pk and ancestors

Hello, sorry to bother you.
Recently I have been studying your paper and source code, and I want to apply your model to other fields.
But since I am not a medical professional, I find it very difficult to interpret the CCS files and the .pk files generated by build_trees.py.
What I want to ask is: why does the inputSize of train_glove() in glove.py depend on the [0][1] element of the level2.pk file generated by build_trees.py, rather than on level1.pk or level3.pk, etc.?
My question may be a bit naive, but I would be very grateful if you could enlighten me!
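
To show where I am, here is a small sketch of how I currently read that part of the code (this is only my guess, and the file name is a placeholder):

# My guess: each entry of level2.pk is a path whose second element is the root
# node, and the root is assigned the largest integer index by build_trees.py,
# so rootCode + 1 equals the total number of nodes (leaf codes plus ancestors)
# that need an embedding, which is what glove.py uses as inputSize.
import pickle

def guess_input_size(level2_path='tree.level2.pk'):
    tree = pickle.load(open(level2_path, 'rb'))
    root_code = list(tree.values())[0][1]  # second element of any entry = root index (my assumption)
    return root_code + 1                   # embeddings for all leaves + ancestors
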

query about the num of ancestors

Hello, I am wondering how you deal with the fact that different children can have different numbers of ancestors. Is there a mask?
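
For context, this is roughly how I imagine the padding could work; the function and names below are my own illustration, not code from your repo:

# Pad ancestor lists to a common length and build a binary mask so that the
# padded positions can be ignored in the attention computation.
import numpy as np

def pad_ancestors(ancestor_lists, pad_value=0):
    max_len = max(len(a) for a in ancestor_lists)
    padded = np.full((len(ancestor_lists), max_len), pad_value, dtype='int32')
    mask = np.zeros((len(ancestor_lists), max_len), dtype='float32')
    for i, ancestors in enumerate(ancestor_lists):
        padded[i, :len(ancestors)] = ancestors
        mask[i, :len(ancestors)] = 1.0
    return padded, mask

# e.g. pad_ancestors([[0, 7, 15], [1, 7]]) -> padded code matrix and a 0/1 mask
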

Looking for your kind reply, thanks very much!

Dimensions not matching?

Hi Edward,

I'm trying to reproduce GRAM results using MIMIC-III data.
If I understand correctly, there are 4894 medical codes used to represent patient visits. So the G matrix (from the paper) has to be of size 4894 x 128 (embedding dimension). However, there are no matrices of that size stored as a result of running gram.py.

Am I missing something or am I supposed to be deriving the G matrix with the help of other stored files? I tried to do this too but the dimensions just don't seem to be matching. Any help will be highly appreciated.
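
To be concrete, this is roughly how I tried to reconstruct G from the basic embeddings and the attention weights, following my reading of the paper; all names here are mine, and I may well be misreading something:

# My understanding: each final representation g_i is an attention-weighted sum
# of the basic embeddings of code i and its ancestors. Leaves are assumed to be
# indexed 0..num_leaves-1, and attn_score stands in for the trained MLP.
import numpy as np

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

def build_G(E, ancestors_of, attn_score):
    # E: (num_nodes, emb_dim) basic embeddings of leaves + ancestors.
    # ancestors_of: dict leaf index -> list of node indices (leaf itself + its ancestors).
    # attn_score: callable (e_i, e_j) -> scalar compatibility score.
    num_leaves = len(ancestors_of)
    G = np.zeros((num_leaves, E.shape[1]))
    for i, nodes in ancestors_of.items():
        scores = np.array([attn_score(E[i], E[j]) for j in nodes])
        alpha = softmax(scores)
        G[i] = (alpha[:, None] * E[nodes]).sum(axis=0)
    return G
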

Thanks!

error running gram.py

Hello,

I followed all the previous steps, but when I run gram.py I get the following error:

Traceback (most recent call last):
  File "gram.py", line 406, in <module>
    numAncestors = get_rootCode(args.tree_file+'.level2.pk') - inputDimSize + 1
  File "gram.py", line 397, in get_rootCode
    rootCode = tree.values()[0][1]
IndexError: list index out of range

Do you have any idea why it's showing that? Any help would be really appreciated.
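
For reference, this is a small variant of get_rootCode I experimented with just to make the failure clearer; it is only my own guess at what is going on, not a proper fix:

# Make the failure explicit when the level2 file is empty, and index values()
# in a way that also works on Python 3.
import pickle

def get_rootCode(treeFile):
    tree = pickle.load(open(treeFile, 'rb'))
    if len(tree) == 0:
        raise ValueError('%s is empty; build_trees.py probably produced no level-2 entries' % treeFile)
    rootCode = list(tree.values())[0][1]
    return rootCode
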

Thanks.

Description of arguments

Hello Edward,
Can you give a description of the format of the arguments:

if __name__ == '__main__':
	seqFile = sys.argv[1]
	treeFile = sys.argv[2]
	labelFile = sys.argv[3]
	outPath = sys.argv[4]
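
For what it's worth, my current guess about the formats is sketched below (the toy file name and contents are purely hypothetical); please correct anything that is wrong:

# My current guess at the expected formats:
#   seqFile  : pickled list of patients; each patient is a list of visits;
#              each visit is a list of integer medical-code indices.
#   treeFile : the prefix passed to build_trees.py, so that the
#              treeFile + '.levelX.pk' files can be found.
#   labelFile: same nested structure as seqFile, but holding the label codes
#              (e.g. CCS grouper codes) to be predicted.
#   outPath  : prefix under which the trained model files are written.
import pickle

# Tiny hypothetical example of what I think seqFile contains:
seqs = [
    [[0, 3, 7], [2, 7]],      # patient 0: two visits
    [[1], [4, 5], [6, 0]],    # patient 1: three visits
]
pickle.dump(seqs, open('toy.seqs', 'wb'), -1)
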

Label

Hi, I have noticed that your training labels and training features are the same. Since the goal is to predict the codes of the next visit, I would expect the training labels to be shifted one visit later than the training features.
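
To illustrate what I expected instead, here is my own sketch (not code from your repo) of shifting the labels one visit later:

# Features are visits 1..T-1 and labels are visits 2..T for each patient,
# i.e. the label of each visit is the set of codes of the following visit.
def shift_labels(seqs):
    features, labels = [], []
    for patient in seqs:
        if len(patient) < 2:
            continue                      # need at least two visits
        features.append(patient[:-1])     # all visits except the last
        labels.append(patient[1:])        # the "next visit" for each feature visit
    return features, labels
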

Domain knowledge graph issue?

Dear Choi,
I really like your concept of a graph-based attention model for healthcare, and I am trying to do some work around it.
First things first, I am stuck on the generated graph. I tried using MIMIC as a demo, but the generated graph shows what seems to be an odd example: one patient with a cancer ICD9 code was mapped to infectious and parasitic diseases (PID 124, old types 186, new types 50, ICD9 codes V1011 and 101). Given my limited knowledge, I don't think this can be correct; did I miss anything?
Also, please let me know if you cannot reproduce it.
Secondly, I ported your implementation to TensorFlow and it runs successfully; maybe later I will open a pull request so you can check the code.
Looking forward to hearing from you.
Regards,
Shen

How to calculate accuracy@20 in each frequency group?

Hi Choi,
I have some questions about how to calculate the accuracy@k score for each frequency group. I am not sure which of the following two interpretations is right:
1. For each frequency group, select that group's top-20 scores, compare them with the real labels, and calculate accuracy@20 for the group individually.
2. Select the top-20 indices over all labels, determine which group each of the 20 indices belongs to, and compare them with the labels within each group.
I hope you can help me.
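
To make the second interpretation concrete, this is what I mean in code; all names here are my own, and I am not claiming this matches your implementation:

# Take the global top-20 predictions per visit, then count hits per frequency group.
import numpy as np

def accuracy_at_k_by_group(y_score, y_true_codes, group_of, num_groups=5, k=20):
    # y_score: (num_visits, num_labels) predicted scores.
    # y_true_codes: list of lists of true label indices per visit.
    # group_of: array mapping each label index to its frequency group (0..num_groups-1).
    hits = np.zeros(num_groups)
    totals = np.zeros(num_groups)
    for scores, true_codes in zip(y_score, y_true_codes):
        topk = set(np.argsort(scores)[-k:])
        for code in true_codes:
            g = group_of[code]
            totals[g] += 1
            hits[g] += 1 if code in topk else 0
    return hits / np.maximum(totals, 1)
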

Many thanks,
Oldpants

function arguments

I'm wondering why the code snippet for index in random.sample(range(n_batches), n_batches) in the train_GRAM function in gram.py passes two arguments, range(n_batches) and n_batches, instead of at most one as in the np.random.sample documentation. It's not working for me, at least. Any comment will be highly appreciated.
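
To show where my confusion comes from, here is the comparison I had in mind between the standard-library function and the NumPy one:

# The two functions I was mixing up (standard library vs. NumPy).
import random
import numpy as np

n_batches = 5

# Python's built-in random.sample(population, k): k unique elements without
# replacement; with k == len(population) it yields a random permutation.
print(random.sample(range(n_batches), n_batches))   # e.g. [3, 0, 4, 1, 2]

# numpy.random.sample(size) is unrelated: it draws floats uniformly from [0, 1).
print(np.random.sample(3))                           # e.g. [0.42, 0.77, 0.05]

# An equivalent NumPy way to get a shuffled batch order:
print(np.random.permutation(n_batches))
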

Low-frequency labels hard to predict

Hello Dr. Choi,

Thanks for the nice work. I generated the CCS single-level labels as the target and used your code to predict them. All hyperparameters are set according to the appendix. I group the labels into five groups according to their frequencies (first rank all labels by their frequencies, then divide them equally into five groups). But my results differ somewhat from those in the paper. I got [0, 0.01835, 0.0811, 0.3042, 0.8263] accuracies for the five groups, respectively. I noticed that I get higher accuracies for high-frequency labels, but cannot match the paper's accuracies for labels in the frequency percentile range [0-60]. Is there anything I have done wrong? Furthermore, I found that the frequency of labels in the first (rarest) group is only 0.16% of all labels' frequencies (163/96677). I am wondering whether this is the correct way to divide them into five groups.
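
For reference, this is exactly how I built the five groups, so the procedure itself can be checked (the names are mine):

# Count label frequencies, rank labels from rarest to most frequent, and split
# them into equally sized groups of labels.
import numpy as np

def build_frequency_groups(y_true_codes, num_labels, num_groups=5):
    counts = np.zeros(num_labels)
    for visit in y_true_codes:
        for code in visit:
            counts[code] += 1
    order = np.argsort(counts)                 # rarest labels first
    group_of = np.zeros(num_labels, dtype=int)
    for g, chunk in enumerate(np.array_split(order, num_groups)):
        group_of[chunk] = g                    # equally sized groups of labels
    return group_of, counts
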

Thanks,
Muhan

query about "def padMatrix"

Hi! I am wondering what the definition of "mask" is in the function padMatrix(...).
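
To clarify what I am asking, my guess is that the mask simply marks real (non-padded) visits, roughly like the sketch below; this is my own illustration, not your code:

# My guess: a (maxlen, n_patients) array with 1s for real visits and 0s for
# padded time steps, so padding can be ignored in the RNN and the loss.
import numpy as np

def pad_visits(seqs):
    lengths = [len(p) for p in seqs]
    maxlen = max(lengths)
    mask = np.zeros((maxlen, len(seqs)), dtype='float32')
    for i, l in enumerate(lengths):
        mask[:l, i] = 1.0
    return mask, lengths
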

Looking for your kind reply, thanks very much!

gradient with Theano

Thanks for your paper on medical prediction with GRAM.
I'm using your code to learn how it works, but I ran into the following problem:

WARNING (theano.tensor.blas): Using NumPy C-API based implementation for BLAS functions.
initializing parameters
loading data
building models
OrderedDict([('w', w), ('w_tilde', w_tilde), ('b', b), ('b_tilde', b_tilde)])
Traceback (most recent call last):
  File "glove.py", line 163, in <module>
    train_glove(infile, inputSize=inputDimSize, batchSize=batchSize, dimensionSize=embDimSize, maxEpochs=maxEpochs, outfile=outfile)
  File "glove.py", line 119, in train_glove
    grads = T.grad(cost, wrt=tparams.values)
  File "/usr/local/lib/python3.8/dist-packages/theano/gradient.py", line 501, in grad
    raise TypeError("Expected Variable, got " + str(elem) +
TypeError: Expected Variable, got <built-in method values of collections.OrderedDict object at 0x7f778bf1c1c0> of type <class 'builtin_function_or_method'>

That is the output I get when I run glove.py. I don't know why it happens.
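
To narrow it down, this tiny self-contained test of mine reproduces the same error and shows a variant that avoids it; I am not sure it is the intended fix:

# Passing the bound method tparams.values instead of the actual list of shared
# variables trips up T.grad; wrapping values() in list() avoids the TypeError.
from collections import OrderedDict
import numpy as np
import theano
import theano.tensor as T

w = theano.shared(np.zeros(3, dtype='float32'), name='w')
tparams = OrderedDict([('w', w)])
cost = (w ** 2).sum()

# grads = T.grad(cost, wrt=tparams.values)        # TypeError: Expected Variable, got <built-in method values ...>
grads = T.grad(cost, wrt=list(tparams.values()))  # works: a list of shared variables
print(grads)
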
Thank you for your attention.

Le Ngoc Duc.

comparison between med2vec and gram

Hello Ed,

Nice work!
I didn't pay much attention to this paper at the beginning, since you mentioned in the paper that this method works well when the dataset is small. So I thought Med2Vec would give better performance when we have a large dataset.

However, now that I look more closely at the paper, it seems that GRAM gives better performance than Med2Vec and non-negative skip-gram, as the t-SNE scatterplot for GRAM looks much better (the dots are well separated) compared to the other two methods.

On the other hand, the medical code vectors trained by GRAM are aligned with the given knowledge DAG, which is man-made and might not be ideal. As you mentioned in the Med2Vec paper: "the degree of conformity of the code representations to the groupers does not necessarily indicate how well the code representations capture the hidden relationships".

I wonder how you would compare these two (or three, if you count non-negative skip-gram) vector learning methods, given a large enough dataset?

Thanks!
xianlong

Null Level two

Excuse me, just a quick question. The ".level2.pk" dictionary is always empty after running build_trees.py following the instructions without any modification. Has anyone encountered a similar problem, or is there a mistake that causes it?

Needing the code for calculating metrics

Could you please upload your code for calculating the metric "accuracy@k" given y_hat and y?

I am not familiar with Theano, and I would appreciate it if you could upload the corresponding code so that I can run some comparative experiments based on it.
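
In the meantime, this is the simple stand-in I wrote myself; I am not sure it matches the paper's exact definition:

# For each visit, count how many of the true codes appear among the top-k
# predicted scores, and average over all true codes.
import numpy as np

def accuracy_at_k(y_hat, y_true_codes, k=20):
    # y_hat: (num_visits, num_labels) scores; y_true_codes: list of lists of true label indices.
    hit, total = 0, 0
    for scores, true_codes in zip(y_hat, y_true_codes):
        topk = set(np.argsort(scores)[-k:])
        hit += sum(1 for c in true_codes if c in topk)
        total += len(true_codes)
    return hit / float(total)
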

Thanks a lot!
