sets2sets's People

Contributors

haojihu

sets2sets's Issues

about the softmax function

When predicting the scores for every item in the set, you use softmax to normalize the vector, but that forces the scores to sum to 1.
For multi-label classification, should we use the sigmoid function instead?
In my own multi-label classification experiments, applying softmax to the output scores actually performs worse than leaving them unnormalized.
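
For concreteness, here is a minimal toy sketch of the difference I mean (my own example with made-up scores, not the repository's code): softmax couples the items because the probabilities must sum to 1, while an element-wise sigmoid scores each item independently.

import torch

# Toy decoder scores; the values are made up for illustration.
scores = torch.tensor([2.0, 1.5, -0.5, 0.1])

p_softmax = torch.softmax(scores, dim=0)  # entries compete: they must sum to 1
p_sigmoid = torch.sigmoid(scores)         # each entry scored independently in (0, 1)

print(p_softmax.sum())  # ~1.0
print(p_sigmoid)        # tensor([0.8808, 0.8176, 0.3775, 0.5250])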

Details about the OPTUM dataset

Can you give more details about the OPTUM dataset used in this paper?
I would like to know where I can get this dataset. Thank you very much.

about the training process

It seems that when calculating the WMSE loss, we compute the MSE between the softmax probability vector generated by the decoder and the ground-truth multi-hot vector.
When testing, we take the top-k items to get the multi-hot prediction vector.
Have you ever tried, during training, also taking the top-k of o(vi) generated by the decoder and then computing the distance between the two multi-hot vectors?
That way the training and testing operations would be consistent (a sketch of the two paths is below).
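
To make the question concrete, here is a minimal sketch of the two code paths I mean (the shapes, weights, and variable names are my own placeholders, not the repository's code):

import torch

num_items = 6
weights = torch.ones(num_items)                  # per-item weights used by the weighted MSE
target = torch.tensor([1., 0., 1., 0., 0., 1.])  # ground-truth multi-hot vector

logits = torch.randn(num_items, requires_grad=True)
probs = torch.softmax(logits, dim=0)             # decoder output after softmax

# Training path: weighted MSE between the soft probabilities and the target.
wmse = (weights * (probs - target) ** 2).mean()
wmse.backward()                                  # gradients flow back through the softmax

# Testing path: hard top-k prediction turned into a multi-hot vector.
k = int(target.sum())
topk_idx = probs.topk(k).indices
pred_multi_hot = torch.zeros(num_items).scatter_(0, topk_idx, 1.0)

# Replacing probs with pred_multi_hot inside the loss would make training match
# testing, but top-k is a hard selection with zero gradient almost everywhere,
# so presumably some relaxation would be needed for the decoder to keep learning.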

Waiting for your response, thank you!

Bug occurred on other datasets

I get the following error when running on the T-mall dataset:

(base) wzk@ddst:~/work/Sets2Sets$ python Sets2Sets.py ./data/alibaba_history.csv ./data/alibaba_future.csv 1 2 1
start dictionary generation...
{'MATERIAL_NUMBER': 9531}
# dimensions of final vector: 9531 | 2962
finish dictionary generation*****
num of vectors having entries more than 1: 16462
num of vectors having entries more than 1: 15275
Traceback (most recent call last):
  File "Sets2Sets.py", line 990, in <module>
    main(sys.argv)
  File "Sets2Sets.py", line 955, in main
    codes_freq = get_codes_frequency_no_vector(data_chunk[past_chunk],input_size,data_chunk[future_chunk].keys())
  File "Sets2Sets.py", line 935, in get_codes_frequency_no_vector
    for idx in X[pid]:
KeyError: '371250'

Has anyone met this before? I would really appreciate any help.
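
In case it helps others, the traceback suggests a user id (e.g. '371250') that appears in the future file but has no entry in the history dict X. Below is a hypothetical defensive variant that simply skips such ids; I am guessing at the function's body from the traceback, so treat this as a workaround sketch, not the repository's code.

import numpy as np

def get_codes_frequency_no_vector_safe(X, num_dim, pid_list):
    # Guessed intent: tally how often each item code appears in the
    # history X of the users listed in pid_list.
    result_vector = np.zeros(num_dim)
    for pid in pid_list:
        if pid not in X:
            # User occurs only in the future file; skip instead of raising KeyError.
            continue
        for idx in X[pid]:
            result_vector[int(idx)] += 1
    return result_vector

The cleaner fix is probably to filter the input CSVs beforehand so that every user id appears in both the history and the future file.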

About the two parts of loss function

Hi, your loss function contains the WMSE and PSE parts. Have you ever run experiments with only a single part? Can I just use the WMSE part for the multi-label classification task?
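
To illustrate the ablation I mean, a small sketch (the function names and the alpha weight are my own placeholders, not the repository's API; probs, target, and weights are tensors of equal shape):

def wmse_part(probs, target, weights):
    # Weighted MSE term between the predicted probabilities and the multi-hot target.
    return (weights * (probs - target) ** 2).mean()

def combined_loss(probs, target, weights, pse_part, alpha=1.0):
    # pse_part is whatever implements the second loss term; alpha = 0 keeps only the WMSE part.
    loss = wmse_part(probs, target, weights)
    if alpha != 0.0:
        loss = loss + alpha * pse_part(probs, target)
    return loss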
Waiting for your response, thank you!
