dsin's Introduction

Deep Session Interest Network for Click-Through Rate Prediction

Experiment code on the Advertising Dataset for the paper Deep Session Interest Network for Click-Through Rate Prediction (https://arxiv.org/abs/1905.06482).

Yufei Feng, Fuyu Lv, Weichen Shen, Menghan Wang, Fei Sun, Yu Zhu, and Keping Yang.

In Proceedings of the 28th International Joint Conference on Artificial Intelligence (IJCAI 2019).


Operating environment

Run pip install -r requirements.txt to set up the operating environment in Python 3.6.


Download dataset and preprocess

Download dataset

  1. Download the dataset Ad Display/Click Data on Taobao.com
  2. Extract the files into the raw_data directory

Data preprocessing

  1. Run 0_gen_sampled_data.py to sample the data by user
  2. Run 1_gen_sessions.py to generate the historical session sequence for each user
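
As an illustration of step 2, the sketch below splits one user's behaviors into sessions whenever the gap between consecutive events exceeds a threshold. The 30-minute gap follows the session definition in the DSIN paper; the function name, the (item_id, timestamp) tuple layout, and the sample data are assumptions made for this example and are not taken from 1_gen_sessions.py.

```python
from typing import List, Tuple

# Hypothetical layout: (item_id, unix_timestamp_in_seconds)
Behavior = Tuple[int, int]


def split_into_sessions(behaviors: List[Behavior],
                        max_gap_seconds: int = 30 * 60) -> List[List[Behavior]]:
    """Group behaviors into sessions separated by gaps longer than max_gap_seconds."""
    sessions: List[List[Behavior]] = []
    for behavior in sorted(behaviors, key=lambda b: b[1]):
        # Open a new session if none exists yet or the gap to the previous event is too long.
        if not sessions or behavior[1] - sessions[-1][-1][1] > max_gap_seconds:
            sessions.append([behavior])
        else:
            sessions[-1].append(behavior)
    return sessions


# Made-up events: two clicks close together, then two more after a long pause.
example = [(101, 0), (102, 600), (103, 4000), (104, 4100)]
sessions = split_into_sessions(example)
print(sessions)
```

The returned list is ordered oldest session first; whether the repository stores sessions in that order is not confirmed here.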

Training and Evaluation

Train DIN model

  1. Run 2_gen_din_input.py to generate the input data
  2. run train_din.py

Train DIEN model

  1. Run 2_gen_dien_input.py to generate the input data (sampling negative samples may take a long time)
  2. run train_dien.py

Train DSIN model

  1. Run 2_gen_dsin_input.py to generate the input data
  2. run train_dsin.py

    The loss of DSIN with bias_encoding=True may sometimes be NaN on the Advertising Dataset. This remains a puzzling problem, since it never occurs in the production environment. We will work on it and would also appreciate your help.
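
The preprocessing and training steps above can be chained in a small driver. The following is a minimal sketch: the script names come from this README, while the helper names, the dictionary layout, and the assumption that the scripts are run from the directory that contains them are illustrative only.

```python
import subprocess
import sys
from typing import List

# Script names as listed in the README; the grouping below is illustrative.
PREPROCESS = ["0_gen_sampled_data.py", "1_gen_sessions.py"]
MODEL_STEPS = {
    "din": ["2_gen_din_input.py", "train_din.py"],
    "dien": ["2_gen_dien_input.py", "train_dien.py"],
    "dsin": ["2_gen_dsin_input.py", "train_dsin.py"],
}


def pipeline_for(model: str, include_preprocess: bool = True) -> List[str]:
    """Return the scripts to run, in order, for one model."""
    if model not in MODEL_STEPS:
        raise ValueError("unknown model: {}".format(model))
    steps = list(PREPROCESS) if include_preprocess else []
    return steps + MODEL_STEPS[model]


def run_pipeline(model: str, include_preprocess: bool = True) -> None:
    """Run each script with the current interpreter and stop at the first failure."""
    for script in pipeline_for(model, include_preprocess):
        subprocess.run([sys.executable, script], check=True)


print(pipeline_for("dsin"))
```

Calling run_pipeline("dsin") would execute the four scripts in sequence; passing include_preprocess=False skips the shared preprocessing when it has already been done for another model.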

License

This project is licensed under the terms of the Apache-2 license. See LICENSE for additional details.

dsin's People

Contributors

shenweichen

dsin's Issues

train_din error

WARNING:tensorflow:
The following Variables were used a Lambda layer's call (lambda), but
are not present in its tracked objects:
<tf.Variable 'attention_sequence_pooling_layer/local_activation_unit/kernel:0' shape=(16, 1) dtype=float32>
<tf.Variable 'attention_sequence_pooling_layer/local_activation_unit/bias:0' shape=(1,) dtype=float32>
It is possible that this is intended behavior, but it is more likely
an omission. This is a strong indication that this layer should be
formulated as a subclassed Layer rather than a Lambda layer.
Traceback (most recent call last):
File "/home/lyw/PycharmProjects/DSIN/code/train_din.py", line 49, in
att_hidden_size=(64, 16,))
File "/home/lyw/PycharmProjects/DSIN/code/models/din.py", line 87, in DIN
query_emb, keys_emb])
File "/home/lyw/anaconda3/envs/py36/lib/python3.6/site-packages/tensorflow/python/keras/engine/base_layer.py", line 922, in call
outputs = call_fn(cast_inputs, *args, **kwargs)
File "/home/lyw/anaconda3/envs/py36/lib/python3.6/site-packages/tensorflow/python/autograph/impl/api.py", line 265, in wrapper
raise e.ag_error_metadata.to_exception(e)
AttributeError: in user code:

/home/lyw/anaconda3/envs/py36/lib/python3.6/site-packages/deepctr/layers/sequence.py:198 call  *
    outputs._uses_learning_phase = attention_score._uses_learning_phase

AttributeError: 'Tensor' object has no attribute '_uses_learning_phase'
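
The traceback above passes through TensorFlow's autograph module, which suggests a release much newer than the 1.4.0 that other reports in this issue list mention using with this code. A minimal sketch of a guard that could be placed at the top of a training script follows; the helper names are hypothetical, and treating 1.4.0 as the target version is an assumption based on those reports rather than on requirements.txt, which is not shown here.

```python
from typing import Tuple

# Assumption: 1.4.0 is the version reported by issue authors, not verified against requirements.txt.
EXPECTED_TF = "1.4.0"


def parse_version(version: str) -> Tuple[int, ...]:
    """Turn '1.13.0' into (1, 13, 0), ignoring suffixes such as 'rc1'."""
    parts = []
    for piece in version.split(".")[:3]:
        digits = ""
        for ch in piece:
            if not ch.isdigit():
                break
            digits += ch
        parts.append(int(digits or 0))
    return tuple(parts)


def matches_expected(installed: str, expected: str = EXPECTED_TF) -> bool:
    """Return True when the installed version equals the expected one."""
    return parse_version(installed) == parse_version(expected)


# In a real script the argument would be tensorflow.__version__.
print(matches_expected("1.13.0"))
```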

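Question about sequence construction

Hello, I have a question. The paper only uses the user's click sequence. Could the user's favorites sequence or purchase sequence be added as well? With multiple sequences, the parameters presumably could not be shared?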

Parameter count is zero for the GRU and attention layers?

Hello, I am running your baseline train_dien.py. I added

    names = [weight.name for layer in model.layers for weight in layer.weights]
    weights = model.get_weights()

    for name, weight in zip(names, weights):
        print(name, weight.shape)

to print the variables in the model, and got the following:
[Screenshot, 2019-12-13 16:12:26, not available]
Why are there no parameters for the GRU and attention layers? model.summary() also shows no parameters for these two layers.

about your problem.

Hi,

I have no idea about the NaN problem in training. Can you fix it? How many epochs does the model take to converge?

For the NaN problem in testing, you can replace the last line with
print("test LogLoss {0}".format( round(log_loss(test_label, np.array(pred_ans, dtype=np.float64), labels=(0, 1)), 4)))
which may work.
By the way, I wonder whether the preprocessed pkl files will be released.

Thanks!
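
One possible explanation for why the float64 cast helps, not confirmed in this thread, is precision: a clipping bound such as 1 - 1e-15 rounds to exactly 1.0 in float32, so a prediction of 1.0 is not moved away from 1 and log(1 - p) becomes log(0). The sketch below computes a clipped binary log loss in double precision with the standard library only; the function name and the eps default are assumptions for illustration and do not reproduce scikit-learn's implementation.

```python
import math
from typing import Sequence


def binary_log_loss(labels: Sequence[int], preds: Sequence[float],
                    eps: float = 1e-15) -> float:
    """Mean binary cross-entropy with predictions clipped to [eps, 1 - eps]."""
    if len(labels) != len(preds) or len(labels) == 0:
        raise ValueError("labels and preds must be non-empty and of equal length")
    total = 0.0
    for y, p in zip(labels, preds):
        # Python floats are double precision, so 1.0 - eps stays strictly below 1.0.
        p = min(max(float(p), eps), 1.0 - eps)
        total += -(y * math.log(p) + (1 - y) * math.log(1.0 - p))
    return total / len(labels)


# Saturated predictions stay finite after clipping.
print(binary_log_loss([1, 0], [1.0, 0.0]))
```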

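Results differ from the paper.

After setting up the environment, training threw an exception: InternalError: GPU sync failed. I suspected that training on too much data at once exhausted memory, so I changed BATCH_SIZE for the DIN model to 512, and the result was 0.629. I then changed BATCH_SIZE for the DSIN model to 256, and the result was 5.63. I cannot work out why. I also tried other BATCH_SIZE values, with much the same results. Any advice would be greatly appreciated.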

0_gen_sampled_data.py error

Hi, I got an error while running 0_gen_sampled_data.py. The details are as follows:
[Image: error screenshot, not available]
My computer has 8 GB of memory. Do you know how to fix this? Thanks!
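
The screenshot is not available, so the cause is unknown. If the failure is a memory error, one option is to filter the raw file row by row instead of loading it whole. The sketch below streams a CSV and keeps only rows for a chosen set of users; the helper name and the column names in the demo are assumptions for illustration, and this is not how 0_gen_sampled_data.py is written.

```python
import csv
import io
from typing import Iterable, Iterator, Set


def filter_rows_by_user(lines: Iterable[str], keep_users: Set[str],
                        user_column: str = "user") -> Iterator[dict]:
    """Yield only rows whose user id is in keep_users, reading one row at a time."""
    reader = csv.DictReader(lines)
    for row in reader:
        if row[user_column] in keep_users:
            yield row


# Made-up in-memory data standing in for a large file opened with open(path, newline="").
demo = io.StringIO("user,time_stamp,adgroup_id\n1,10,a\n2,11,b\n1,12,c\n")
kept = list(filter_rows_by_user(demo, {"1"}))
print([row["adgroup_id"] for row in kept])
```

Because the reader is consumed lazily, peak memory depends on the number of kept rows rather than on the size of the input file.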

The code file "train_dsin.py" fails to run.

Hi,
I have set up the environment and generated the input data for the DSIN model following the instructions, but training still fails when I execute "train_dsin.py". The error messages are in the attachment.
Thank you for your time!
[Attachment: issue screenshot, not available]

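About negative sampling in DIEN

Hello, when you compared against DIEN in the original DSIN paper, did DIEN use the auxiliary loss? As far as I know, in the DIEN paper the auxiliary loss has a very large effect on the results and can bring an improvement of several percentage points. So, when comparing with the DIEN model, did you use the auxiliary loss, or did you compare DSIN directly against DIEN without the auxiliary loss?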

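Confusion about the self-attention in the paper

Hello author, I have the two questions below and hope you can answer them.

First: in your paper, Qk should denote the user's k-th session matrix, but later in the paper
[Image: formula from the paper, not available]
this Qk seems to mean something different from before. Could you explain what exactly it means?

Second: I have read the Attention Is All You Need paper, and it seems to me that the parameter matrices in your paper should differ from session to session. However, in your paper all the parameter matrices W^Q appear to be the same.

[Image: formula from the paper, not available]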

About session division

Hello, when dividing the historical behaviors into sessions, why do you only take data with timestamps after May 3?

run train_din error

Hi, I am glad to be studying your work, and I have some questions about your code.
My TF version is 1.13.0 with CUDA 10.0, and I got the following error:
AttributeError: 'Tensor' object has no attribute '_uses_learning_phase'

So I installed TF 1.4.0 with pip according to your README, and got:
ImportError: libcublas.so.8.0: cannot open shared object file: No such file or directory
It seems that my CUDA version does not work with this TF version. Do you know how to fix this?

Thank you for your time !

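Confusion about the Transformer input format in the code

Hello! In the code, the Transformer input is TR([tr_input[i], tr_input[i]]), but the function is defined as def call(self, inputs, mask=None, training=None, **kwargs), where the mask parameter is required to have the same shape as tr_input[i] or shape (batch_size, 1). I am not sure whether I have missed something. Thanks!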

Bug report: feature leakage in DSIN?

The bug is at line 52 of 2_gen_dsin_input.py: last_sess_idx = i. When the user has no session with more than 2 behaviors, last_sess_idx = len(user_hist_session[user]) - 1 rather than 0. As a result, when line 56 locates the user's 4 preceding sessions, it takes the 4 most recent sessions instead of the 4 sessions before the current one. Some samples therefore use features from after the label time.

11,1494226737,302383,430548_1007,1,0
11,1494226737,598359,430548_1007,1,0
11,1494226737,684497,430548_1007,1,0
11,1494419569,427488,430548_1007,1,0
11,1494419569,611964,430548_1007,1,0
11,1494419569,739213,430548_1007,1,0

For example, the 3 samples in raw_sample with user_id=11 and time=1494226737 are such a case.
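
To make the reported leakage concrete, the sketch below selects history sessions for a sample using only events that happened strictly before the sample's timestamp, and returns an empty history when nothing qualifies instead of falling back to the newest sessions. The function name, the (item_id, timestamp) layout, and the default of 5 sessions are assumptions for illustration; this is not the code in 2_gen_dsin_input.py.

```python
from typing import List, Tuple

# Hypothetical layout: each session is a list of (item_id, timestamp) pairs.
Session = List[Tuple[int, int]]


def sessions_before(sessions: List[Session], sample_time: int,
                    max_sessions: int = 5) -> List[Session]:
    """Return up to max_sessions sessions, newest first, truncated to events before sample_time."""
    selected: List[Session] = []
    for session in reversed(sessions):  # sessions are assumed ordered oldest to newest
        visible = [event for event in session if event[1] < sample_time]
        if visible:
            selected.append(visible)
        if len(selected) == max_sessions:
            break
    return selected


# Made-up history: the newest session lies entirely after the sample time of 505.
history = [[(1, 100), (2, 110)], [(3, 500), (4, 510)], [(5, 900)]]
print(sessions_before(history, sample_time=505))
```

With this rule a sample can never see an event at or after its own timestamp, which is the property the bug report says is violated for some samples.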

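How to predict on online data

Hi, DSIN is great, and test data is included too. I have a question: after the model is trained, how do I run predictions on online data? The model predicts whether there is a click; how should the prediction data be assembled?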

Problem with call() in the Transformer class

The Transformer class in the deepctr library contains:

    def call(self, inputs, mask=None, training=None, **kwargs):
        if self.supports_masking:
            queries, keys = inputs
            query_masks, key_masks = mask
            print('query_masks:',query_masks.get_shape())
            query_masks = tf.cast(query_masks, tf.float32)
            key_masks = tf.cast(key_masks, tf.float32)

When I reproduce DSIN with the original configuration unchanged, there is no problem. However, when I split the original input into two sequences of length 5 and feed them in separately, the step query_masks, key_masks = mask fails with 'NoneType' is not iterable.

python version [3.6]
tensorflow version [1.4.0]
deepctr version [0.4.1]

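About the results

I would like to ask whether the final AUC on the Taobao ad dataset can reach the result reported in the paper. I followed your procedure using 5% of the ad dataset (because of hardware limits), but the final AUC was only 0.5+. I strongly suspect that I got something wrong somewhere.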

SingleFeat does not exist

In the DSIN data processing code, from deepctr.utils import SingleFeat does not exist. I am using deepctr 0.8.

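About the input settings for the DIEN and DSIN experiments

From the two files 2_gen_dien_input.py and 2_gen_dsin_input.py, I see that the data generated for DIEN takes the most recent data in the first session, while the data generated for DSIN places the most recent data of the most recent session in sess_0.
So I have two questions:
1. DSIN and DIEN take different input data. Could this cause the difference in experimental results?
2. DSIN puts the most recent data in sess_0, so the inter-session interest extraction layer actually learns an evolution from the present to the past (of course, your paper uses a Bi-LSTM, which learns in both directions). If the sess data were stored in the opposite order, would the results change?
These are just some thoughts that I have not yet verified.
Perhaps you could explain why you chose this strategy for generating the input data.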

train_dsin error

Hi, I got an error while running train_dsin.py. The details are as follows:

Caused by op 'sparse_emb_14-brand/Gather_6', defined at:
File "train_dsin.py", line 52, in
att_embedding_size=1, bias_encoding=False)
File "/home/dedong/pycharmProjects/Emb4RS/models/DSIN/code/_models/dsin.py", line 85, in DSIN
sess_feature_list, sess_max_count, bias_encoding=bias_encoding)
File "/home/dedong/pycharmProjects/Emb4RS/models/DSIN/code/_models/dsin.py", line 154, in sess_interest_division
sparse_fg_list, sess_feture_list, sess_feture_list)
File "/home/dedong/anaconda3/envs/tf1.4/lib/python3.6/site-packages/deepctr/input_embedding.py", line 145, in get_embedding_vec_list
embedding_vec_list.append(embedding_dictfeat_name)
File "/home/dedong/anaconda3/envs/tf1.4/lib/python3.6/site-packages/tensorflow/python/keras/_impl/keras/engine/topology.py", line 252, in call
output = super(Layer, self).call(inputs, **kwargs)
File "/home/dedong/anaconda3/envs/tf1.4/lib/python3.6/site-packages/tensorflow/python/layers/base.py", line 575, in call
outputs = self.call(inputs, *args, **kwargs)
File "/home/dedong/anaconda3/envs/tf1.4/lib/python3.6/site-packages/tensorflow/python/keras/_impl/keras/layers/embeddings.py", line 158, in call
out = K.gather(self.embeddings, inputs)
File "/home/dedong/anaconda3/envs/tf1.4/lib/python3.6/site-packages/tensorflow/python/keras/_impl/keras/backend.py", line 1351, in gather
return array_ops.gather(reference, indices)
File "/home/dedong/anaconda3/envs/tf1.4/lib/python3.6/site-packages/tensorflow/python/ops/array_ops.py", line 2486, in gather
params, indices, validate_indices=validate_indices, name=name)
File "/home/dedong/anaconda3/envs/tf1.4/lib/python3.6/site-packages/tensorflow/python/ops/gen_array_ops.py", line 1834, in gather
validate_indices=validate_indices, name=name)
File "/home/dedong/anaconda3/envs/tf1.4/lib/python3.6/site-packages/tensorflow/python/framework/op_def_library.py", line 787, in _apply_op_helper
op_def=op_def)
File "/home/dedong/anaconda3/envs/tf1.4/lib/python3.6/site-packages/tensorflow/python/framework/ops.py", line 2956, in create_op
op_def=op_def)
File "/home/dedong/anaconda3/envs/tf1.4/lib/python3.6/site-packages/tensorflow/python/framework/ops.py", line 1470, in init
self._traceback = self._graph._extract_stack() # pylint: disable=protected-access

InvalidArgumentError (see above for traceback): indices[0,0] = 136739 is not in [0, 79963)
[[Node: sparse_emb_14-brand/Gather_6 = Gather[Tindices=DT_INT32, Tparams=DT_FLOAT, validate_indices=true, _device="/job:localhost/replica:0/task:0/device:CPU:0"](sparse_emb_14-brand/embeddings/read, sparse_emb_14-brand/Cast_6)]]

Do you know how to fix this? Thanks!
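
The final error line says that an id of 136739 was looked up in an embedding table with 79963 rows. A check like the sketch below can locate such mismatches before the model is built. The function name, the dictionary inputs, and the second feature name are hypothetical, and the sketch does not identify why the encoded ids and the vocabulary size disagree in this particular run.

```python
from typing import Dict, Iterable, List


def find_out_of_range(feature_ids: Dict[str, Iterable[int]],
                      vocab_sizes: Dict[str, int]) -> List[str]:
    """Return one message per feature whose ids fall outside [0, vocab_size)."""
    problems = []
    for name, ids in feature_ids.items():
        ids = list(ids)
        if not ids:
            continue
        low, high = min(ids), max(ids)
        size = vocab_sizes[name]
        if low < 0 or high >= size:
            problems.append(
                "{}: ids span [{}, {}] but the embedding has {} rows".format(
                    name, low, high, size))
    return problems


# Values for 'brand' mirror the numbers in the error message; 'cate_id' is made up.
problems = find_out_of_range(
    {"brand": [3, 136739], "cate_id": [0, 5]},
    {"brand": 79963, "cate_id": 10},
)
print(problems)
```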
