deepgraphlearning / recommendersystems
License: MIT License
Why is batch_size set to 200 for the training set but to 1 for the test set?
The code is at line 81 of test.py: args.batch_size = 1.
Hi, each session ID uses all of the padded data; could this cause information to leak through?
Hi, I'm trying to reproduce the results of the paper Session-Based Social Recommendation on the Delicious dataset. From the paper and the dataset, I understand that each session is a sequence of tags a user has assigned to a bookmark, and that all tagging actions for that bookmark share the same timestamp in the dataset.
So my question is: when you assign the time_id for each session (as in preprocess_DoubanMovie.py), will two sessions share the same id only if their timestamps have exactly the same date and time?
E.g., a session with timestamp '01/06/2020 2:30:00 pm' and another with timestamp '01/06/2020 2:30:01 pm' will have different time_ids.
Since only the session ID is used for sorting when splitting sessions, is the order of items within a session scrambled?
data = data.sort_values(by=['TimeId']).groupby('SessionId')['ItemId'].apply(list).to_dict()
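As a toy illustration (my own example data, not the repo's preprocessing), the line above sorts the whole frame by TimeId and then groups by SessionId, so items within each session come out in TimeId order:

```python
import pandas as pd

# Minimal sketch with made-up data: after sorting by TimeId, groupby keeps
# that row order within each session, so each session's item list follows
# the TimeId ordering rather than the original file order.
data = pd.DataFrame({
    'SessionId': [1, 1, 1, 2],
    'ItemId':    ['a', 'b', 'd', 'c'],
    'TimeId':    [3, 1, 2, 5],
})
seqs = (data.sort_values(by=['TimeId'])
            .groupby('SessionId')['ItemId']
            .apply(list).to_dict())
print(seqs)  # {1: ['b', 'd', 'a'], 2: ['c']}
```

Note that if two items in the same session share a TimeId, nothing in this line fixes their relative order, which may be the scrambling the question is about.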
I am working on my own data: the purchase history of financial products.
So I have data on who (with features) purchased which products (with features) and when.
However, I have found that the model's metric is AUC, and, as expected, I get the following error:
ValueError: Only one class present in y_true. ROC AUC score is not defined in that case.
This is because all my y values are 1s: I only have records of who purchased (clicked) which product, and there are no 0 values (who didn't purchase which).
As other CTR prediction models also use AUC and log loss, I assume that there must be a way to use AUC for such a dataset.
Do you have any idea how to solve this issue?
That would be really helpful!
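One common workaround for implicit-feedback data like this (a hedged sketch of the general technique, not necessarily what this repo does) is to sample unobserved (user, item) pairs as negatives, so that y_true contains both classes and ROC AUC becomes defined:

```python
import random

# Sketch: add n_neg sampled negatives per observed positive. The function
# name and signature are my own illustration, not part of this repository.
def add_sampled_negatives(positives, all_items, n_neg=1, seed=0):
    """positives: list of (user, item) pairs, each an implicit label 1.
    Returns (pairs, labels) with n_neg sampled negatives per positive."""
    rng = random.Random(seed)
    seen = set(positives)
    pairs, labels = [], []
    for u, i in positives:
        pairs.append((u, i)); labels.append(1)
        for _ in range(n_neg):
            j = rng.choice(all_items)
            while (u, j) in seen:      # resample if it is a known positive
                j = rng.choice(all_items)
            pairs.append((u, j)); labels.append(0)
    return pairs, labels

pos = [(1, 'a'), (1, 'b'), (2, 'a')]
pairs, labels = add_sampled_negatives(pos, all_items=['a', 'b', 'c', 'd'])
```

The resulting labels contain both 0s and 1s, so metrics such as AUC and log loss can then be computed on the model's predicted scores.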
When cnt_line = 23000000, line 80 raises an IndexError: list index out of range.
Thank you for the reply.
In SocialRec, the default values of samples1 and samples2 are 10 and 5, respectively. If I understand correctly, samples1 is the number of second-layer neighbors and samples2 is the number of first-layer neighbors, so the second layer would sample more neighbors than the first, which contradicts common practice. Moreover, your paper says "The neighborhood sample sizes are empirically set to 10 and 15 in the first and second convolutional layers". Is a '1' missing, i.e., should the default be 15 rather than 5?
How was the Gowalla dataset processed and split?
resolved long time ago!!!!
In multi-head attention, there is a relu after queries, keys, and values. Is this a correct implementation? The paper did not mention the relu in Eq. 5. Besides, it seems that the relu will make the attention matrix always positive.
```python
# Linear projections
Q = tf.layers.dense(queries, num_units, activation=tf.nn.relu)
K = tf.layers.dense(keys, num_units, activation=tf.nn.relu)
V = tf.layers.dense(values, num_units, activation=tf.nn.relu)
```
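The observation about positivity can be checked with a small NumPy sketch (toy data of my own, independent of the repo's TF code): with ReLU applied after the Q/K projections, every entry of QKᵀ is a sum of products of nonnegative numbers, so the pre-softmax attention logits can never be negative.

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.standard_normal((4, 8))              # toy batch of 4 token embeddings
Wq, Wk = rng.standard_normal((8, 8)), rng.standard_normal((8, 8))

# ReLU-activated projections: all logits are guaranteed nonnegative.
Q_relu, K_relu = np.maximum(x @ Wq, 0), np.maximum(x @ Wk, 0)
logits_relu = Q_relu @ K_relu.T
print((logits_relu >= 0).all())              # True

# Plain linear projections (as in the paper's Eq. 5): negatives are possible.
Q_lin, K_lin = x @ Wq, x @ Wk
logits_lin = Q_lin @ K_lin.T
print((logits_lin < 0).any())                # typically True
```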
May I know if you have published your code for the paper "Ekar: An Explainable Method for Knowledge Aware Recommendation"?
Thanks!
After running the model, I am wondering whether the attention weights can be extracted and visualized as in the attached figure from the original paper.
There are many examples of extracting and visualizing attention for seq2seq models, but I couldn't really find one for feature explanation.
Are there any ideas or code that can be used for attention visualization of meaningful features?
While looking through the code, I noticed that in neigh_samplers.py, a user can sample itself as a second-order neighbor, because when we sample 10 neighbors of a first-order neighbor, the user itself is included.
adj = self.adj_info[node, :]
neighbors = []
for neighbor in adj:
    if first_or_second == 'second':
        if self.visible_time[neighbor] <= timeid:
            neighbors.append(neighbor)
I was wondering whether this is intended or a logic flaw. Thank you!
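If it is unintended, one simple way to exclude the source user is to skip it when collecting candidates. This is my own hedged sketch of the idea (plain function, toy data), not the repo's actual fix:

```python
# Sketch: filter second-order neighbor candidates by visible time while
# excluding the source node itself, so a user cannot appear among its own
# second-order neighbors. All names here are illustrative.
def second_order_candidates(adj_row, visible_time, timeid, source_node):
    neighbors = []
    for neighbor in adj_row:
        if neighbor == source_node:    # exclude the user itself
            continue
        if visible_time[neighbor] <= timeid:
            neighbors.append(neighbor)
    return neighbors

# Toy usage: the first-order neighbor's adjacency row contains the source
# node 0, which gets filtered out of the second-order candidates.
print(second_order_candidates([0, 1, 2], {0: 0, 1: 0, 2: 0}, 0, 0))  # [1, 2]
```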
How should the two layers initial_state_layer1 and initial_state_layer2 in local_features be understood?
Is there a PyTorch version available? If so, could you please share it? My email is [email protected]. Many thanks!
Hi, the KDD12 dataset can no longer be downloaded. Could you share it? Thank you very much! Email: [email protected]
What data format do these framework structures require?
@Songweiping Hello, I am very interested in your work (AutoInt) at CIKM'19, but I had some doubts when reproducing the paper's experiments.
As we know, Criteo has its own benchmark, consisting of two csv files (train.csv & test.csv). But AutoInt's preprocessing splits a single file (train_samples.txt) to obtain the train, validation, and test data.
I'm wondering how to transform the original dataset (train.csv & test.csv) into a dataset that the data_process code can handle.
Could you post your code for converting the original csv files (train.csv & test.csv) into the single txt file used in your paper (its format is similar to the sample file train_example.txt)?
Could you please share with us all the parameters used? Thanks!
Dear author,
I am currently working on dynamic network representation learning. Could you provide your experimental data? (The raw, unprocessed Douban data used in the paper would be fine. Thank you very much; my email is [email protected].)
I am very impressed by the usefulness of AutoInt for my personal project.
I'm wondering if there is any handy way of converting AutoInt to top-n recommendation.
Would it be possible?
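One generic way to turn a CTR scorer into top-n recommendation is to score every candidate item for a user and keep the n highest-scoring ones. This is a hedged sketch of that idea only; the `score` callable below is a stand-in for a trained model's prediction function, not AutoInt's actual API:

```python
import numpy as np

# Sketch: rank candidate items by a user-item scoring function and return
# the top n. `score`, `top_n`, and the toy data are all illustrative.
def top_n(score, user, candidates, n=5):
    scores = np.array([score(user, item) for item in candidates])
    order = np.argsort(-scores)[:n]    # indices of the n largest scores
    return [candidates[i] for i in order]

# Toy usage with a dummy scorer that simply prefers larger item ids.
items = [10, 11, 12, 13]
print(top_n(lambda u, i: float(i), user=0, candidates=items, n=2))  # [13, 12]
```

In practice the candidate set would be all items the user has not yet interacted with, and scoring them all per user can be expensive, so candidate pre-filtering is often added.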