rn5l / session-rec
Python-based framework for building and evaluating session-based and session-aware recommender systems.
Hi, may I ask a question about the recommendation results? I tested multiple algorithms on different datasets, but apart from the first recommendation, most of the scores are very low or even equal to 0. I see the same pattern in the data/rsc15/recommendations files. Is this common for session-based algorithms? Can you help me understand? Thanks.
I am using a config file based on example_session_aware_opt.yml. The error below persists with many different models, but I am currently working with hgru4rec.py.
My dataset is labeled with the default headers [SessionId, ItemId, Time, UserId], and I have ensured there are no new items or users in the valid/test set.
Any help would be appreciated.
- class: hgru4rec.hgru4rec.HGRU4Rec
  params: {
    n_epochs: 1,
    session_layers: 10,
    user_layers: 10,
    loss: 'top1'
  }
START evaluation of 10000 actions in 5000 sessions
eval process: 0 of 10000 actions: 0.0 % in 0.08836889266967773 s
Traceback (most recent call last):
  File "run_config.py", line 62, in main
    run_file(c)
  File "run_config.py", line 169, in run_file
    run_opt(conf)
  File "run_config.py", line 417, in run_opt
    run_opt_single(conf, i, globals)
  File "run_config.py", line 277, in run_opt_single
    eval_algorithm(train, test, k, a, evaluation, metrics, results, conf, iteration=iteration, out=False)
  File "run_config.py", line 515, in eval_algorithm
    results[key] = eval.evaluate_sessions(algorithm, metrics, test, train)
  File "<arbitrary_path>/session-rec/evaluation/evaluation_user_based.py", line 170, in evaluate_sessions
    m.add(preds, rest[0], for_item=current_item, session=current_session, position=position)
IndexError: index 0 is out of bounds for axis 0 with size 0
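One possible cause, offered as an assumption rather than a confirmed diagnosis: the traceback shows rest[0] being read from an empty array, which could happen if the evaluation reaches a position in a session where no later items remain (for example, a test session with a single event). A minimal sketch for spotting such sessions, relying only on the default column names from the report; the helper name short_sessions and the sample rows are hypothetical and not part of session-rec:

```python
from collections import Counter


def short_sessions(rows, min_len=2):
    """Return the SessionIds that have fewer than min_len events.

    rows is an iterable of dicts keyed by the default headers
    (SessionId, ItemId, Time, UserId). Hypothetical helper.
    """
    counts = Counter(row["SessionId"] for row in rows)
    return sorted(sid for sid, n in counts.items() if n < min_len)


# Made-up sample data: session 2 contains only one event.
test_rows = [
    {"SessionId": 1, "ItemId": 10, "Time": 100, "UserId": 7},
    {"SessionId": 1, "ItemId": 11, "Time": 105, "UserId": 7},
    {"SessionId": 2, "ItemId": 12, "Time": 200, "UserId": 8},
]

print(short_sessions(test_rows))
```

If this reports any sessions for the real test split, filtering them out before evaluation would be a reasonable first experiment.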
Dear Developers,
This issue is related to vsknn.py.
The vsknn class has an attribute self.min_time, which I found in only three places. Is it OK if I remove the following lines of code?
if time < self.min_time :
    self.min_time = time
I could not see that self.min_time affects the result. If it does, could you please explain how?
Thanks in advance!
I have processed the Diginetica data by running python run_preprocessing.py conf/preprocess/session_based/window/diginetica.yml and obtained several files such as train-item-views_train_full.0. I renamed the folder from data/diginetica/slices to data/diginetica/prepared.
After preprocessing the data, I took conf/example_all_neural.yml and configured it for the Diginetica data by changing the yml file to look as follows:
type: single # single|window|opt
key: baselines_and_models #added to the csv names
evaluation: evaluation_user_based # evaluation_user_based
data:
  name: diginetica #added in the end of the csv names
  folder: data/diginetica/prepared/
  prefix: train-item-views
  type: csv # hdf|csv(default)
results:
  folder: results/session-aware/diginetica/
After that I ran THEANO_FLAGS="device=cuda0,floatX=float32" CUDA_DEVICE_ORDER=PCI_BUS_ID python run_config.py conf/example_all_neural.yml
and I am getting an error that says
Using TensorFlow backend.
init
Traceback (most recent call last):
  File "run_config.py", line 62, in main
    run_file(c)
  File "run_config.py", line 164, in run_file
    run_single(conf)
  File "run_config.py", line 213, in run_single
    m.init(train)
UnboundLocalError: local variable 'train' referenced before assignment
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
  File "run_config.py", line 919, in <module>
    main(sys.argv[1], out=sys.argv[2] if len(sys.argv) > 2 else None)
  File "run_config.py", line 73, in main
    print('error for config ', list[0])
UnboundLocalError: local variable 'list' referenced before assignment
I have tested my srec37 environment on the yoochoose data and it trains without a problem. I am wondering what is wrong with the diginetica configuration, or whether I have skipped or overlooked something.
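The first UnboundLocalError says that train was never assigned before m.init(train) ran; the second one is raised inside the exception handler and only hides the first. A generic illustration of how this kind of error arises in Python when a variable is assigned in some branches only. This is not the actual run_config.py code, and the function names and the option value are hypothetical:

```python
def load_split(evaluation_type):
    # 'train' is only assigned when the known branch is taken.
    if evaluation_type == "known_type":
        train = ["some", "training", "rows"]
    # Any other value skips the assignment, so the return below fails.
    return train


def safe_load_split(evaluation_type):
    # Defensive variant: fail early with a readable message instead.
    if evaluation_type == "known_type":
        return ["some", "training", "rows"]
    raise ValueError(f"unsupported evaluation type: {evaluation_type!r}")


try:
    load_split("something_else")
except UnboundLocalError as err:
    print("reproduced:", err)
```

If the same pattern applies here, it would be worth checking which config value in the modified yml file selects the branch that loads the data.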
Dear Developers,
This issue is related to vsknn.py.
What is the role of the timestamp parameter in the predict_next() method? Could you give a brief explanation? When timestamp=0, does it mean that timestamp equals the current time?
Thanks in advance!
Dear Developers,
This issue is related to vsknn.py.
I noticed that self.last_ts=-1. Then there is this code:
if self.dwelling_time:
    if self.last_ts > 0:
        self.dwelling_times.append( timestamp - self.last_ts )
    self.last_ts = timestamp
I think self.last_ts>0 is never true. So is it OK if I remove these lines of code?
if self.last_ts > 0:
    self.dwelling_times.append( timestamp - self.last_ts )
Will it affect the result?
The code is so long, and your answer will help me. Thank you.
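A small simulation of the quoted snippet can help check the claim. It assumes that the assignment self.last_ts = timestamp sits outside the inner if (the indentation was lost in the quote) and that the method is called once per event with increasing timestamps. DwellTracker is a toy stand-in, not the real vsknn class:

```python
class DwellTracker:
    """Toy stand-in for the quoted vsknn.py logic; not the real class."""

    def __init__(self, dwelling_time=True):
        self.dwelling_time = dwelling_time
        self.last_ts = -1
        self.dwelling_times = []

    def observe(self, timestamp):
        if self.dwelling_time:
            if self.last_ts > 0:
                self.dwelling_times.append(timestamp - self.last_ts)
            # Assumed to sit outside the inner 'if'.
            self.last_ts = timestamp


tracker = DwellTracker()
for ts in (100, 130, 190):
    tracker.observe(ts)

print(tracker.dwelling_times)  # [30, 60]
```

Under these assumptions the condition is false only for the first event; from the second event on it is true, so removing the two lines would change dwelling_times whenever dwelling_time is enabled.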
Hi, thanks for making this repository and work public.
What is the recommended way of "exporting" a model trained through the experiments? For my research, I was asked to take a trained model, feed it a new input (such as another session), and see what it recommends.
I've noticed the pickle_model label in the conf files, but in most of them it is marked as not working for TensorFlow models.
session-rec/conf/save/8tracks/window/window_8tracks_sgnn.yml
Lines 12 to 14 in 3b13883
Lines 570 to 588 in 3e6d76e
Is there a recommended way of dealing with this? Maybe something simple that would take less effort than fixing the pickle_model problem?
Hey,
Thanks for this amazing repository! I pulled your Docker file, but failed to run it following your steps.
When I run ./dpython run_preprocesing.py conf/preprocess/window/rsc15.yml, it tells me:
python: can't open file 'run_preprocesing.py': [Errno 2] No such file or directory
When I run ./dpython run_config.py conf/in conf/out or ./dpython run_config.py conf/example_next.yml, it tells me:
Traceback (most recent call last):
  File "run_config.py", line 12, in
    import yaml
ImportError: No module named 'yaml'
I am in the session-rec folder. Do you have any idea what might cause these errors?
What kind of popularity recommendation is it?
Hi!
I still have some concerns about the parameter settings you chose. Would you please help me verify my process? First, I downloaded the Diginetica dataset from the official link. Next, I used the config file "conf/preprocess/session-based/diginetica.yml" and preprocess.py to generate the processed datasets. Then, I used the config file "conf/save/diginetica/window/window_digi_sgnn.yml" and run_config.py to train the SR-GNN model. When I change the parameter settings (window_digi_sgnn.yml, line 35) from params: { lr: 0.0001, l2: 0.000007, lr_dc: 0.63, lr_dc_step: 3, nonhybrid: True, epoch_n: 10 } to params: { lr: 0.001, l2: 0.00001, lr_dc: 0.1, lr_dc_step: 3, nonhybrid: True, epoch_n: 10 }, I observe significant improvements (in the loss as well as in the HR and MRR results).
I checked this repository because SR-GNN has been shown to be more robust than the previous methods in all recent deep-learning literature on the Diginetica dataset. But in your work Empirical analysis of session-based recommendation algorithms (UMUAI-2020), the reported HR@20 of SR-GNN is only 0.3638, much lower than the officially reported 0.5073. I can understand that using a sub-dataset degrades the performance of deep-learning-based methods, but the magnitude of the degradation is too large.
Besides, I have noticed that the best neural method in your results is GRU4REC (you use the official code of the improved version GRU4REC-TOPK (2018) instead of the original GRU4REC (2016)). However, according to the results in RepeatNet: A Repeat Aware Neural Recommendation Machine for Session-Based Recommendation (AAAI-2019), the HR@20 and MRR@20 are 0.4523 and 0.1416, while in your work Empirical analysis of session-based recommendation algorithms, the HR@20 and MRR@20 are 0.4639 and 0.1644, even higher than the results on a larger dataset. This seems odd.
I am not sure whether I have made some mistakes in my experiments, and I would appreciate it if I could get constructive responses from you.
Best regards.
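To make the difference between the two settings concrete, here is a worked sketch of the per-epoch learning rate under a staircase decay, assuming the rate is multiplied by lr_dc once every lr_dc_step epochs. That schedule is an assumption about how lr_dc and lr_dc_step are applied, not something confirmed from sgnn.py, and the function name staircase_lr is hypothetical:

```python
def staircase_lr(lr, lr_dc, lr_dc_step, epochs):
    """Learning rate per epoch: lr * lr_dc ** (epoch // lr_dc_step).

    Assumed schedule, not taken from the repository code.
    """
    return [lr * lr_dc ** (epoch // lr_dc_step) for epoch in range(epochs)]


# The two parameter sets quoted in the issue, over epoch_n = 10 epochs.
first_setting = staircase_lr(lr=0.0001, lr_dc=0.63, lr_dc_step=3, epochs=10)
second_setting = staircase_lr(lr=0.001, lr_dc=0.1, lr_dc_step=3, epochs=10)

for epoch, (a, b) in enumerate(zip(first_setting, second_setting)):
    print(f"epoch {epoch}: {a:.2e} vs {b:.2e}")
```

Under this assumption the second setting starts ten times higher but decays much faster: in the last epoch it is at 0.001 x 0.1^3 = 1e-6, while the first setting is still at 0.0001 x 0.63^3, roughly 2.5e-5.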
Hi, I am running your code on the Diginetica dataset. Concretely, I use sgnn.py (SR-GNN) as an example and find that I get better results if I change the parameters from lr=0.0001 l2=0.000007 lr_dc=0.63 to lr=0.001 l2=0.00001 lr_dc=0.1, the same values as in the original SR-GNN paper.
Recall@20 and MRR@20 improve from 30.70 and 15.41 to 45.52 and 15.91. If so, I wonder whether the conclusion reported in your paper, that heuristic models get better results than such neural methods, is justified.
Thank you and best regards.
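For reference, the two metrics being compared are commonly defined as follows: HR@k (the same as Recall@k when each test case has a single target item) is the fraction of test cases whose target appears in the top k, and MRR@k averages 1/rank of the target, counting 0 when it falls outside the top k. A minimal sketch under those standard definitions; it is not the evaluation code of session-rec, and the function name and sample data are made up:

```python
def hit_rate_and_mrr(ranked_lists, targets, k=20):
    """Compute HR@k and MRR@k for one target item per ranked list."""
    hits = 0
    reciprocal_rank_sum = 0.0
    for ranked, target in zip(ranked_lists, targets):
        top_k = ranked[:k]
        if target in top_k:
            hits += 1
            # Ranks are 1-based, list positions are 0-based.
            reciprocal_rank_sum += 1.0 / (top_k.index(target) + 1)
    n = len(targets)
    return hits / n, reciprocal_rank_sum / n


# Made-up example: the target is at rank 1, at rank 3, and missing.
ranked_lists = [["a", "b", "c"], ["b", "c", "a"], ["c", "a", "b"]]
targets = ["a", "a", "z"]

hr, mrr = hit_rate_and_mrr(ranked_lists, targets, k=3)
print(f"HR@3 = {hr:.4f}, MRR@3 = {mrr:.4f}")
```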
Is the VSTAN algorithm implementation available in this repo? I could not find it.
The two path+file and variables are missing in the run_preprocessing.py.
session-rec/run_preprocessing.py
Line 78 in d5aca66