
gru4rec's Issues

Get predictions with predict_next_batch

I'm trying to get predictions from the trained GRU model. My first attempt looks like this:

batch_size=100
iters = np.arange(batch_size).astype(np.int32)
in_idx = np.zeros(batch_size, dtype=np.int32)
predict_for_item_ids = None # no sampling
preds = gru.predict_next_batch(iters, in_idx, predict_for_item_ids, batch_size)
preds.fillna(0, inplace=True)

This is quite similar to what the evaluation code does when it is not sampling the data, here:

out_idx = test_data.ItemId.values[start_valid+i+1]
if sampled_items:
    uniq_out = np.unique(np.array(out_idx, dtype=np.int32))
    preds = pr.predict_next_batch(iters, in_idx, np.hstack([items, uniq_out[~np.in1d(uniq_out,items)]]), batch_size)
else:
    preds = pr.predict_next_batch(iters, in_idx, None, batch_size) #TODO: Handling sampling?
preds.fillna(0, inplace=True)

I'm not sure about this, since the predict_next_batch function has the following signature:

    def predict_next_batch(self, session_ids, input_item_ids, predict_for_item_ids=None, batch=100):

So I need session_ids and input_item_ids from input dataset, right?
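In case it helps, here is a minimal sketch of how I now understand these inputs (my own toy illustration; gru is assumed to be a fitted model and the ids are made up):

import numpy as np

batch_size = 3
# session_ids: one id per parallel slot; when the id in a slot changes between
# consecutive calls, that slot's hidden state is reset.
session_ids = np.array([0, 1, 2], dtype=np.int32)
# input_item_ids: the most recent click of each of those sessions.
input_item_ids = np.array([101, 202, 303], dtype=np.int32)
preds = gru.predict_next_batch(session_ids, input_item_ids, None, batch_size)  # items x sessions
preds.fillna(0, inplace=True)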

No hidden state reset in get_metrics

In the current version of the code, it seems that the hidden states are not reset for sessions that end within a test batch.

On my local environment, I modified the get_metrics function as follows:

def get_metrics(model, args, train_generator_map, recall_k=10, mrr_k=10):
    test_dataset = SessionDataset(args.test_data, itemmap=train_generator_map)
    test_generator = SessionDataLoader(test_dataset, batch_size=args.batch_size)

    n = 0
    rec_sum = 0
    mrr_sum = 0

    with tqdm(total=args.test_samples_qty) as pbar:
        for feat, label, mask in test_generator:
            # Added: reset the hidden state of batch slots whose session just ended.
            real_mask = np.ones((args.batch_size, 1))
            for elt in mask:
                real_mask[elt, :] = 0

            hidden_states = get_states(model)[0]
            hidden_states = np.multiply(real_mask, hidden_states)
            hidden_states = np.array(hidden_states, dtype=np.float32)
            model.layers[1].reset_states(hidden_states)

            target_oh = to_categorical(label, num_classes=args.train_n_items)
            input_oh = to_categorical(feat,  num_classes=args.train_n_items)
            input_oh = np.expand_dims(input_oh, axis=1)
            
            pred = model.predict(input_oh, batch_size=args.batch_size)

            for row_idx in range(feat.shape[0]):
                pred_row = pred[row_idx] 
                label_row = target_oh[row_idx]

                rec_idx = pred_row.argsort()[-recall_k:][::-1]
                mrr_idx = pred_row.argsort()[-mrr_k:][::-1]
                tru_idx = label_row.argsort()[-1:][::-1]

                n += 1

                if tru_idx[0] in rec_idx:
                    rec_sum += 1

                if tru_idx[0] in mrr_idx:
                    mrr_sum += 1/int((np.where(mrr_idx == tru_idx[0])[0]+1))
            
            pbar.set_description("Evaluating model")
            pbar.update(test_generator.done_sessions_counter)

    recall = rec_sum/n
    mrr = mrr_sum/n
    return (recall, recall_k), (mrr, mrr_k)

Can someone confirm that this was indeed a bug, and that my correction is valid?

GFF code

Hello, is the code for your "General factorization framework for context-aware recommendations" paper open source? If so, could you please share it?

NOT RNN MODEL

The code only defines a single timestep of the RNN, so gradients from timesteps >= t+2 cannot be propagated back to timestep t? That looks more like a Markov assumption (only timesteps t and t+1 are related) than an RNN assumption.

Testing Error:: start = offset_sessions[iters] IndexError: index 2 is out of bounds for axis 0 with size 2

Hi @hidasib

I have trained the model on a different dataset using the CPU. I followed the CPU configuration instructions and everything worked fine; training completed successfully. But after loading the test data, evaluation failed with this error:

data = pd.read_csv(fname, sep='\t', usecols=[gru.session_key, gru.item_key, gru.time_key], dtype={gru.session_key:'int32', gru.item_key:np.str})
Starting evaluation (cut-off=1, using standard mode for tiebreaking)
Measuring Recall@1 and MRR@1
Traceback (most recent call last):
  File "run.py", line 129, in <module>
    res = evaluation.evaluate_gpu(gru, test_data, items, batch_size=100, cut_off=c, mode=args.eval_type)
  File "/Users/xxxxxxxx/GRU4REC/GRU4Rec/evaluation.py", line 80, in evaluate_gpu
    start = offset_sessions[iters]
IndexError: index 2 is out of bounds for axis 0 with size 2

What could be causing this problem and how can I fix it?

Thanks

Please add a license

There seems to be no statement about any kind of license right now, which is problematic when using the code. Could you please add a license, for instance the MIT license?

is it possible to use some user or item embeddings with this library?

The first paper says that you also tried adding an additional embedding layer, but one-hot encoding resulted in better performance.

I wonder what kind of embedding you used for that experiment. I'm interested in two types of embeddings:

  1. something similar to what LightFM uses: representing items by their content information instead of only by their ids, to address the item cold-start problem;
  2. something similar to what the TensorRec framework allows: transforming the original high-dimensional item vectors into linear or non-linear representations of much lower dimension, mapping similar items to nearby points in the embedded space.

In particular, I'm wondering whether you had any memory or performance problems with those very large high-dimensional matrices; the video dataset in the paper had 330 thousand videos, which makes for a huge matrix in one-hot encoding.

Thanks

In training data the items are indexed, but in test data they are not; is there anything wrong?

In the training code:
self.predict = None
self.error_during_train = False
itemids = data[self.item_key].unique()
if not retrain:
    self.n_items = len(itemids)
    self.itemidmap = pd.Series(data=np.arange(self.n_items), index=itemids)
data = pd.merge(data, pd.DataFrame({self.item_key: itemids, 'ItemIdx': self.itemidmap[itemids].values}), on=self.item_key, how='inner')
offset_sessions = self.init(data)

and use "ItemIdx":
out_idx = data.ItemIdx.values[start]

but in the test data, "item_key" is used directly:
in_idx[valid_mask] = test_data[item_key].values[start_valid]

Is there anything wrong?
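For what it's worth, a self-contained toy version of the id-to-index mapping built in the training snippet above (my own illustration; whether the prediction code applies the same itemidmap internally is my assumption, not something I verified):

import numpy as np
import pandas as pd

# The same kind of mapping as self.itemidmap in the training code:
train_items = pd.Series(['a', 'b', 'c']).unique()
itemidmap = pd.Series(data=np.arange(len(train_items)), index=train_items)

# Raw test item ids mapped to the dense ItemIdx the model was trained on:
test_items = np.array(['c', 'a', 'b'])
print(itemidmap[test_items].values)  # [2 0 1]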

code not faster on GPU

Hi,

Great article and code!
I find that training is not faster on a GPU (Titan X) than on a CPU (MacBook Pro).
The train_function takes, for example, 0.003 s on the GPU and 0.0015 s on the CPU.
Did you ever encounter this problem?

Thanks,
Massimo

Is the problem predicting a value from a sequence?

I am a little confused about what the problem here is. Are we trying to use all the previous clicks in a sequence to predict the last click in that sequence? If so, how is it possible to calculate Recall@20? If it is a classification problem, how so?
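For context, a minimal sketch of how Recall@20 is typically computed for next-item prediction (my own illustration, not the repo's evaluation code): every click after the first in a session is a separate test case, and a case is a hit if the true next item is among the 20 highest-scoring items.

import numpy as np

def recall_at_k(scores, target, k=20):
    # Hit if the true next item is among the k highest-scoring items.
    topk = np.argsort(scores)[-k:]
    return int(target in topk)

rng = np.random.default_rng(0)
scores = rng.random(1000)       # toy scores over 1000 items
print(recall_at_k(scores, 42))  # 1 on a hit, 0 otherwise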

Incremental training (retrain) support removed

Hi Balázs. I saw that an earlier version of GRU4Rec had the retrain option for fine-tuning a model (i.e., multiple fit() calls) with newer data. I remember that this feature created additional output neurons for item ids not seen in the previous training.
Why was that option removed?
Is it because training accuracy is not good under such a training approach, or just to make the implementation simpler?
I believe the retrain feature is important to keep training runtime from growing longer every day as new data becomes available.

Non-session based custom dataset

Hi,

I want to train this model on my custom transactional dataset. I wonder whether GRU4Rec intrinsically requires session-based data. I have only item_id, user_id, and time dimensions. Can I treat users as sessions, or should I create a session dimension to run this model (see the sketch below)? Thanks in advance.
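A common way to create that session dimension (my suggestion, not something the repo provides) is to split each user's event stream at inactivity gaps, e.g. 30 minutes:

import pandas as pd

df = pd.DataFrame({
    'user_id': [1, 1, 1, 2, 2],
    'item_id': [10, 11, 12, 10, 13],
    'time':    [0, 100, 5000, 0, 50],  # toy timestamps in seconds
})
df = df.sort_values(['user_id', 'time'])
gap = 1800  # assumption: a new session starts after 30 minutes of inactivity
new_session = (df.groupby('user_id')['time'].diff() > gap) | (df['user_id'] != df['user_id'].shift())
df['SessionId'] = new_session.cumsum()
print(df['SessionId'].tolist())  # [1, 1, 2, 3, 3]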

About training time

Hi, I have a Tesla K80 and the given example has been running for more than 10 hours. Is this normal, and how long does training usually take on a K80 GPU?

Strange CPU usage

Hi, I'm training the network on the RecSys 2015 dataset and I'm seeing unexpected CPU utilization (see the screenshot below):

[screenshot of CPU usage, 2016-05-06]

It seems that the CPU cores aren't properly used: only 1 core out of 16 is fully utilized, while the others mostly spend their time in kernel-related cycles (red bars). Can you help me with that? I tested my Theano installation and everything works fine; Theano's test script uses all cores at almost 100%.

Thanks!

where to get the data?

rzai@rzai00:/prj/GRU4Rec/examples/rsc15$ python run_rsc15.py
Using gpu device 0: GeForce GTX 1080 (CNMeM is disabled, cuDNN 5105)
Traceback (most recent call last):
File "run_rsc15.py", line 20, in
data = pd.read_csv(PATH_TO_TRAIN, sep='\t', dtype={'ItemId':np.int64})
File "/usr/local/lib/python2.7/dist-packages/pandas/io/parsers.py", line 470, in parser_f
return _read(filepath_or_buffer, kwds)
File "/usr/local/lib/python2.7/dist-packages/pandas/io/parsers.py", line 246, in _read
parser = TextFileReader(filepath_or_buffer, **kwds)
File "/usr/local/lib/python2.7/dist-packages/pandas/io/parsers.py", line 562, in init
self._make_engine(self.engine)
File "/usr/local/lib/python2.7/dist-packages/pandas/io/parsers.py", line 699, in _make_engine
self._engine = CParserWrapper(self.f, **self.options)
File "/usr/local/lib/python2.7/dist-packages/pandas/io/parsers.py", line 1066, in init
self._reader = _parser.TextReader(src, **kwds)
File "pandas/parser.pyx", line 350, in pandas.parser.TextReader.cinit (pandas/parser.c:3163)
File "pandas/parser.pyx", line 583, in pandas.parser.TextReader._setup_parser_source (pandas/parser.c:5779)
IOError: File /path/to/rsc15_train_full.txt does not exist
rzai@rzai00:/prj/GRU4Rec/examples/rsc15$

cuda error

I run this command:

$python run.py ../GRU4Rec_TensorFlow_beauty/data/train.tsv -t ../GRU4Rec_TensorFlow_beauty/data/valid.tsv -m 1 5 10 20
-ps loss=bpr-max,final_act=elu-0.5,hidden_act=tanh,layers=100,adapt=adagrad,n_epochs=10,batch_size=32,dropout_p_embed=0.0,dropout_p_hidden=0.0

But I get this error:

WARNING (theano.tensor.blas): Using NumPy C-API based implementation for BLAS functions.
Can not use cuDNN on context None: cannot compile with cuDNN. We got this error:
b'/tmp/try_flags_b5qyyrtx.c:4:10: fatal error: cudnn.h: No such file or directory\n #include <cudnn.h>\n ^~~~~~~~~\ncompilation terminated.\n'

Traceback (most recent call last):
File "run.py", line 122, in <module>
gru.fit(data, sample_store=args.sample_store_size, store_type='gpu', ckpt_path=args.save_model)
File "/hdd/lhz/cll/baselines/GRU4Rec+_beauty/gru4rec.py", line 553, in fit
updates_st[self.ST] = gpu_searchsorted(P, X, dtype_int64=True).reshape((generate_length, self.n_sample))
File "/hdd/lhz/cll/baselines/GRU4Rec+_beauty/gpu_ops.py", line 43, in gpu_searchsorted
return cto.GpuBinarySearchSorted(dtype_int64=dtype_int64)(P, X)
File "/home/anaconda3/envs/theano/lib/python3.6/site-packages/theano/gof/op.py", line 615, in __call__
node = self.make_node(*inputs, **kwargs)
File "/hdd/lhz/cll/baselines/GRU4Rec+_beauty/custom_theano_ops.py", line 292, in make_node
d = as_gpuarray_variable(d, context_name=self.context_name)
File "/home/anaconda3/envs/theano/lib/python3.6/site-packages/theano/gpuarray/basic_ops.py", line 79, in as_gpuarray_variable
return copy_stack_trace(x, GpuFromHost(context_name)(x))
File "/home/anaconda3/envs/theano/lib/python3.6/site-packages/theano/gof/op.py", line 615, in __call__
node = self.make_node(*inputs, **kwargs)
File "/home/anaconda3/envs/theano/lib/python3.6/site-packages/theano/gpuarray/basic_ops.py", line 674, in make_node
dtype=x.dtype)()
File "/home/anaconda3/envs/theano/lib/python3.6/site-packages/theano/gpuarray/type.py", line 186, in __init__
get_context(self.context_name)
File "/home/anaconda3/envs/theano/lib/python3.6/site-packages/theano/gpuarray/type.py", line 104, in get_context
raise ContextNotDefined("context name %s not defined" % (name,))
theano.gpuarray.type.ContextNotDefined: context name None not defined

CUDA Version 10.1.105

Could anyone help?

Problems about the input of the model during training and testing.

Hello, @hidasib
I have read your paper, but I am not sure about some details of the model's input at training and testing time. Could you please check the following description of my understanding?

[figure from the paper showing session-parallel mini-batches over items i_{s,t}]

  • At training time, according to your figure (three sessions processed in parallel), the first input vector of the model should be i_{1,1}, i_{2,1}, i_{3,1}, and the corresponding labels at these three positions should be i_{1,2}, i_{2,2}, i_{3,2}.

  • Then at testing time, for session 1, the inputs of the model should be i_{1,1}, i_{1,2}, i_{1,3}.

Please let me know if I made any mistakes. Thank you very much in advance!
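To make my reading of the figure concrete, here is a toy sketch of the session-parallel mini-batch scheme as I understand it (my own illustration, not the repo's code):

def session_parallel_batches(sessions, batch_size):
    # Toy generator: yields (inputs, targets) for one timestep at a time.
    order = list(sessions)
    slots = [sessions[order.pop(0)] for _ in range(batch_size)]
    pos = [0] * batch_size
    while True:
        # Refill any slot whose session has no next item left; the hidden
        # state of that slot would be reset at this point.
        for i in range(batch_size):
            while pos[i] + 1 >= len(slots[i]):
                if not order:
                    return
                slots[i] = sessions[order.pop(0)]
                pos[i] = 0
        yield ([s[p] for s, p in zip(slots, pos)],
               [s[p + 1] for s, p in zip(slots, pos)])
        pos = [p + 1 for p in pos]

sessions = {1: ['i11', 'i12', 'i13', 'i14'], 2: ['i21', 'i22', 'i23'], 3: ['i31', 'i32']}
for x, y in session_parallel_batches(sessions, batch_size=2):
    print(x, '->', y)
# ['i11', 'i21'] -> ['i12', 'i22']
# ['i12', 'i22'] -> ['i13', 'i23']
# ['i13', 'i31'] -> ['i14', 'i32']   (session 3 takes over session 2's slot)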

theano error

I run this command:
$python run.py /path/to/training_data_file -t /path/to/test_data_file -m 1 5 10 20 -ps loss=bpr-max,final_act=elu-.5,hidden_act=tanh,layers=100,adapt=adagrad,n_epochs=10,batch_size=32,dropout_p_embed=0.0,dropout_p_hidden=0.0,learning_rate=0.2,momentum=0.3,n_sample=2048,sample_alpha=0.0,bpreg=1.0,constrained_embedding=False

But I get this error:

/home/.../anaconda2/envs/gru4rec/lib/python3.6/site-packages/theano/gpuarray/dnn.py:184: UserWarning: Your cuDNN version is more recent than Theano. If you encounter problems, try updating Theano or downgrading cuDNN to a version >= v5 and <= v7.
warnings.warn("Your cuDNN version is more recent than "
ERROR (theano.gpuarray): Could not initialize pygpu, support disabled
Traceback (most recent call last):
File "/home/.../anaconda2/envs/gru4rec/lib/python3.6/site-packages/theano/gpuarray/__init__.py", line 227, in <module>
use(config.device)
File "/home/.../anaconda2/envs/gru4rec/lib/python3.6/site-packages/theano/gpuarray/__init__.py", line 214, in use
init_dev(device, preallocate=preallocate)
File "/home/.../anaconda2/envs/gru4rec/lib/python3.6/site-packages/theano/gpuarray/__init__.py", line 117, in init_dev
context.cudnn_handle = dnn._make_handle(context)
File "/home/.../anaconda2/envs/gru4rec/lib/python3.6/site-packages/theano/gpuarray/dnn.py", line 130, in _make_handle
"This can be a sign of a too old driver.", err)
RuntimeError: ('Error creating cudnn handle. This can be a sign of a too old driver.', 1)
SET loss TO bpr-max (type: <class 'str'>)
SET final_act TO elu-0.5 (type: <class 'str'>)
SET hidden_act TO tanh (type: <class 'str'>)
SET layers TO [100] (type: <class 'list'>)
SET adapt TO adagrad (type: <class 'str'>)
SET n_epochs TO 10 (type: <class 'int'>)
SET batch_size TO 32 (type: <class 'int'>)
SET dropout_p_embed TO 0.0 (type: <class 'float'>)
SET dropout_p_hidden TO 0.0 (type: <class 'float'>)
SET learning_rate TO 0.2 (type: <class 'float'>)
SET momentum TO 0.3 (type: <class 'float'>)
SET n_sample TO 2048 (type: <class 'int'>)
SET sample_alpha TO 0.0 (type: <class 'float'>)
SET bpreg TO 1.0 (type: <class 'float'>)
SET constrained_embedding TO False (type: <class 'bool'>)

Loading training data...
Loading data from TAB separated file: examples/rsc15/processed/rsc15_train_tr.txt
Started training
The dataframe is not sorted by SessionId, sorting now
Data is sorted in 46.12
Traceback (most recent call last):
File "run.py", line 109, in <module>
gru.fit(data, sample_store=args.sample_store_size, store_type='gpu')
File "/home/../GRU4Rec/gru4rec.py", line 556, in fit
generate_samples = theano.function([], updates=updates_st)
File "/home/../anaconda2/envs/gru4rec/lib/python3.6/site-packages/theano/compile/function.py", line 317, in function
output_keys=output_keys)
File "/home/../anaconda2/envs/gru4rec/lib/python3.6/site-packages/theano/compile/pfunc.py", line 486, in pfunc
output_keys=output_keys)
File "/home/../anaconda2/envs/gru4rec/lib/python3.6/site-packages/theano/compile/function_module.py", line 1841, in orig_function
fn = m.create(defaults)
File "/home/../anaconda2/envs/gru4rec/lib/python3.6/site-packages/theano/compile/function_module.py", line 1715, in create
input_storage=input_storage_lists, storage_map=storage_map)
File "/home/../anaconda2/envs/gru4rec/lib/python3.6/site-packages/theano/gof/link.py", line 699, in make_thunk
storage_map=storage_map)[:3]
File "/home/../anaconda2/envs/gru4rec/lib/python3.6/site-packages/theano/gof/vm.py", line 1091, in make_all
impl=impl))
File "/home/../anaconda2/envs/gru4rec/lib/python3.6/site-packages/theano/gof/op.py", line 955, in make_thunk
no_recycling)
File "/home/../anaconda2/envs/gru4rec/lib/python3.6/site-packages/theano/gof/op.py", line 858, in make_c_thunk
output_storage=node_output_storage)
File "/home/../anaconda2/envs/gru4rec/lib/python3.6/site-packages/theano/gof/cc.py", line 1217, in make_thunk
keep_lock=keep_lock)
File "/home/../anaconda2/envs/gru4rec/lib/python3.6/site-packages/theano/gof/cc.py", line 1157, in compile
keep_lock=keep_lock)
File "/home/../anaconda2/envs/gru4rec/lib/python3.6/site-packages/theano/gof/cc.py", line 1641, in cthunk_factory
*(in_storage + out_storage + orphd))
RuntimeError: ('The following error happened while compiling the node', GpuBinarySearchSorted{context_name=None, dtype_int64=True}(GpuFromHost.0, GpuFromHost.0), '\n', 'GpuKernel_init error 3: nvrtcCompileProgram: NVRTC_ERROR_BUILTIN_OPERATION_FAILURE')

OS: Debian 4.9.110-3+deb9u4~deb8u1 (2018-08-24) x86_64 GNU/Linux
cudnn: 7.6
cuda: 9.2
theano: 1.0.4
pygpu: 0.7.6
libgpuarray: 0.7.6

Could anyone help?

Predict_next not defined for GRU

Hi,

First of all, thank you very much for posting this implementation - it's been very helpful to work through.

I see that there are predict_next methods defined for all of the baseline models - Pop, Item KNN etc - but not one for the actual GRU itself (it has a predict_next_batch method, but I'm getting a bit confused trying to understand the batch evaluation, and thought I would fall back to the simpler case).

I searched the repo for a predict_next function attached to the GRU, but could not find one. Would you mind posting one, or talking me through how I might implement it?

Thanks very much,
Josh

Applying a user_based matrix factorization to session_based scenario

I saw that you adapted the BPR-MF algorithm to the session-based case by treating each session as a new user during training. At prediction time, your paper says you average the item embeddings of all visited items of the test session and use that as the user feature vector. But it also says: "In other words we average the similarities of the feature vectors between a recommendable item and the items of the session so far". So I'm confused: after you compute that average item-embedding vector for the test session, do you use it to find similar items among the training items, or do you use it in place of the session's user embedding (like the P and Q matrices in the matrix factorization setting) and multiply it with each test item's embedding to get the session's score for each of those items?

In particular, could you please explain what this line does:
https://github.com/hidasib/GRU4Rec/blob/master/baselines.py#L416

I'm also confused about why the get_predictions method needs input_item_id in addition to session_id. Isn't it enough to have just the session_id (from which we know all of its items) and the list of items to get predictions for (predict_for_item_ids)?

Thanks

BPR loss implementation question

Hi, I have a question about the mini-batch sampled BPR loss. The code is as follows:

def bpr(self, yhat, M):
    return T.cast(T.mean(-T.log(T.nnet.sigmoid(gpu_diag_wide(yhat).dimshuffle((0, 'x'))-yhat))), theano.config.floatX)

Do I understand correctly that the score of the positive item is also treated as one of the negative scores? The positive score is not filtered out of the negatives, so the sigmoid of the zero difference between the positive score and itself is 0.5, and the positive item alone contributes -log(0.5) ≈ 0.69 to the loss.
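To make the concern concrete, a self-contained NumPy version of my reading of that loss (not the repo's Theano code):

import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def bpr_batch_loss(yhat):
    # yhat: (batch x batch) score matrix; the diagonal holds the positive
    # scores, and every column j acts as a negative for row i -- including j == i.
    pos = np.diag(yhat)[:, None]
    return np.mean(-np.log(sigmoid(pos - yhat)))

yhat = np.random.randn(4, 4)
# Each diagonal term contributes -log(sigmoid(0)) = log(2) ~= 0.693 to the mean:
print(bpr_batch_loss(yhat))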

clickstream session classification

So I have data in the following form, and I use a char-rnn/word-rnn to predict the next click and to classify whether the user will convert. The question is: what changes are needed so that, given the user's clicks, the model classifies whether the user is going to buy or defer? A quick, rough brain dump is sufficient (a sketch of one option follows the data below).

Session1,P19,P69,P71,P72,P24,Buy
Session2,P0,P6,P14,P10,P18,P32,P50,Defer
Session3,P7,P0,P26,P6,P33,Defer
Session4,P10,P6,P11,P12,Defer
Session5,P2,P10,P18,P32,Defer
Session6,P0,P10,P18,P32,P50,P37,P7,Buy
Session7,P10,P18,P32,P50,P37,Buy
Session8,P0,P33,P40,P7,P10,P18,Defer
Session9,P10,P18,P7,P0,P6,P33,P14,P5,P68,P32,P50,P37,Buy
Session10,P10,P7,P0,P18,P32,Defer
Session11,P10,P18,P32,P50,P37,Buy
Session12,P11,P7,P0,P10,P18,Defer
Session13,P0,P3,P39,P30,P26,P7,P36,P20,Defer
Session14,P0,P10,P18,P32,P50,P37,Buy
Session15,P6,P28,P26,P7,P8,Defer
Session16,P10,P7,P26,P5,P68,P82,P84,P37,Buy
Session17,P10,P18,P32,P50,P37,Buy
Session18,P7,P23,P34,P35,P37,Buy
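Since a rough brain dump was requested, here is one hedged sketch (my own, Keras-based, nothing from this repo): keep a GRU over the click sequence, but replace the per-step next-item softmax with a single sigmoid read from the final hidden state. The vocabulary size, sequence length, and layer sizes below are assumptions.

import numpy as np
from tensorflow.keras import layers, models

n_items = 100  # assumption: item vocabulary P0..P99, with index 0 reserved for padding
max_len = 20   # assumption: sessions padded/truncated to 20 clicks

model = models.Sequential([
    layers.Embedding(n_items + 1, 32, mask_zero=True),  # 0 = padding
    layers.GRU(100),                                    # session representation
    layers.Dense(1, activation='sigmoid'),              # P(Buy | session)
])
model.compile(optimizer='adam', loss='binary_crossentropy', metrics=['accuracy'])

# Toy stand-ins for the padded click sequences and Buy/Defer labels above:
x = np.random.randint(1, n_items + 1, size=(8, max_len))
y = np.random.randint(0, 2, size=(8, 1))
model.fit(x, y, epochs=1, verbose=0)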

Do errors backpropagate Through Time?

Hi @hidasib ,

I have a question about GRU4Rec training. Since the input length in a mini-batch is set to 1, does that mean errors are not backpropagated through time as in a standard RNN? If not, could you please help me out and explain the training procedure?

Thanks

Additional Negative Sampling: Conditional Statement Logic Error

if self.n_sample:
    ...
else:
    ...

if self.n_sample:
    ...

The conditional logic of these three branches seems unreasonable. Is the branch starting at line 566 redundant?

if self.n_sample:
                            if sample_pointer == generate_length:
                                generate_samples()
                                sample_pointer = 0
                            sample_pointer += 1

Also, I could not find the definition of the generate_samples function.
Thanks.

ValueError: Input dimension mis-match. (input[2].shape[0] = 2080, input[3].shape[0] = 32)

I disabled the custom GPU optimizations following the instructions in the README:

https://github.com/hidasib/GRU4Rec#executing-on-cpu

However, this triggers the following error:

  File "./models/theano/gru4rec\model\gru4rec.py", line 617, in fit
    cost = train_function(in_idx, y, len(iters), reset.reshape(len(reset), 1))
  File "A:\env\sess\lib\site-packages\theano\compile\function_module.py", line 917, in __call__
    storage_map=getattr(self.fn, 'storage_map', None))
  File "A:\env\sess\lib\site-packages\theano\gof\link.py", line 325, in raise_with_op
    reraise(exc_type, exc_value, exc_trace)
  File "A:\env\sess\lib\site-packages\six.py", line 702, in reraise
    raise value.with_traceback(tb)
  File "A:\env\sess\lib\site-packages\theano\compile\function_module.py", line 903, in __call__
    self.fn() if output_subset is None else\
ValueError: Input dimension mis-match. (input[2].shape[0] = 2080, input[3].shape[0] = 32)
Apply node that caused the error: Elemwise{Composite{(i0 + Switch(i1, i2, i3))}}[(0, 2)](TensorConstant{(1,) of 1e-24}, Elemwise{gt,no_inplace}.0, Sum{axis=[0], acc_dtype=float64}.0, Sum{axis=[1], acc_dtype=float64}.0)
Toposort index: 61
Inputs types: [TensorType(float64, (True,)), TensorType(bool, (True,)), TensorType(float64, vector), TensorType(float64, vector)]
Inputs shapes: [(1,), (1,), (2080,), (32,)]
Inputs strides: [(8,), (1,), (8,), (8,)]
Inputs values: [array([1.e-24]), array([False]), 'not shown', 'not shown']
Outputs clients: [[InplaceDimShuffle{x,0}(Elemwise{Composite{(i0 + Switch(i1, i2, i3))}}[(0, 2)].0), InplaceDimShuffle{0,x}(Elemwise{Composite{(i0 + Switch(i1, i2, i3))}}[(0, 2)].0), Elemwise{Log}[(0, 0)](Elemwise{Composite{(i0 + Switch(i1, i2, i3))}}[(0, 2)].0)]]

HINT: Re-running with most Theano optimization disabled could give you a back-trace of when this node was created. This can be done with by setting the Theano flag 'optimizer=fast_compile'. If that does not work, Theano optimizations can be disabled with 'optimizer=None'.
HINT: Use the Theano flag 'exception_verbosity=high' for a debugprint and storage map footprint of this apply node.

Process finished with exit code 1

How to save & load model?

Hi Balazs,

Thank you very much for sharing the great article and implementation. It’s extremely helpful.

I'm wondering if there's a way to save the model so that I can load it later. I tried pickle.dumps() and np.save(), and both failed. If it's possible, could you please let me know how to save and load the model?

Thanks
Peggy

Questions about the implementation of optimizers

  1. For example, in the function rmsprop(), suppose sample_idx is currently None: the shared variable acc is defined first, followed by acc_new, and then updates[acc] = acc_new is set. But the next time RMSprop() (not rmsprop()) is called for the same parameter, rmsprop(), which is called by RMSprop(), resets acc first and then computes acc_new. Shouldn't the shared variable acc be defined once, in __init__(), instead?
  2. Why is self.lmbd used in the parameter update, as in line 387 of gru4rec.py?
    updates[p] = p * np.float32(1.0 - self.learning_rate * self.lmbd) - np.float32(self.learning_rate) * g
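For what it's worth, my reading of that line (an assumption, not an authoritative answer) is that it is L2 regularization (weight decay) folded into the SGD step:

p_new = p * (1 - eta * lambda) - eta * g = p - eta * (g + lambda * p)

i.e. plain gradient descent on the loss plus the penalty (lambda / 2) * ||p||^2, with eta = learning_rate and lambda = lmbd.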

Code not working on a fresh install

The code doesn't run on a fresh install.

Steps followed:

  1. python preprocess.py (used python3; runs successfully)
  2. python run_rsc15.py (tried both python2 and python3; fails with both)

The output of the second command:

Training GRU4Rec with 100 hidden units
0: NaN error!

(Question) - How to use all items in a session for prediction?

I have a session (a sequence) for a user (user_id); the sequence is a list of items (e.g. [1, 2, 3]) and user_id is an integer.

I'd like to predict the next item of the session using all items in the session so far, not only the last item.

If I understand the documentation correctly, in order to predict the next item of the session so far, I need to call predict_next_batch successively with the items of the session and return the last prediction, as below:

    def _predict(self, sequence, user_id, item_ids=None):
        predictions = None
        # Feed the session one item at a time; since the session id stays the
        # same across calls, the hidden state carries over, so the final call's
        # scores reflect the whole sequence.
        for item in sequence:
            predictions = self.gru4rec.predict_next_batch([user_id], [item], item_ids, batch=1)
        return predictions.values[:, 0]

Is my understanding correct that the code above uses all the items in the sequence to produce the final predictions?

Finally, which do you think is better: using all items in the session, or only the last item?

Thanks for sharing the code.

predict_next_batch not considering other products in the same session

I am trying to see whether the model's predictions take the other items in the same session into account.

I tried batch_size 5 with different session ids [0,1,2,3,4] as well as with the same session id [0,0,0,0,0] for ItemIds [A1, A2, A3, A4, B1] (shoe, shoe, shoe, shoe, shirt). In both scenarios predict_next_batch gives the same prediction for B1. I assumed that if the session id is the same for the input products, the model would consider them all in sequential fashion and the prediction would be affected by the combination of items in the same session.
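For what it's worth, my current understanding of the API (an assumption, echoing the _predict loop in the earlier issue; gru is a fitted model): the positions of one batch are parallel, independent sessions, so to make earlier items influence the prediction, the items have to be fed across successive calls with the same session id:

import numpy as np

preds = None
for item in ['A1', 'A2', 'A3', 'A4', 'B1']:
    # Same session id on every call, so the hidden state carries over between
    # calls and the final scores reflect the whole click sequence.
    preds = gru.predict_next_batch(np.array([0]), np.array([item]), None, batch=1)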

Saving a model

Hi, if I want to save a model, are these the objects I should care about?
self.Wx, self.Wh, self.Wrz, self.Bh, self.H

thank you!
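In case it helps, a minimal sketch of persisting exactly those objects by value (my own code; it assumes each attribute is a list of Theano shared variables, as in gru4rec.py; note that H is the per-batch hidden state rather than a learned weight, so it usually does not need saving):

import pickle

PARAMS = ('Wx', 'Wh', 'Wrz', 'Bh')  # the learned weights listed above

def save_weights(gru, path):
    state = {name: [w.get_value() for w in getattr(gru, name)] for name in PARAMS}
    with open(path, 'wb') as f:
        pickle.dump(state, f)

def load_weights(gru, path):
    with open(path, 'rb') as f:
        state = pickle.load(f)
    for name, values in state.items():
        for w, v in zip(getattr(gru, name), values):
            w.set_value(v)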

Where is the data file ?

Can someone please mention which dataset is used and provide a link to it? Also, which file exactly should be run to get the desired output?

About training time

Sorry to disturb you. I've used a Titan Xp to run your code, but it needs four hours per epoch, so for the final results I would have to wait about two days or more. Do you know why? Thanks.

How long does it take to run this program?

Hello, I am a beginner in Python and neural networks. I am running this code on a 32-core CPU without a GPU.
"run_rsc15.py" has been running for almost 24 hours, so I'd like to know how long "python run_rsc15.py" usually takes. By the way, the server has 128 GB of memory.
Thank you very much!

Evaluating baselines

Hello @hidasib

Is there a way to run baselines.py and compute Recall/MRR from the terminal? Could you specify the command?

Thanks

Initial Click data

The proposed model works once a person has generated sufficient click data. For training and evaluation, the RecSys data was used.

Let's say one wants to deploy this method. How do we obtain the initial click data?
The customers will need "some level of similar products" recommended before they start clicking and generating data, right?
How do we handle this cold start of the entire model?

Model Update Bug of BPR baseline

In the update function of the BPR baseline, you forgot to update the bias parameters. That is important, since the defined relevance function includes this term. Please double-check your code; maybe this is why the performance of BPR shown in your paper is so poor.
