Comments (2)
The short answer is yes, but there are two things to keep in mind.
-
In some cases user histories can be considered as sequences, but in other cases not. This depends on the domain, but also on what type of events you use. For example, purchase histories in general e-commerce (e.g., electronics) rarely exhibit sequential patterns, because the purchases are not directly related (e.g. the phone I buy today is only loosely connected to the laptop I bought 6 months ago; those are two mostly unrelated problems from the user's perspective). On the other hand, if you take the item page view events of the same store, you'll probably see sequential patterns, because while I was looking for the laptop (or the phone) I went through a sequential process of finding the best one for me: I looked at different options, changed my preferences slightly along the way, etc. Also, the events were close to each other in time, so it is more likely that existing sequential behaviour can be observed. But this is not only about the timeframe; regularity can be another factor. If you have experience with the domain your data comes from, you can probably decide whether the observable user behaviour (given the event type, regularity, resolution, etc.) can be considered sequential. If you are unsure, one thing you can do is to separately run a hyperparameter optimization with GRU4Rec and "FFN4Rec" from here (which is basically the same algorithm, but with the GRU layer replaced with a feedforward network). If you get very similar final evaluation scores, then there is probably no sequentiality in your data.
-
If your user histories are long, the BPTT version of GRU4Rec can probably give you better results. It is not in the repo at the moment. (The base version should still work.)
-
If your users' events can (and usually do) happen shortly after each other (e.g. item page views), you can sessionize user histories. For example, you can say that if more than 30-60 minutes pass between two subsequent events of the same user, you consider those to be different sessions. You can look at how we sessionize our data here (look for the *_preproc.py scripts).
-
Even if you use user histories, your train/test split should be time-based. Unfortunately, the public version doesn't really support starting your prediction from a non-zero hidden state, which is something you'd probably want in this case (i.e. training up to time T and predicting events after time T for the same users requires advancing the hidden state to time T).
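To make the sessionization and the time-based split above more concrete, here is a minimal sketch in plain Python. It is not the code from the *_preproc.py scripts. The event layout `(user_id, item_id, timestamp)`, the function names `sessionize` and `time_based_split`, the 30-minute default gap, and the rule of assigning a whole session to train or test by its start time are all assumptions made for illustration.

```python
# Hypothetical helpers for illustration; not part of the GRU4Rec repo.
SESSION_GAP_SECONDS = 30 * 60  # assumed threshold; the text above suggests 30-60 minutes


def sessionize(events, gap_seconds=SESSION_GAP_SECONDS):
    """Assign a session id to each (user_id, item_id, timestamp) event.

    A new session starts when the user changes, or when more than
    `gap_seconds` pass between two consecutive events of the same user.
    Returns (session_id, user_id, item_id, timestamp) tuples sorted by
    user and then by time.
    """
    ordered = sorted(events, key=lambda e: (e[0], e[2]))
    result = []
    session_id = -1
    prev_user, prev_ts = None, None
    for user_id, item_id, ts in ordered:
        # The user check comes first, so prev_ts is never None in the subtraction.
        if user_id != prev_user or ts - prev_ts > gap_seconds:
            session_id += 1
        result.append((session_id, user_id, item_id, ts))
        prev_user, prev_ts = user_id, ts
    return result


def time_based_split(session_events, split_time):
    """Split sessionized events by the start time of each session.

    Sessions starting before `split_time` go to train, the rest to test.
    This is one possible convention (an assumption), chosen so that no
    session is cut in half.
    """
    start = {}
    for sid, _, _, ts in session_events:
        start[sid] = min(ts, start.get(sid, ts))
    train = [e for e in session_events if start[e[0]] < split_time]
    test = [e for e in session_events if start[e[0]] >= split_time]
    return train, test


# Toy data: timestamps are seconds since an arbitrary origin.
events = [
    ("u1", "laptop_a", 0),
    ("u1", "laptop_b", 600),    # 10 minutes later: same session
    ("u1", "phone_a", 7800),    # 2 hours later: new session
    ("u2", "laptop_a", 100),
]
sessions = sessionize(events)
train, test = time_based_split(sessions, split_time=5000)
```

With this kind of split, each test session is scored on its own, so prediction starts from a zero hidden state, which is the limitation mentioned above.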
from gru4rec.
Thanks a lot for your detailed and helpful answer. My data is from the e-commerce domain; it is transactional data.