
Comments (2)

hidasib commented on June 24, 2024

The short answer is yes, but there are two things to keep in mind.

  1. In some cases user histories can be considered sequences, but in other cases not. This depends on the domain, but also on what type of events you use. For example, purchase histories in general e-commerce (e.g., electronics) rarely exhibit sequential patterns, because the purchases are not directly related (e.g. the phone I buy today is very loosely connected to the laptop I bought 6 months ago; those are two mostly unrelated problems from the user's perspective). On the other hand, if you take the item page view events of the same store, you'll probably see sequential patterns, because while I was looking for the laptop (or the phone) I went through a sequential process of finding the best one for me, looked at different options, changed my preferences slightly during the process, etc. Also, the events were close to each other in time, so it is more likely that the existing sequential behaviour can be observed. But this is not only about the timeframe; regularity can be another factor. If you have experience with the domain your data comes from, you can probably decide whether the observable user behaviour (given the event type, regularity, resolution, etc.) can be considered sequential. If you are unsure, one thing you can do is to separately run a hyperparameter optimization with GRU4Rec and "FFN4Rec" from here (which is basically the same algorithm, but with the GRU layer replaced with a feedforward network). If you get very similar final evaluation scores, then there is probably no sequentiality in your data.

  2. If your user histories are long, the BPTT version of GRU4Rec can probably give you better results. It is not in the repo at the moment. (The base version should still work.)

  • If your users' events can (and usually do) happen shortly after each other (e.g. item page views), you can sessionize user histories. For example, you can say that if more than 30-60 minutes pass between two subsequent events of the same user, you consider those to be different sessions. You can look at how we sessionize our data here (look for the *_preproc.py scripts).

  • Even if you use user histories, your train/test split should be time based. Unfortunately, the public version doesn't really support starting your prediction from a non-zero hidden state, which is something you'd probably want in this case (i.e. training up to time T and predicting events after time T for the same users, which requires moving the hidden state to time T).
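
To make the sessionization and time-based split ideas above concrete, here is a minimal sketch in plain Python. It is not taken from the repository's *_preproc.py scripts. The event layout `(user_id, item_id, timestamp)`, the function names `sessionize` and `time_based_split`, the 30-minute gap, and the choice to assign a session to train or test by its start time are all assumptions made for illustration.

```python
from collections import defaultdict

# Assumed threshold: the comment above suggests 30-60 minutes; 30 is used here.
SESSION_GAP_SECONDS = 30 * 60


def sessionize(events, gap=SESSION_GAP_SECONDS):
    """Assign a session id to each (user_id, item_id, timestamp) event.

    A new session starts whenever more than `gap` seconds pass between
    two consecutive events of the same user.
    """
    by_user = defaultdict(list)
    for user_id, item_id, ts in events:
        by_user[user_id].append((ts, item_id))

    rows = []
    session_id = 0
    for user_id in sorted(by_user):
        last_ts = None
        for ts, item_id in sorted(by_user[user_id]):  # chronological order
            if last_ts is None or ts - last_ts > gap:
                session_id += 1  # gap too large (or first event): new session
            rows.append((session_id, user_id, item_id, ts))
            last_ts = ts
    return rows


def time_based_split(rows, split_ts):
    """Put sessions that start before `split_ts` in train, the rest in test."""
    start = {}
    for session_id, _, _, ts in rows:
        start[session_id] = min(ts, start.get(session_id, ts))
    train = [r for r in rows if start[r[0]] < split_ts]
    test = [r for r in rows if start[r[0]] >= split_ts]
    return train, test


# Hypothetical toy data: timestamps are seconds since an arbitrary origin.
events = [
    ("u1", "laptop_a", 0),
    ("u1", "laptop_b", 600),
    ("u1", "phone_a", 10_000),
    ("u2", "laptop_a", 300),
    ("u2", "laptop_c", 900),
]
rows = sessionize(events)
train, test = time_based_split(rows, split_ts=5_000)
```

In this toy data, the first two events of `u1` fall into one session, while the third one, which happens hours later, starts a new session and ends up in the test set. Note that this sketch does not carry any hidden state from the training period into the test period, which is the limitation described above.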


halilergul1 commented on June 24, 2024

Thanks a lot for your detailed and helpful answer. My data is from the e-commerce domain. It is transactional data.
