contentwise / contentwise-impressions

This repository contains the code used to generate the data splits, run the hyperparameter tuning, and export the results presented in our article "ContentWise Impressions: An industrial dataset with impressions included".

Home Page: https://forms.office.com/Pages/ResponsePage.aspx?id=K3EXCvNtXUKAjjCd8ope6_zxBj9DRzpKnC4jkclZQupUQ0szOVhTQ1FCT0tZSEw1T1g0RzVBRVhSSC4u

License: GNU Affero General Public License v3.0

Python 81.30% Jupyter Notebook 1.66% Cython 17.04%
dataset collaborative-filtering python3 recommender-systems impressions interactions dask cython

contentwise-impressions's People

Contributors

fernandobperezm, mauriziofd

contentwise-impressions's Issues

Inquiry about recommendation id

Hi there,

Thanks for sharing the dataset. I have read through the corresponding paper and have a few questions that I hope you can clarify.

In the impressions-with-direct-link dataset, each row is associated with a recommendation id. Is this id the identifier of the recommendation algorithm used to generate the row, or is it purely a key for merging the impressions-with-direct-link dataset with the interactions dataset? If it is the latter, is there any way to distinguish impressions generated by different recommendation algorithms?

Thanks in advance for your help. :)

Best wishes,
Helen

Cannot read the dataset

I get an exception at line 48 when I run run_generate_splits.py to read the dataset. The exception is raised from the Apache Parquet reading code. I have pyarrow 0.17.1 installed. The stack trace is below:

File "contentwise-impressions-master/run_generate_splits.py", line 48, in
dataset.read_dataset()
File "contentwise-impressions-master\Utils\decorator.py", line 21, in timed
result = method(*args, **kwargs)
File "contentwise-impressions-master\Utils\dataset.py", line 229, in read_dataset
self._local_interactions_file_path)
File "contentwise-impressions-master\Utils\dataset.py", line 213, in _read_parquet
return ddf.read_parquet(path=local_file_path, engine="pyarrow")
File "C:\Program Files\Anaconda3\lib\site-packages\dask\dataframe\io\parquet.py", line 1154, in read_parquet
categories=categories, index=index, infer_divisions=infer_divisions)
File "C:\Program Files\Anaconda3\lib\site-packages\dask\dataframe\io\parquet.py", line 706, in _read_pyarrow
pf = piece.get_metadata(_open)
TypeError: get_metadata() takes 1 positional argument but 2 were given
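
The failing call, `piece.get_metadata(_open)`, sits inside dask's pyarrow reader, so one plausible reading (an assumption, not something confirmed in this thread) is that the installed dask release expects an older pyarrow API than 0.17.1 provides. A minimal sketch for collecting the relevant versions to attach to a report like this one; `installed_versions` is a hypothetical helper and is not part of the repository:

```python
from importlib import metadata


def installed_versions(packages=("dask", "pyarrow", "pandas")):
    """Return a {package: version} mapping; version is None if not installed."""
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            # The package is missing from this environment.
            versions[name] = None
    return versions


if __name__ == "__main__":
    for name, version in installed_versions().items():
        print(f"{name}: {version or 'not installed'}")
```

Comparing the printed versions against the ones pinned by the repository (if it pins any) would show whether the environment has drifted.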

Problem

Hello,
While using this dataset, I ran into the following question.
What is the situation in which an impressions_without_direct_link row is presented and the user interacts with an item that does not appear in the impressions?
Is it covered by rec_id=-1 in interactions.csv?

I hope to hear from you. Thank you very much!

Can I get Item content matrix?

Hello, Thanks for your amazing work!

I'm checking the baselines you provided before starting my research, and I found that 'ItemKNNRecommender.py' uses the item content matrix (ICM) to compute similarity. However, the paper and the published dataset don't contain item information other than the series length and the type of the episodes, and nothing in 'dataset.py' builds the ICM. My main questions are as follows.

  1. Are the Item KNN results reported in the paper based on the user rating matrix?
  2. If not, how can I get the item content matrix you used to produce the reported results?

Once again, thank you for providing a great dataset!

Validation and test splits are mixed up.

It looks like a bug in dataset.py:

        self.URM = {
            "train": train_split,
            "test": validation_split,
            "validation": test_split
        }

validation_split is stored under the key "test" and test_split is stored under the key "validation".
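
A minimal sketch of the mapping with each split stored under its matching key. `build_urm_splits` is a hypothetical helper used only for illustration; in dataset.py the dictionary is assigned directly to `self.URM`:

```python
def build_urm_splits(train_split, validation_split, test_split):
    """Return the split dictionary with every split under its own name."""
    return {
        "train": train_split,
        "validation": validation_split,
        "test": test_split,
    }
```

Any results produced with the swapped keys would need to be re-checked, since a model tuned on the "validation" entry would in fact have been tuned on the test split.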

Could We Use `*-direct-link` Data to Reconstruct the Recommendation Grid?

Hi ContentWise team,

Thanks for this great work! I have a question about reconstructing the grid of recommended items.

Since the impressions-*direct-link datasets contain the row positions of the recommended series lists, I wonder whether it is possible to reconstruct the grid of recommended items that a user viewed.

For example,

  1. Find [user1, rec1, row=1] from impressions-direct-link dataset (user identifier is retrieved from interactions);
  2. Find [user1, row={2,3,4,5}] from impressions-none-direct-link dataset;
  3. Combine them together as a 5-row grid of recommendations where user1 interacted with the first row only.

I look forward to hearing your suggestions about this idea. Is it possible, given that we don't know the timestamps in impressions-none-direct-link?
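
The three steps above can be sketched as follows. This is only an illustration of the proposed join, not a confirmed method: the field names (`user_id`, `recommendation_id`, `row_position`, `recommended_series_list`) and the use of -1 for "no recommendation" are assumptions about the dataset schema, and `reconstruct_grids` is a hypothetical helper. Because the rows without a direct link carry no timestamp, nothing in this sketch guarantees that the combined rows were shown on the same screen.

```python
from collections import defaultdict


def reconstruct_grids(interactions, direct_impressions, non_direct_impressions):
    """Group impression rows per user and flag rows the user interacted with."""
    # Step 1: recover the user behind each recommendation id from interactions.
    # Assumption: recommendation_id == -1 means "not from a recommendation".
    user_of_rec = {}
    for row in interactions:
        if row["recommendation_id"] != -1:
            user_of_rec[row["recommendation_id"]] = row["user_id"]

    grids = defaultdict(dict)
    for imp in direct_impressions:
        user = user_of_rec.get(imp["recommendation_id"])
        if user is not None:
            grids[user][imp["row_position"]] = {
                "series": imp["recommended_series_list"],
                "interacted": True,
            }

    # Step 2: rows without a direct link are assumed to be keyed by user.
    for imp in non_direct_impressions:
        grids[imp["user_id"]].setdefault(
            imp["row_position"],
            {"series": imp["recommended_series_list"], "interacted": False},
        )

    # Step 3: order each user's rows by position to form the grid.
    return {user: dict(sorted(rows.items())) for user, rows in grids.items()}
```

If a user has several direct-link impressions at the same row position, the later one overwrites the earlier one here, which is another reason to treat the result as an approximation.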
