kgcn's Introduction

KGCN

This repository is the implementation of KGCN (arXiv):

Knowledge Graph Convolutional Networks for Recommender Systems
Hongwei Wang, Miao Zhao, Xing Xie, Wenjie Li, Minyi Guo.
In Proceedings of The 2019 Web Conference (WWW 2019)

KGCN (Knowledge Graph Convolutional Networks) is a recommender-system model that uses graph convolutional networks (GCN) to process knowledge graphs for the purpose of recommendation.

Files in the folder

  • data/
    • movie/
      • item_index2entity_id.txt: the mapping from item indices in the raw rating file to entity IDs in the KG;
      • kg.txt: knowledge graph file;
    • music/
      • item_index2entity_id.txt: the mapping from item indices in the raw rating file to entity IDs in the KG;
      • kg.txt: knowledge graph file;
      • user_artists.dat: raw rating file of Last.FM;
  • src/: implementations of KGCN.

Running the code

  • Movie
    (The raw rating file of MovieLens-20M is too large to be contained in this repository. Download the dataset first.)
    $ wget http://files.grouplens.org/datasets/movielens/ml-20m.zip
    $ unzip ml-20m.zip
    $ mv ml-20m/ratings.csv data/movie/
    $ cd src
    $ python preprocess.py -d movie
    
  • Music
    • $ cd src
      $ python preprocess.py -d music
    • open the src/main.py file;
    • comment out the code blocks of parameter settings for MovieLens-20M;
    • uncomment the code blocks of parameter settings for Last.FM;
    • $ python main.py

kgcn's People

Contributors

hwwang55

kgcn's Issues

Calculation of AUC metric

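import numpy as np

# get_feed_dict is defined elsewhere in the repository.
def ctr_eval(sess, model, data, batch_size):
    start = 0
    auc_list = []
    f1_list = []
    # Evaluate full batches only; a trailing partial batch is skipped.
    while start + batch_size <= data.shape[0]:
        auc, f1 = model.eval(sess, get_feed_dict(model, data, start, start + batch_size))
        auc_list.append(auc)
        f1_list.append(f1)
        start += batch_size
    # The reported metrics are averages of the per-batch values.
    return float(np.mean(auc_list)), float(np.mean(f1_list))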

This function computes AUC for each batch and takes the average as the final AUC. As far as I know, though, AUC requires a global ranking over the whole test set. Could you explain this choice?
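The concern can be illustrated with a small sketch. It assumes AUC is defined as the fraction of (positive, negative) pairs that are ranked correctly, with ties counting half. The `auc` helper and the toy labels and scores are hypothetical and are not part of the repository.

```python
def auc(labels, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    # A correctly ordered pair counts 1, a tie counts 0.5.
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# Toy data: two batches of four examples each (hypothetical values).
labels = [1, 0, 1, 0, 1, 0, 1, 0]
scores = [0.9, 0.8, 0.7, 0.6, 0.5, 0.4, 0.3, 0.2]
batch_size = 4

batch_aucs = [
    auc(labels[i:i + batch_size], scores[i:i + batch_size])
    for i in range(0, len(labels), batch_size)
]
mean_batch_auc = sum(batch_aucs) / len(batch_aucs)  # 0.75
global_auc = auc(labels, scores)                    # 0.625
```

On this toy data the per-batch average is 0.75 while the AUC over all eight examples is 0.625, because pairs that span two batches are never compared in the per-batch version. How large the gap is on real data depends on the batch size and on how scores are distributed across batches.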

Book Data

Hi, Mr. Wang,

I noticed that main.py contains a block of code for running the program on the Book-Crossing dataset, which is also mentioned in the paper. However, some of the data required to run it seems to be missing. Although I downloaded the Book-Crossing dataset linked in a previous issue, I could not find the other files, such as item_index2entity_id.txt and kg.txt, that are present for both the music and movie datasets.

Would it be possible for you to share these files with me, please? Your assistance would be greatly appreciated.

Thank you and best regards,
Luthfi.

Where is the book dataset?

Hello Mr. or Ms. Wang,
Why can't I find the book dataset in the data folder? Also, how did you obtain the data and process it into a usable dataset? I would like to try building a new one.

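Question about prediction

Hello. While writing a prediction module, I found that the amount of data fed at prediction time is tied to the batch_size used to train the model: the number of examples fed in must equal the training batch_size, otherwise no predictions are produced. What causes this?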
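One plausible cause, stated here as an assumption because the model code is not quoted in this issue, is that the graph reshapes tensors using a fixed batch_size, so any feed with a different number of rows fails. Under that assumption, a common workaround is to pad the final batch up to batch_size and discard the padded predictions. In the sketch below, `predict_in_fixed_batches` and `fake_predict` are hypothetical names; `fake_predict` only stands in for a trained model that accepts exactly batch_size rows.

```python
import numpy as np

def predict_in_fixed_batches(predict_fn, inputs, batch_size):
    """Run predict_fn on exactly batch_size rows at a time, padding the tail."""
    n = inputs.shape[0]
    outputs = []
    for start in range(0, n, batch_size):
        batch = inputs[start:start + batch_size]
        n_real = batch.shape[0]
        if n_real < batch_size:
            # Repeat the last row so the batch has the shape the graph expects.
            pad = np.repeat(batch[-1:], batch_size - n_real, axis=0)
            batch = np.concatenate([batch, pad], axis=0)
        # Keep only the predictions that belong to real rows.
        outputs.append(predict_fn(batch)[:n_real])
    return np.concatenate(outputs, axis=0)

# Hypothetical stand-in for the trained model: it rejects any other batch size.
def fake_predict(batch, expected=4):
    assert batch.shape[0] == expected
    return batch.sum(axis=1)

pairs = np.arange(20).reshape(10, 2)  # 10 (user, item) rows, batch size 4
scores = predict_in_fixed_batches(fake_predict, pairs, batch_size=4)
```

The result has one prediction per input row, even though 10 is not a multiple of 4. Padding wastes a little computation on the last batch but leaves the trained graph untouched.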

Something I don't understand

comment the code blocks of parameter settings for MovieLens-20M;

uncomment the code blocks of parameter settings for Last.FM;

What do these two sentences mean?

Request for the rating.csv file for movie data set

Hi Dr. Wang,

Thanks for sharing the code and datasets for the paper. The movie dataset seems to be missing one file. Could you please also share the ratings.csv file for the movie data? Thanks a lot for your help!

Best,
Duna

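kg.txt numbering

When building kg.txt from my own data, with head entities representing users and tail entities representing items, must user IDs and item IDs be disjoint? Otherwise, is there no way to tell them apart?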

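This step seems computationally expensive

self.l2_loss = tf.nn.l2_loss(self.user_emb_matrix) + tf.nn.l2_loss(
    self.entity_emb_matrix) + tf.nn.l2_loss(self.relation_emb_matrix)

Does the author have any suggestions for optimizing or replacing it?

For example, my embedding matrix has shape [100000000, 32].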
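One alternative sometimes used for very large embedding tables is to regularize only the rows that are looked up in the current batch, rather than the whole matrix. The NumPy sketch below illustrates the difference in cost; it is not the repository's method, and it changes the objective, since rows that are never sampled are no longer penalized. The array sizes and `batch_entity_ids` are hypothetical.

```python
import numpy as np

def l2_loss(t):
    """Same definition as tf.nn.l2_loss: sum(t ** 2) / 2."""
    return np.sum(np.square(t)) / 2.0

rng = np.random.default_rng(0)
entity_emb_matrix = rng.normal(size=(1000, 32))  # stand-in for a much larger table
batch_entity_ids = np.array([3, 17, 17, 256])    # hypothetical IDs used in one batch

# Full-matrix penalty: cost grows with the number of rows in the table.
full_loss = l2_loss(entity_emb_matrix)

# Batch-only penalty: cost grows with the batch size instead.
batch_loss = l2_loss(entity_emb_matrix[batch_entity_ids])
```

With a [100000000, 32] table, the full-matrix version touches 3.2 billion values per step, while the batch-only version touches only the rows gathered for that step.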

PyTorch implementation

Hello, Professor Wang. Do you plan to implement KGCN in PyTorch in the future?

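Question about metrics

epoch 29 train auc: 0.9121 f1: 0.8071 eval auc: 0.6914 f1: 0.6629 test auc: 0.7041 f1: 0.6715
(1) Which stage do the metrics reported in the paper come from: training, evaluation, or test?
(2) The results above are from a run on my own data. Do they indicate a problem?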

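What does each field in kg.txt mean?

Hello, what does each field in the music kg.txt file mean? For example: 2086 music.artist.origin 3846.
In the code these correspond to head_old, relation_old, and tail_old, which I find confusing. Could you give a plain explanation?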
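Read as a knowledge-graph triple, the example line would mean (head entity 2086, relation music.artist.origin, tail entity 3846), and the `_old` suffix suggests raw IDs that are later remapped to contiguous indices. The sketch below shows that remapping under those assumptions; `read_triples` is a hypothetical helper, fields are assumed to be whitespace-separated, and the repository's own preprocessing may assign indices differently (for example, by numbering item entities first).

```python
def read_triples(lines):
    """Map raw (head, relation, tail) IDs to contiguous integer indices."""
    entity_id2index = {}
    relation_id2index = {}
    triples = []
    for line in lines:
        head_old, relation_old, tail_old = line.split()
        # setdefault assigns the next free index the first time an ID is seen.
        head = entity_id2index.setdefault(head_old, len(entity_id2index))
        tail = entity_id2index.setdefault(tail_old, len(entity_id2index))
        relation = relation_id2index.setdefault(relation_old, len(relation_id2index))
        triples.append((head, relation, tail))
    return triples, entity_id2index, relation_id2index

# The first line is the example from the question; the second is made up.
sample = ["2086 music.artist.origin 3846", "3846 music.artist.origin 17"]
triples, entities, relations = read_triples(sample)
```

Here entity 3846 appears once as a tail and once as a head and receives the same index both times, which is what lets the two triples connect in the graph.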

question about computing user-relation score

Hi, Wang! Should we use tf.reduce_sum instead of tf.reduce_mean to compute user_relation_scores, since the paper describes it as an inner product operation? Thanks in advance.
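The two reductions differ only by a constant factor: the mean over the embedding axis equals the inner product divided by the embedding dimension. The NumPy sketch below checks this, assuming the scores are afterwards normalized with a softmax over neighbors as the paper describes. The shapes and random values are hypothetical.

```python
import numpy as np

def softmax(x):
    e = np.exp(x - x.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

rng = np.random.default_rng(0)
dim, n_neighbor = 8, 4
user = rng.normal(size=(dim,))
relations = rng.normal(size=(n_neighbor, dim))

score_sum = (user * relations).sum(axis=-1)    # inner product
score_mean = (user * relations).mean(axis=-1)  # inner product divided by dim

# After softmax, the mean version acts like a temperature of dim.
weights_sum = softmax(score_sum)
weights_mean = softmax(score_mean)
```

Both versions rank the neighbors identically, but the mean version yields flatter attention weights, so the choice affects how sharply the model focuses on the highest-scoring relation.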

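KG data processing

Hello, KGCN is quite novel compared with other methods. When I build the KG, processing is fast for small amounts of data, but the processing time multiplies as the data grows. Have you looked into optimizing KG construction?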

Request of Book-Crossing Dataset

Dear Dr. Wang,

Thanks for your wonderful work and clear implementation. I wonder whether you could provide a copy of your processed Book-Crossing dataset with knowledge graph as described in the paper. My e-mail address is [email protected]. Thanks again.

Best

Where does kg.txt come from?

Hi, I am learning about knowledge-graph-based recommender systems, and your work is great!
However, I am having trouble getting the dataset. I know where to download the MovieLens dataset, but I don't know where the kg.txt file comes from. I can't access the website https://www.satori.com/. Where did you get the Satori knowledge graph dataset?

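Question about the improvement of KGNN_LS over KGCN

The KGNN_LS loss requires an interaction_hashtable, which means reading in the entire training set. If the training set is large, this is very resource-intensive, because the full dataset has to be loaded into TensorFlow memory. How much improvement does adding this loss bring?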

About dependency

Hi, could you specify the dependency version (e.g. python3.6 or tf 1.4 etc.) in the repo?

item_index2entity_id.txt

Hello Mr. Wang, I'm trying to reproduce your model on a new dataset of jurisprudence documents. My only question is how the item_index2entity_id.txt file was generated. Are the item IDs the items that appear in the knowledge graph only as tail entities?

Or are they the items that appear in the knowledge graph as both head and tail entities? For example, if a movie item is a head entity in triple A and a tail entity in another triple B, is that the criterion for including it in item_index2entity_id.txt, or is it enough that an item appears as a tail entity, as you detailed in your MKR paper?

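Some questions about aggregate

Hello, how should I understand this part of the aggregate method? Is the entity_vectors[0] returned at the end related only to the first iteration (hop=0)? And are the aggregator inputs in each iteration independent of one another?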

Is KGCN for recommender systems inductive in nature?

Hi,

I have been reading the paper, and one thing I find missing is a discussion of whether the proposed algorithm is inductive.

So my question is: is the proposed architecture inductive, i.e., can it generalize to new users without retraining?

Thanks
Sachin

Creation of kg.txt

Hello Wang,
How do we create the knowledge graph file (kg.txt) if we want to train this network on the MovieLens 100K dataset?
