Implementation of the paper "Deep Matrix Factorization Models for Recommender Systems"

Python 100.00%

deep_matrix_factorization_models's Issues

Official code for the paper

Do you know if there is any official code for the paper? If you have the link, could you please share it?

Problems with how the test set is generated & problems with item IDs

def getTestNeg(self, testData, negNum):
    user = []
    item = []
    # testData is a list of (user, item, rate) triples
    for s in testData:  # for each positive example in the test set
        # tmp_user and tmp_item hold the test samples for the current user
        tmp_user = []
        tmp_item = []
        u = s[0]
        i = s[1]
        tmp_user.append(u)
        tmp_item.append(i)
        for t in range(negNum):  # generate negNum negatives for each positive
            j = np.random.randint(self.shape[1])  # shape[1] is the number of items, so this draws an arbitrary item
            while (u, j) in self.trainDict:  # redraw if the user-item pair is already in the training set
                j = np.random.randint(self.shape[1])
            tmp_user.append(u)
            tmp_item.append(j)
        # user and item hold the test samples for all users
        user.append(tmp_user)
        item.append(tmp_item)
    return [np.array(user), np.array(item)]

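At test time, predictions are computed for only 100 items per user. When negatives are generated here, only pairs that appear in trainDict are excluded. If a randomly drawn user-item pair happens to be a positive example in the test set, the same pair ends up in the test set as both a positive and a negative, which inflates the results when evaluating with HR and NDCG.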
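One way to avoid the overlap described above is to reject candidates that appear in either the training set or the test set. The following is only a minimal sketch, not the repository's code: the function name sample_test_negatives, the train_pairs set, and the seed parameter are hypothetical, it returns plain lists rather than NumPy arrays, and it also rejects repeated negatives within a row, which the original does not do. It assumes each user has at least neg_num eligible items; otherwise the loop would not terminate.

```python
import random


def sample_test_negatives(test_data, train_pairs, num_items, neg_num, seed=0):
    """Sketch: draw neg_num negatives per test positive.

    test_data   -- list of (user, item, rate) triples (the positives)
    train_pairs -- set of (user, item) pairs seen in training (hypothetical name)
    """
    rng = random.Random(seed)
    # Every (user, item) pair that is a positive somewhere in the test set.
    test_pairs = {(u, i) for u, i, _ in test_data}
    users, items = [], []
    for u, i, _ in test_data:
        row_items = [i]  # the positive stays at index 0, as in getTestNeg
        seen = {i}
        while len(row_items) < neg_num + 1:
            j = rng.randrange(num_items)
            # Reject training pairs, test positives, and repeated draws.
            if (u, j) in train_pairs or (u, j) in test_pairs or j in seen:
                continue
            seen.add(j)
            row_items.append(j)
        users.append([u] * (neg_num + 1))
        items.append(row_items)
    return users, items


example_test = [(0, 1, 5.0), (1, 2, 4.0)]
example_train = {(0, 0), (1, 0)}
example_users, example_items = sample_test_negatives(
    example_test, example_train, num_items=10, neg_num=3
)
```

Keeping the positive at index 0 mirrors the layout of getTestNeg, so downstream evaluation code would not need to change.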

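Also, getData should map the discrete IDs to consecutive indices. The ML-1M dataset has 3706 movies, but the movie IDs run from a minimum of 1 to a maximum of 3952, so they are not contiguous.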
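The remapping suggested above could look like the following sketch. The helper name build_index and the sample IDs are hypothetical; the idea is simply to assign each raw ID the next free index in order of first appearance.

```python
def build_index(raw_ids):
    """Map each distinct raw ID to a consecutive index starting at 0."""
    index = {}
    for raw in raw_ids:
        if raw not in index:
            index[raw] = len(index)
    return index


# Illustrative raw movie IDs with gaps, in the spirit of ML-1M.
raw_movie_ids = [1, 3952, 7, 1, 7]
movie_index = build_index(raw_movie_ids)
remapped = [movie_index[raw] for raw in raw_movie_ids]
```

With such a mapping, the item dimension of the rating matrix equals the number of distinct movies rather than the largest raw ID.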

Incomplete trainset data

In DataSet.py, the function "getTrainTest" appears to drop the last user and that user's ratings from the training set, if I understand correctly.
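For comparison, here is a sketch of a leave-one-out split that treats the last user the same as every other user. It is not the code from DataSet.py: the function name leave_one_out_split and the record layout (user, item, rate, timestamp) are assumptions made for illustration.

```python
from itertools import groupby


def leave_one_out_split(ratings):
    """Hold out each user's latest rating for testing; train on the rest.

    ratings -- iterable of (user, item, rate, timestamp) tuples (assumed layout)
    """
    ordered = sorted(ratings, key=lambda r: (r[0], r[3]))
    train, test = [], []
    # groupby yields one group per user, including the final one.
    for _, group in groupby(ordered, key=lambda r: r[0]):
        rows = list(group)
        train.extend((u, i, r) for u, i, r, _ in rows[:-1])
        u, i, r, _ = rows[-1]
        test.append((u, i, r))
    return train, test


sample_ratings = [
    (0, 10, 4.0, 100), (0, 11, 5.0, 200),
    (1, 10, 3.0, 150), (1, 12, 2.0, 50), (1, 13, 1.0, 300),
]
sample_train, sample_test = leave_one_out_split(sample_ratings)
```

Grouping by user avoids comparing each row with the next one, which is where an off-by-one at the end of the data typically creeps in.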

How to handle a huge matrix (>2 GB)

Problem:
When I run an experiment on the Pinterest-20 dataset, TensorFlow raises "ValueError: Cannot create a tensor proto whose content is larger than 2GB.".
The error occurs at the line "self.user_item_embedding = tf.convert_to_tensor(self.dataSet.getEmbedding())".

For now, I work around it by using tf.Variable.
See: https://blog.csdn.net/fjssharpsword/article/details/96431553

The code change is:
def add_embedding_matrix(self):
    self.matrix_init = tf.placeholder(tf.float32, shape=(self.shape[0], self.shape[1]))
    matrix = tf.Variable(self.matrix_init)
    self.user_item_embedding = tf.convert_to_tensor(matrix)
    #self.user_item_embedding = tf.convert_to_tensor(self.dataSet.getEmbedding())
    self.item_user_embedding = tf.transpose(self.user_item_embedding)

def init_sess(self):
    self.config = tf.ConfigProto()
    self.config.gpu_options.allow_growth = True
    self.config.allow_soft_placement = True
    self.sess = tf.Session(config=self.config)
    #self.sess.run(tf.global_variables_initializer())
    self.sess.run(tf.global_variables_initializer(), feed_dict={self.matrix_init: self.dataSet.getEmbedding()})

Does this work? I am not sure. Can you suggest other effective solutions? Thanks!
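For context, a rough size check shows when a dense float32 user-item matrix would cross the 2 GB limit named in the error message. This is only a back-of-the-envelope sketch: the helper name dense_float32_bytes is hypothetical, and the shapes below are illustrative assumptions, not measurements of Pinterest-20.

```python
PROTO_LIMIT_BYTES = 2 ** 31  # the 2 GB limit named in the error message


def dense_float32_bytes(num_users, num_items):
    """Bytes needed to store a dense num_users x num_items float32 matrix."""
    return num_users * num_items * 4


# Illustrative shapes only (assumed, not taken from a real dataset).
small_matrix = dense_float32_bytes(6000, 3706)
large_matrix = dense_float32_bytes(60000, 10000)

small_fits = small_matrix < PROTO_LIMIT_BYTES
large_fits = large_matrix < PROTO_LIMIT_BYTES
```

A matrix above the limit cannot be embedded in the graph as a constant, which is why feeding it through a placeholder at initialization time, as in the change above, sidesteps the error.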

When the number of latent factors is 8 or 16, HR and NDCG are both 1.0

Problem:
When I set the number of latent factors to 8 or 16, the reported HR and NDCG are both 1.0, which is not plausible.
parser.add_argument('-userLayer', action='store', dest='userLayer', default=[512, 8])
parser.add_argument('-itemLayer', action='store', dest='itemLayer', default=[1024, 8])

While debugging, I found that self.y_ is always 1e-6.
self.y_ = tf.reduce_sum(tf.multiply(user_out, item_out), axis=1, keepdims=False) / (norm_item_output* norm_user_output)
self.y_ = tf.maximum(1e-6, self.y_)

In your paper, HR and NDCG look normal when the number of latent factors is 8 or 16. Can you suggest a solution? Thanks!
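One possible explanation for the perfect scores is tie-breaking rather than genuine accuracy. If every prediction is clamped to the same value, a stable top-K selection returns candidates in their original order, and getTestNeg places the positive item first. The sketch below illustrates this under an assumption: that evaluation ranks candidates with a stable top-K such as heapq.nlargest. The function name hit_ratio_and_ndcg is hypothetical and is not taken from the repository.

```python
import heapq
import math


def hit_ratio_and_ndcg(scores, top_k=10):
    """scores[0] belongs to the positive item; the rest are sampled negatives."""
    # heapq.nlargest is stable: among equal scores, earlier candidates win.
    ranked = heapq.nlargest(top_k, range(len(scores)), key=lambda idx: scores[idx])
    if 0 not in ranked:
        return 0.0, 0.0
    position = ranked.index(0)
    return 1.0, math.log(2) / math.log(position + 2)


# All 100 candidates share the clamped score, so the positive at index 0 ranks first.
tied_hr, tied_ndcg = hit_ratio_and_ndcg([1e-6] * 100)

# When the positive scores lowest, it falls out of the top 10.
worst_hr, worst_ndcg = hit_ratio_and_ndcg([0.0] + [1.0] * 99)
```

If this is what is happening, shuffling the candidate order before ranking, or checking whether the scores are all identical, would expose the degenerate case instead of reporting HR and NDCG of 1.0.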


Judging from these results, it does not seem as good as in the paper

About reproducing the results

Have you reached the hit ratio in the paper?
