deep_matrix_factorization_models's People

Contributors

ruidongz

deep_matrix_factorization_models's Issues

Problems with how the test set is generated, and with item IDs

def getTestNeg(self, testData, negNum):
    user = []
    item = []
    # testData is a list of (user, item, rate) triples
    for s in testData:  # for every positive example in the test set
        # tmp_user and tmp_item hold the test samples for the current user
        tmp_user = []
        tmp_item = []
        u = s[0]
        i = s[1]
        tmp_user.append(u)
        tmp_item.append(i)
        for t in range(negNum):  # generate negNum negatives for each positive
            j = np.random.randint(self.shape[1])  # shape[1] is the number of items, so this picks an arbitrary item
            while (u, j) in self.trainDict:  # redraw if the user-item pair is already in the training set
                j = np.random.randint(self.shape[1])
            tmp_user.append(u)
            tmp_item.append(j)
        # user and item hold the test samples for all users
        user.append(tmp_user)
        item.append(tmp_item)
    return [np.array(user), np.array(item)]

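At test time, only 100 items per user are scored. When negatives are generated here, only pairs that appear in trainDict are excluded. If a randomly drawn user-item pair is itself a positive in the test set, the same pair ends up in the test set as both a positive and a negative, and evaluating with HR and NDCG then makes the results look better than they really are.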
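One way to avoid this overlap is to exclude the held-out test pairs as well as the training pairs when drawing negatives. The following is a minimal standalone sketch, assuming `train_pairs` is a set of (user, item) tuples and `test_data` is a list of (user, item, rate) triples. The function name `sample_test_negatives` and the `rng` argument are hypothetical and are not part of the repository.

```python
import numpy as np


def sample_test_negatives(test_data, train_pairs, num_items, neg_num, rng=None):
    """Build one candidate list per test positive: the positive, then neg_num negatives.

    Hypothetical helper, not the repository's getTestNeg. A negative is rejected
    if it is a training positive, a test positive, or already drawn for this list.
    Assumes every user has at least neg_num eligible items; otherwise the loop
    would never terminate.
    """
    rng = np.random.default_rng() if rng is None else rng
    # Collect all test positives so a held-out item is never drawn as a negative.
    test_pairs = {(u, i) for u, i, _ in test_data}

    users, items = [], []
    for u, i, _ in test_data:
        negatives = []
        while len(negatives) < neg_num:
            j = int(rng.integers(num_items))
            if (u, j) in train_pairs or (u, j) in test_pairs or j in negatives:
                continue  # known positive or duplicate: draw again
            negatives.append(j)
        users.append([u] * (neg_num + 1))
        items.append([i] + negatives)
    return np.array(users), np.array(items)
```

As in the original function, the positive item stays at index 0 of each row, so downstream evaluation code that relies on that position would not need to change.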

In addition, getData should map the raw IDs to consecutive indices. The ML-1M dataset has 3706 movies, but the movie IDs range from 1 to 3952, so they are not contiguous.
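A minimal sketch of such a remapping, assuming the raw ratings are available as (user, item, rate) tuples. `build_index` and `reindex` are hypothetical helpers, not functions from the repository, and the toy IDs are illustrative only.

```python
def build_index(raw_ids):
    """Map each distinct raw ID to a consecutive index 0..n-1 (in sorted order)."""
    return {raw: idx for idx, raw in enumerate(sorted(set(raw_ids)))}


def reindex(ratings):
    """Return re-indexed (user, item, rate) triples plus both lookup tables."""
    user_index = build_index(u for u, _, _ in ratings)
    item_index = build_index(i for _, i, _ in ratings)
    remapped = [(user_index[u], item_index[i], r) for u, i, r in ratings]
    return remapped, user_index, item_index


# Toy data with gaps in the raw IDs.
ratings = [(1, 1, 5.0), (1, 3952, 3.0), (7, 10, 4.0)]
remapped, user_index, item_index = reindex(ratings)
print(remapped)         # [(0, 0, 5.0), (0, 2, 3.0), (1, 1, 4.0)]
print(len(item_index))  # 3
```

With a mapping like this, the item dimension of the interaction matrix equals the number of distinct items rather than the largest raw ID.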

Incomplete training set data

In DataSet.py, the function "getTrainTest" appears to drop the last user and their ratings from the training set, if I understand correctly.
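Splits that walk the sorted ratings and compare each row with the next one tend to need special handling for the final row. Grouping by user avoids that boundary case. The sketch below is not the repository's code: it assumes a leave-one-out split in which each user's most recent rating is held out, and it assumes rows of the form (user, item, rate, timestamp). Both are assumptions about the intended behavior, and `leave_one_out_split` is a hypothetical name.

```python
from itertools import groupby


def leave_one_out_split(data):
    """Hold out each user's latest rating for testing; the rest go to training.

    Hypothetical helper. `data` is a list of (user, item, rate, timestamp) tuples.
    """
    train, test = [], []
    # Sort by user, then by time, so each user's rows are contiguous and ordered.
    ordered = sorted(data, key=lambda row: (row[0], row[3]))
    for _, rows in groupby(ordered, key=lambda row: row[0]):
        rows = list(rows)
        train.extend((u, i, r) for u, i, r, _ in rows[:-1])
        u, i, r, _ = rows[-1]
        test.append((u, i, r))
    return train, test
```

Because every user is processed as a complete group, the last user in the file is treated exactly like all the others.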

How to handle a huge matrix (>2GB)

Problem:
When I run an experiment on the Pinterest-20 dataset, TensorFlow raises "ValueError: Cannot create a tensor proto whose content is larger than 2GB.".
The error occurs on the line "self.user_item_embedding = tf.convert_to_tensor(self.dataSet.getEmbedding())".

For now, I work around it by using tf.Variable,
as described here: https://blog.csdn.net/fjssharpsword/article/details/96431553

The code change is:
def add_embedding_matrix(self):
    self.matrix_init = tf.placeholder(tf.float32, shape=(self.shape[0], self.shape[1]))
    matrix = tf.Variable(self.matrix_init)
    self.user_item_embedding = tf.convert_to_tensor(matrix)
    #self.user_item_embedding = tf.convert_to_tensor(self.dataSet.getEmbedding())
    self.item_user_embedding = tf.transpose(self.user_item_embedding)

def init_sess(self):
    self.config = tf.ConfigProto()
    self.config.gpu_options.allow_growth = True
    self.config.allow_soft_placement = True
    self.sess = tf.Session(config=self.config)
    #self.sess.run(tf.global_variables_initializer())
    self.sess.run(tf.global_variables_initializer(), feed_dict={self.matrix_init: self.dataSet.getEmbedding()})

Does this work? I am not sure. Can you suggest other effective solutions? Thanks!
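For context, the error concerns the size of a tensor embedded in the graph as a constant: a dense float32 user-item matrix needs rows × cols × 4 bytes. The sketch below only does that arithmetic. The limit of 2**31 bytes is an assumption based on the wording of the error message, and the shapes are hypothetical, not the actual Pinterest-20 dimensions.

```python
import numpy as np

PROTO_LIMIT_BYTES = 2**31  # assumed cap of about 2 GiB, inferred from the error message


def dense_bytes(shape, dtype=np.float32):
    """Bytes needed to hold a dense array of the given shape and dtype."""
    return int(np.prod(shape, dtype=np.int64)) * np.dtype(dtype).itemsize


def fits_in_proto(shape, dtype=np.float32):
    """True if a dense array of this shape stays under the assumed limit."""
    return dense_bytes(shape, dtype) < PROTO_LIMIT_BYTES


# Hypothetical shapes, chosen only to illustrate the arithmetic.
print(dense_bytes((6000, 4000)))      # 96000000
print(fits_in_proto((6000, 4000)))    # True
print(fits_in_proto((60000, 10000)))  # False (2.4e9 bytes)
```

Feeding the matrix through a placeholder when the variable is initialized, as in the change above, keeps the data out of the serialized graph, which is a common way around this limit in TensorFlow 1.x. The full matrix still has to fit in host and device memory.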

When the number of latent factors is 8 or 16, HR and NDCG are both 1.0

Problem:
When I set the number of latent factors to 8 or 16, the reported HR and NDCG are both 1.0, which is not reasonable.
parser.add_argument('-userLayer', action='store', dest='userLayer', default=[512, 8])
parser.add_argument('-itemLayer', action='store', dest='itemLayer', default=[1024, 8])

While debugging, I found that self.y_ is always 1e-6.
self.y_ = tf.reduce_sum(tf.multiply(user_out, item_out), axis=1, keepdims=False) / (norm_item_output * norm_user_output)
self.y_ = tf.maximum(1e-6, self.y_)

The paper reports normal HR and NDCG values when the number of latent factors is 8 or 16. Can you suggest a solution? Thanks!
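One possible reading of this symptom, offered as an assumption rather than a confirmed diagnosis: if every prediction is clamped to the same value (1e-6), the ranking is decided purely by tie-breaking. When the positive item is stored first in each candidate list (as getTestNeg does) and the top-K is taken with an order-preserving selection such as heapq.nlargest, the positive always lands at rank 1, so HR and NDCG both come out as 1.0. The sketch below reproduces that effect with made-up item IDs. The metric helpers are hypothetical stand-ins and are not taken from the repository's evaluation code.

```python
import heapq
import math


def hit_ratio(ranklist, target):
    """1.0 if the target appears in the ranked list, else 0.0."""
    return 1.0 if target in ranklist else 0.0


def ndcg(ranklist, target):
    """Single-positive NDCG: log(2) / log(rank + 2) at the target's 0-based rank."""
    for rank, item in enumerate(ranklist):
        if item == target:
            return math.log(2) / math.log(rank + 2)
    return 0.0


def evaluate(candidates, scores, top_k=10):
    """candidates[0] is the positive; scores[j] is the prediction for candidates[j]."""
    score_of = dict(zip(candidates, scores))
    # On ties, nlargest keeps earlier entries ahead of later ones.
    ranklist = heapq.nlargest(top_k, score_of, key=score_of.get)
    return hit_ratio(ranklist, candidates[0]), ndcg(ranklist, candidates[0])


candidates = [42] + list(range(100, 199))  # positive first, then 99 negatives
constant = [1e-6] * len(candidates)        # every score sits at the 1e-6 floor
print(evaluate(candidates, constant))      # (1.0, 1.0)
```

Checking the spread of self.y_ on a test batch (for example its minimum and maximum), or shuffling each candidate list before ranking, would help confirm or rule out this explanation.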

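About reproducing the results

I ran this code, and the results are about the same as the single-layer model in the paper, i.e. DMF-1-nce.
[image]

How should I change the parameters to match the two-layer results in the paper?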
