qrc-net's People

Contributors

kanchen-usc

qrc-net's Issues

Questions about the CPN part

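Dear Dr. Chen,

Hello, and sorry to bother you. While reading your work I ran into some questions about the CPN (i.e. the reinforcement learning part):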
```python
def train_loss(self, att_scores, labels, reward, reward_pred, is_multi=False, pos_all=None, pos_reg_all=None, num_reg=None):
    att_logits = tf.reshape(att_scores[:, :, 0], [self.batch_size, self.num_prop])
    loss_vec = tf.nn.sparse_softmax_cross_entropy_with_logits(att_logits, labels, name=None)
    loss_cls = tf.reduce_mean(loss_vec)

    # calculate reward for top k predictions for reinforcement learning
    _, pred_ind = tf.nn.top_k(att_logits, self.top_k)
    pred_ind = tf.reshape(pred_ind, [self.batch_size*self.top_k, 1])
    row_ind = tf.reshape(tf.range(0, self.batch_size), [-1, 1])
    row_ind = tf.reshape(tf.tile(row_ind, [1, self.top_k]), [self.top_k*self.batch_size, 1])
    pred_ind = tf.concat(1, [row_ind, pred_ind])
    pred_reward = tf.gather_nd(reward, pred_ind)
    pred_reward = tf.reshape(pred_reward, [self.batch_size, self.top_k])
    reward_weight = tf.reduce_mean(pred_reward, 1)

    if is_multi:
        pred_reg = tf.gather_nd(att_scores[:, :, 1:], pos_all)
        loss_reg = loss_func.smooth_l1_regression_loss(pred_reg, pos_reg_all)/num_reg
    else:
        pred_label = tf.cast(tf.reshape(tf.argmax(att_logits, 1), [-1, 1]), tf.int32)
        # pred_label = labels
        row_index = tf.reshape(tf.range(0, self.batch_size), [-1, 1])
        pred_index = tf.concat(1, [row_index, pred_label])
        pred_reg = tf.gather_nd(att_scores[:, :, 1:], pred_index)
        loss_reg = loss_func.smooth_l1_regression_loss(pred_reg, self.gt_reg)

    loss = loss_cls + self.reg_lambda*loss_reg
    loss_rwd = tf.nn.l2_loss(reward_weight-reward_pred)
    return loss, loss_vec, reward_weight, loss_rwd
```
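This code is arguably the core of your CPN. As far as I can tell, the reward loss (`loss_rwd`) only updates the weights that map the averaged top-10 features to a predicted reward (`reward_pred`), and it has nothing to do with the weights of the phrase localization part. The only loss that touches the phrase localization part is:
loss_reg = loss_func.smooth_l1_regression_loss(pred_reg, self.gt_reg)
However, this loss simply picks the highest-scoring proposal and computes its bounding box regression, which is unrelated to the reinforcement learning part: the proposal is selected with `tf.argmax`, so whatever the reinforcement learning part computes, the proposal with the maximum score is always the one selected. In the paper you say that Equation 7 is used to produce the gradients, and that the gradients come from the top-ranked proposals, but the top ranking is obtained with `tf.argmax` and has no connection to reinforcement learning.

Sorry again for the trouble, and thank you for your email. The above is my own understanding, and I am not sure it is correct: in the paper the results change when the beta parameter changes, whereas by my reading this part is completely disconnected from the network, so the results should not change. I have probably misunderstood something, and I hope you can help clarify.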
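To make the question concrete, here is a minimal pure-Python sketch of what the quoted snippet appears to compute for `reward_weight` (mean reward over the top-k proposals) next to the `tf.argmax` selection used for the regression loss. The helper names `top_k_indices` and `reward_weight_per_query` and all the toy numbers are made up for illustration; this is an approximation of the indexing logic, not the repository's code.

```python
def top_k_indices(scores, k):
    """Indices of the k largest scores, highest first (ties go to the lower index)."""
    return sorted(range(len(scores)), key=lambda i: (-scores[i], i))[:k]


def reward_weight_per_query(att_logits, reward, k):
    """Mean reward of the top-k proposals for each query in the batch."""
    weights = []
    for logits_row, reward_row in zip(att_logits, reward):
        picked = top_k_indices(logits_row, k)
        weights.append(sum(reward_row[i] for i in picked) / k)
    return weights


# Toy batch: 2 queries, 4 proposals each (hypothetical values).
att_logits = [[0.1, 2.0, 0.5, 1.5],
              [3.0, 0.2, 0.1, 2.5]]
reward_a = [[0.0, 1.0, 0.0, 0.5],
            [0.5, 0.0, 0.0, 1.0]]
reward_b = [[1.0, 0.0, 1.0, 0.0],
            [0.0, 1.0, 1.0, 0.0]]

weights_a = reward_weight_per_query(att_logits, reward_a, k=2)
weights_b = reward_weight_per_query(att_logits, reward_b, k=2)

# The proposal used for box regression depends only on the logits,
# so it is identical no matter which reward table is used.
argmax_idx = [top_k_indices(row, 1)[0] for row in att_logits]
```

In this sketch `weights_a` and `weights_b` differ while `argmax_idx` stays the same, which is the behavior I am asking about. Note that the quoted function also returns `loss_vec` and `reward_weight`; how the caller combines them is not visible in the snippet.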

Sources of Proposals and ROI Features

Hi,

I would like to confirm where the downloadable proposals and ROI features for the Flickr30K Entities dataset come from.
In the paper, you mention three sources of proposals:

  1. Selective Search
  2. RPN
  3. PGN

But only the Selective Search proposals are available for download, right?
If so, are the other two available for download as well? Do they look very different visually?

Regarding the ROI features, the README says that the rows correspond to the proposals and that the features come from the pre-trained RPN. How can this RPN produce ROIs in the same order as those produced by Selective Search? Could you clarify this a bit?

Therefore, in either case, the proposals and features are not those finally produced by your work, QRCNet, right?
