Comments (8)

Jacobxz avatar Jacobxz commented on June 15, 2024

Sorry, I misread it.

from hhcl-reid.

Jacobxz avatar Jacobxz commented on June 15, 2024

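Hello, I have a question: when cmhybrid is used, why is the loss function designed as follows? loss = self.hard_weight * (self.cross_entropy(output_hard, targets) + (1 - self.hard_weight) * self.cross_entropy(output_mean, targets))

Also, how was self.hard_weight determined? Do the other hyperparameters affect it, and which parameter did you settle first?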

from hhcl-reid.

hu-zheng avatar hu-zheng commented on June 15, 2024

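> Hello, I have a question: when cmhybrid is used, why is the loss function designed as follows? loss = self.hard_weight * (self.cross_entropy(output_hard, targets) + (1 - self.hard_weight) * self.cross_entropy(output_mean, targets))
>
> Also, how was self.hard_weight determined? Do the other hyperparameters affect it, and which parameter did you settle first?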

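  1. For a loss function made up of several terms like this one, each term is usually multiplied by a coefficient. These coefficients are hyperparameters that set each term's share of the overall loss.
  2. For self.hard_weight in our paper, we simply ran several groups of experiments with different values within its range and kept the setting that gave the best result. Experiments like these usually reveal only a rough trend, and the chosen value is not necessarily optimal. Hyperparameters certainly also influence one another, so finding the exact optimum would probably require a parameter-search strategy.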
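To make the role of the coefficient concrete, here is a minimal sketch that expands the expression exactly as it is parenthesised in the question above. It works on plain floats rather than tensors. The function names are hypothetical and not part of hhcl-reid, and `convex_combination` is shown only as a comparison grouping, not as what the repository does.

```python
def hybrid_loss_as_quoted(ce_hard, ce_mean, hard_weight):
    """Evaluate the expression with the parentheses exactly as quoted."""
    return hard_weight * (ce_hard + (1 - hard_weight) * ce_mean)


def effective_coefficients(hard_weight):
    """Coefficients on (ce_hard, ce_mean) after expanding the quoted expression."""
    return hard_weight, hard_weight * (1 - hard_weight)


def convex_combination(ce_hard, ce_mean, hard_weight):
    """Hypothetical alternative grouping, shown only for comparison."""
    return hard_weight * ce_hard + (1 - hard_weight) * ce_mean


# Illustrative numbers only: pretend the two cross-entropy terms are 2.0 and 1.0.
quoted = hybrid_loss_as_quoted(2.0, 1.0, hard_weight=0.5)
coeff_hard, coeff_mean = effective_coefficients(0.5)
```

Under the quoted grouping the hard term is weighted by `hard_weight` and the mean term by `hard_weight * (1 - hard_weight)`, so the whole expression scales with `hard_weight`. If the parentheses in the repository differ from the quote, the expansion changes accordingly.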

from hhcl-reid.

Jacobxz avatar Jacobxz commented on June 15, 2024

from hhcl-reid.

Jacobxz avatar Jacobxz commented on June 15, 2024

from hhcl-reid.

hu-zheng avatar hu-zheng commented on June 15, 2024

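Manual hyperparameter tuning is time-consuming, laborious, and takes experience. For a given parameter, including it is sometimes not even "better than nothing", and a larger value is not necessarily better either. Settling on a reasonably good value really does take a lot of experiments.
When tuning, I usually first settle the parameters I care about most and that have the largest impact (for example, self.hard_weight in our method), but this presupposes running the experiments on top of a reasonably reliable set of hyperparameters. For hyperparameters I care less about, I first refer to previous work, or even follow its configuration outright, and fine-tune later if necessary.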
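The one-parameter-at-a-time procedure described above can be sketched roughly as follows. Everything here is a stand-in: `toy_evaluate` replaces a full train-and-evaluate run, and the values in `base` are placeholders rather than the settings used in the paper.

```python
def sweep_one_parameter(evaluate, base_config, name, candidates):
    """Try each candidate for one hyperparameter while the rest stay fixed."""
    results = []
    for value in candidates:
        # Copy the base config so the baseline itself is never modified.
        config = dict(base_config, **{name: value})
        results.append((value, evaluate(config)))
    best_value, best_score = max(results, key=lambda pair: pair[1])
    return best_value, best_score, results


def toy_evaluate(config):
    """Hypothetical score function; it peaks at hard_weight = 0.5 by construction."""
    return 1.0 - (config["hard_weight"] - 0.5) ** 2


# Placeholder baseline: the less critical settings are fixed up front.
base = {"hard_weight": 0.0, "momentum": 0.5, "batch_size": 256}
best_value, best_score, results = sweep_one_parameter(
    toy_evaluate, base, "hard_weight", [0.0, 0.25, 0.5, 0.75, 1.0]
)
```

A sweep like this only shows the trend along one axis with the other settings held fixed. Because hyperparameters interact, the best value found this way is not guaranteed to be a joint optimum.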

from hhcl-reid.

Jacobxz avatar Jacobxz commented on June 15, 2024

import collections

import torch
from torch import autograd


class CM_Ave(autograd.Function):

    @staticmethod
    def forward(ctx, inputs, targets, features, momentum):
        ctx.features = features
        ctx.momentum = momentum
        ctx.save_for_backward(inputs, targets)
        outputs = inputs.mm(ctx.features.t())

        return outputs

    @staticmethod
    def backward(ctx, grad_outputs):
        inputs, targets = ctx.saved_tensors

        grad_inputs = None
        if ctx.needs_input_grad[0]:
            grad_inputs = grad_outputs.mm(ctx.features)

        batch_centers = collections.defaultdict(list)
        for instance_feature, index in zip(inputs, targets.tolist()):
            batch_centers[index].append(instance_feature)

        for index, features in batch_centers.items():
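            # NOTE: distances is computed here but never used in this averaging variant.
            distances = []
            for feature in features:
                distance = feature.unsqueeze(0).mm(ctx.features[index].unsqueeze(0).t())[0][0]
                distances.append(distance.cpu().numpy())

            # Mean of this cluster's features in the current batch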
            mean = torch.stack(features, dim=0).mean(0)
            ctx.features[index] = ctx.features[index] * ctx.momentum + (1 - ctx.momentum) * mean
            ctx.features[index] /= ctx.features[index].norm()

        return grad_inputs, None, None, None


def cm_ave(inputs, indexes, features, momentum=0.5):
    return CM_Ave.apply(inputs, indexes, features, torch.Tensor([momentum]).to(inputs.device))

image
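Hello, I reproduced the averaging case based on your code (see the code above). Could you check whether it is correct?
Also, the results I get with this code deviate quite a lot, so I wonder whether the code is wrong. Could you take a look? Thanks. My batch size is 128.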
image

from hhcl-reid.

hu-zheng avatar hu-zheng commented on June 15, 2024

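A batch size of 128 may give slightly lower results than 256, but the difference should not be this large. Are the results also this low when you use my original code with hard_weight set to 0?
This part of your code looks fine to me. Check whether features and momentum are initialized correctly, and whether something is wrong elsewhere.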
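For checking features and momentum, it may help to restate the update rule from the posted code on plain Python lists and compare a few values by hand. This is a sketch only: the helper names are hypothetical, and it mirrors just the three update lines in the posted backward pass (batch mean, momentum blend, L2 normalization), not the rest of the training pipeline.

```python
import math


def l2_normalize(vec):
    """Scale a vector to unit Euclidean length."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec]


def update_centroid(centroid, batch_features, momentum):
    """Blend the stored centroid with the batch mean, then re-normalize."""
    dim = len(centroid)
    count = len(batch_features)
    mean = [sum(f[i] for f in batch_features) / count for i in range(dim)]
    blended = [momentum * c + (1 - momentum) * m for c, m in zip(centroid, mean)]
    return l2_normalize(blended)


def is_unit_norm(vec, tol=1e-6):
    """Check that a memory entry has unit length."""
    return abs(math.sqrt(sum(x * x for x in vec)) - 1.0) < tol


# Toy data: one stored centroid and two batch features from the same cluster.
centroid = [1.0, 0.0]
batch = [[0.0, 1.0], [0.0, 1.0]]
updated = update_centroid(centroid, batch, momentum=0.5)
```

With momentum equal to 1 the stored centroid never moves, and with momentum equal to 0 it is replaced by the normalized batch mean, so a momentum value outside [0, 1] or a memory entry that is far from unit length is a likely sign of an initialization problem.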

from hhcl-reid.
