
Comments (20)

Cysu commented on May 26, 2024

Oh, that's it. The oim-scalar is very important. Maybe 100 is too large; we use 10 in our Caffe implementation. You can tweak and experiment with it.

By the way, sometimes reducing its value over the training epochs also improves performance. This is called annealing in some of the literature.
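A minimal sketch of such a schedule (the function name, initial value, and decay rate are illustrative assumptions, not from the repository):

def annealed_oim_scalar(epoch, initial=10.0, decay=0.9):
    """Exponentially decay the OIM scalar as training progresses."""
    return initial * (decay ** epoch)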


dichen-cd commented on May 26, 2024

Then I suppose OIM would be almost the same as Softmax+Xentropy, since the external buffer (LUT) serves as a projection matrix from feature space to class probability space. The only differences between OIM and Softmax+Xentropy are 1) the update strategy and 2) the L2 normalization of the LUT/projection matrix.

Is this assumption right? If so, then why does OIM perform so much better than Softmax+Xentropy?


Cysu commented on May 26, 2024

Because traditional person re-identification datasets don't have unlabeled identities, unlike the person search setting.


Cysu commented on May 26, 2024

Yes. It's correct! The reason why OIM is better than softmax loss for verification is probably that

  1. OIM does not have 2048 x 5000 = 10,240,000 parameters to learn, which avoids overfitting.
  2. OIM can utilize many background people that have no IDs, while softmax loss cannot.
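To make the first point concrete, here is a minimal PyTorch sketch of the two heads (the dimensions follow the numbers above; the variable names are illustrative):

import torch
import torch.nn as nn

num_ids, feat_dim = 5000, 2048

# Softmax classifier head: a learnable 2048 x 5000 projection,
# i.e. 10,240,000 parameters trained by gradient descent.
softmax_head = nn.Linear(feat_dim, num_ids, bias=False)

# OIM lookup table: the same shape, but a non-learnable buffer that is
# updated by a running average of features, so it adds no parameters.
lut = torch.zeros(num_ids, feat_dim)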


dichen-cd commented on May 26, 2024

Thank you for the quick response!

Excellent work on OIM and this repo!


dichen-cd commented on May 26, 2024

Hi @Cysu. Sorry to bother you again 😭 But I've run into problems adding a circular queue to OIM. Could you have a look when it's convenient?

Code is here: https://gist.github.com/DeanChan/5e33d66425862e8a318dcfdb4ca98cc4

It's basically based on your code, but I found that it doesn't converge during training: the loss keeps wandering around 28.0~31.0. I've tried for several days and can't find the reason. Hope you can enlighten me. 😄


Cysu commented on May 26, 2024
  1. The weight here is used to reweight classes in the cross entropy loss. It should simply be self.weight = weight.

  2. At this line, it should be x.view(1, -1).

  3. Actually I wouldn't recommend appending a new item and constructing a new CQ each time; it's inefficient. You may consider keeping a header index that indicates where the next item should be written. Something like

self.cq[self.header] = x
self.header = (self.header + 1) % self.cq.size(0)
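Wrapped into a minimal self-contained sketch (the class and method names are illustrative, not from the repository):

import torch

class CircularQueue:
    """Fixed-size feature buffer that overwrites its oldest entry in place."""

    def __init__(self, size, feat_dim):
        self.cq = torch.zeros(size, feat_dim)
        self.header = 0  # index where the next feature will be written

    def push(self, x):
        # Overwrite in place instead of rebuilding the queue every step.
        self.cq[self.header] = x
        self.header = (self.header + 1) % self.cq.size(0)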


dichen-cd commented on May 26, 2024

Many thanks for your response!

To the first point: I thought the probability of x belonging to the unlabeled classes should be filtered out, because based on the loss function in the paper, $\mathcal{L} = \mathbb{E}_x[\log p_t]$, the unlabeled probabilities $q_i$ are not included. So I think the weight of the unlabeled classes should be set to zero. Correct me if I'm wrong.
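A minimal sketch of that masking, assuming L labeled identities followed by U unlabeled slots in the logits (the sizes and variable names are illustrative):

import torch
import torch.nn.functional as F

L, U = 5000, 5000  # labeled identities and circular-queue slots (illustrative)

# Zero weights for the unlabeled "classes": they still appear in the
# softmax denominator, but they contribute nothing to E_x[log p_t].
weight = torch.cat([torch.ones(L), torch.zeros(U)])

logits = torch.randn(8, L + U)       # stand-in for scalar * x @ lut_cq.t()
targets = torch.randint(0, L, (8,))  # labeled targets
loss = F.cross_entropy(logits, targets, weight=weight)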

The result of x.view(1, -1) is the same as x.view(-1, x.size(0)), but your code is much neater!

Awesome on the third point! That one's really clever!


Cysu commented on May 26, 2024

Ok, I got it. You assign a random label in [L, L+U) for unlabeled samples. I think it's correct. Could you please double check if the targets are in this range?


dichen-cd commented on May 26, 2024

Yes, I've checked. The targets are indeed in this range, and the LUT and CQ are updated correctly as well.

The dataloader is adapted from psdb.py. The dataset is the same and no augmentation is applied. The __getitem__ method of the dataloader returns self.targets_db[index], and any -1 in self.targets_db[index]['gt_pids'] is replaced with a random integer in [L, L+U).
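A minimal sketch of that replacement (the function name is illustrative, not from the repository):

import random

def relabel_unlabeled(gt_pids, L, U):
    """Replace pid == -1 (unlabeled person) with a random id in [L, L+U)."""
    return [random.randrange(L, L + U) if pid == -1 else pid for pid in gt_pids]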

P.S. I think there's a minor error in the comment at psdb.py line 81: background people have pid == -1 instead of 0.


Cysu commented on May 26, 2024

Thanks for pointing out the error in the comment! Will fix it later.

May I know a little bit more about your experiment settings? Do you mean to use the ground truth boxes of our person search dataset for re-id? Could you please try commenting out the CQ code and using only the labeled identities, to see if it works properly in that case?


dichen-cd commented on May 26, 2024

Nope, it's not just re-id but the same as the person search setting in your paper. I'm currently training detection and identification jointly with a bbox regression loss and the OIM loss.

  • The regression loss goes down fine in all settings.
  • If I replace the OIM loss with a softmax loss, both losses go down nicely.
  • If I provide labeled samples only, the OIM loss still does not go down.
  • If I drop the CQ part of OIM and provide labeled samples only, the problem is the same.

Meanwhile, the example code in this repository with the OIM loss works fine.

I've tried several initialization schemes for the LUT and CQ, and the problem is the same, so it isn't the initializer's fault.


Cysu commented on May 26, 2024

OK, I got it. In our Caffe implementation, the proposal target layer produces target -1 for unlabeled, [0, L-1] for labeled, and L for false detections (background regions without a person). When computing the cross entropy loss, we set the target to -1 for both the unlabeled and the false detections, and use ignore_label: -1 in the loss layer.

I wonder if these three types of bounding box proposals are handled properly?
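A minimal PyTorch sketch of the convention described above (the function name is illustrative; PyTorch's ignore_index plays the role of Caffe's ignore_label):

import torch
import torch.nn.functional as F

def id_loss(logits, targets, L):
    # Map false detections (target == L) to -1 so that both unlabeled
    # persons and background boxes are ignored by the loss.
    targets = targets.clone()
    targets[targets == L] = -1
    return F.cross_entropy(logits, targets, ignore_index=-1)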


dichen-cd commented on May 26, 2024

Ha, I see!

In my implementation, I set 0 for false detections, [1, L] for labeled persons, and [L+1, L+U] for unlabeled. In the cross entropy loss, only labels in [L+1, L+U] are ignored. Maybe I have to ignore label 0 too.

Many thanks for your advice! I'll let you know if it works.


Cysu commented on May 26, 2024

I think it's better to make the targets of labeled persons start from zero. Otherwise the indexing of the LUT has to be changed carefully.


dichen-cd commented on May 26, 2024

Nice suggestion! I'll fix it. 😄


dichen-cd commented on May 26, 2024

A pity that it's not working either. 😢

The loss drops from (28.0, 31.0) to (1.0, 6.0), yet it's still wandering and not going down any further.


dichen-cd commented on May 26, 2024

Hi Cysu~ I finally got the solution: after changing --oim-scalar from the default value 1.0 to 100.0, the OIM loss goes down as expected. I suppose it's because the L2 normalization operation drastically decreases the gradients, so the problem can be solved by simply multiplying by a large scalar.
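A minimal sketch of that scaling, assuming both the features and the LUT are L2-normalized (the shapes and variable names are illustrative):

import torch
import torch.nn.functional as F

oim_scalar = 100.0  # 10 also reportedly works in the Caffe implementation

feats = F.normalize(torch.randn(8, 2048), dim=1)   # L2-normalized features
lut = F.normalize(torch.randn(5000, 2048), dim=1)  # L2-normalized LUT

# Cosine similarities lie in [-1, 1], so the softmax would be nearly
# uniform and the gradients tiny; scaling sharpens the distribution.
logits = oim_scalar * feats.mm(lut.t())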

Thanks again for your help!!


zhongyingji commented on May 26, 2024

Hi, I've got a problem when training with OIM. I set --oim-scalar to 100, and in the later epochs the OIM loss becomes NaN; I have no idea what's happening.
BTW, what's the range of the OIM loss at the end of training?
Thank you!


haochange commented on May 26, 2024

> Hi, I've got a problem when training with OIM. I set --oim-scalar to 100, and in the later epochs the OIM loss becomes NaN; I have no idea what's happening.
> BTW, what's the range of the OIM loss at the end of training?

I'm also confused about the range of the OIM loss.
