Fail to train in mini-Imagenet about pytorch-classification-uncertainty HOT 5 OPEN

dougbrion commented on June 12, 2024

Fail to train in mini-Imagenet

from pytorch-classification-uncertainty.

Comments (5)

RuoyuChen10 commented on June 12, 2024

I used the EDL loss to train on the mini-ImageNet dataset with 64 classes, but the loss does not converge and the accuracy is very low.

I think you can modify the parameters:

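import torch

def edl_loss(self, func, y, alpha, annealing_step, device="cuda"):
    # Method of the training class; relies on self.one_hot_embedding and self.kl_divergence.
    y = self.one_hot_embedding(y)
    y = y.to(device)
    alpha = alpha.to(device)
    S = torch.sum(alpha, dim=1, keepdim=True)  # Dirichlet strength per sample

    # Data-fit term; func is e.g. torch.log or torch.digamma.
    A = torch.sum(y * (func(S) - func(alpha)), dim=1, keepdim=True)

    # Original schedule, which ramps from 0 to 1 over annealing_step epochs:
    # annealing_coef = torch.min(
    #     torch.tensor(1.0, dtype=torch.float32),
    #     torch.tensor(self.epoch / annealing_step, dtype=torch.float32),
    # )
    annealing_coef = 0.1  # fixed, small weight on the KL term instead

    # Remove the evidence of the true class before applying the KL penalty.
    kl_alpha = (alpha - 1) * (1 - y) + 1
    kl_div = annealing_coef * self.kl_divergence(kl_alpha, device=device)

    return A + kl_div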

Setting annealing_coef to 0.1 or lower should work; do not set it to 1, which is too large.
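If you would rather keep the warm-up than hard-code a constant, one option is to cap the original schedule at a small maximum. This is only a sketch: the function name `capped_annealing_coef` and the `max_coef` parameter are hypothetical and not part of the repo, and it uses plain Python floats rather than tensors.

```python
def capped_annealing_coef(epoch, annealing_step, max_coef=0.1):
    """Ramp the KL weight linearly from 0 to max_coef over annealing_step epochs.

    max_coef is an assumed knob; the original schedule corresponds to max_coef=1.0.
    """
    if annealing_step <= 0:
        # No warm-up requested: use the cap directly.
        return max_coef
    return max_coef * min(1.0, epoch / annealing_step)
```

Inside edl_loss this would replace the constant, e.g. `annealing_coef = capped_annealing_coef(self.epoch, annealing_step)`, so the KL weight never exceeds 0.1.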

xuhuali-mxj commented on June 12, 2024

Hi, thank you for your answer. I tried setting 'annealing_coef' to 0.1 and to 0.05, but it still does not work. Did you get it to work successfully?

RuoyuChen10 commented on June 12, 2024

I haven't tried this repo, but I have trained a face recognition network with 8631 identities. ResNet-100 works and reaches the same accuracy on the in-distribution dataset as softmax training, but ResNet-50 does not converge. Another thing is that we all found the KL loss hurts accuracy; try decreasing its coefficient or just removing it.

xuhuali-mxj commented on June 12, 2024

Thank you so much. It still doesn't work. I think I may need to tune other hyperparameters.

RuoyuChen10 commented on June 12, 2024

Maybe you can refer to https://github.com/RuoyuChen10/FaceTechnologyTool/blob/master/FaceRecognition/evidential_learning.py, which I have tried on face recognition. I also failed at first. I concluded that it mainly comes down to:

Remove the KL loss.
The learning rate is important.
The depth of the network matters.

The learning rate and network depth may have less influence when training with softmax and cross-entropy loss.
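To illustrate the first point, here is a minimal sketch of the log variant of the loss with the KL term made optional. It assumes a single sample and plain Python lists instead of tensors, and the names `edl_log_fit`, `edl_loss_single`, `kl_term`, and `kl_weight` are hypothetical, not taken from either repository. With `kl_weight=0.0` only the data-fit term from the snippet earlier in the thread remains.

```python
import math

def edl_log_fit(alpha, label):
    """Data-fit term for one sample: sum_k y_k * (log S - log alpha_k).

    With a one-hot y the sum reduces to the entry of the true class.
    """
    S = sum(alpha)  # Dirichlet strength
    return math.log(S) - math.log(alpha[label])

def edl_loss_single(alpha, label, kl_term=0.0, kl_weight=0.0):
    """Hypothetical helper: fit term plus a weighted, precomputed KL value.

    kl_weight=0.0 removes the KL loss entirely.
    """
    return edl_log_fit(alpha, label) + kl_weight * kl_term
```

For example, `edl_loss_single([9.0, 1.0], 0, kl_term=5.0)` ignores the KL value, while passing `kl_weight=0.1` adds 0.5 to the fit term.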
