
Comments (7)

Eromera commented on June 2, 2024

Hi! Sorry for the late reply.

It's hard to say where the difference comes from. All hyperparameters seem quite similar. Batch size has a large effect, but going from 6 to 5 should not make that big a difference. Where are you getting the 62.3%? In the output of main.py during training, or with the Cityscapes scripts after saving the outputs?

The only thing that sounds problematic is that you mention input normalization to [-1,1]. If you normalize the RGB images during training, then you should also normalize them during evaluation, or else the model will not be "used" to those pixel values, and this could easily explain the 7% loss in IoU. Are you normalizing when you evaluate on the val set?
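For illustration, a minimal sketch of what a consistent evaluation could look like (`model` and `val_loader` are placeholders, not the repo's code, and the [-1,1] scaling is the one mentioned above):

```python
import torch

# Placeholder sketch: apply the *same* normalization at evaluation that was used in training.
model.eval()                                 # disable dropout, use running BatchNorm statistics
with torch.no_grad():
    for images, labels in val_loader:        # images assumed in [0, 1]
        images = images * 2.0 - 1.0          # same [0,1] -> [-1,1] scaling as at training time
        preds = model(images).argmax(dim=1)  # per-pixel class predictions
```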


mbcel commented on June 2, 2024

I am getting the 62.3% on the 500 validation images of the Cityscapes dataset. I set the model to evaluation mode and use my own script for evaluation. I also normalize the val set images.

How do you normalize the input images? Did you select the model from the best-performing epoch or simply the last epoch?

I will try out your evaluation and training scripts and see if I can find a difference. Thank you for the help!


Eromera commented on June 2, 2024

Hi,
Normalization is not used in the main code, but you can normalize with torchvision.transforms.Normalize.
I normally use the model with the best accuracy on the val set.
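For reference, a minimal sketch of such a transform (the mean/std values below are placeholders that map [0,1] to [-1,1], not values taken from the repo); whatever you choose, the same normalization has to be applied to both the training and validation images:

```python
from torchvision import transforms

# Placeholder values: mean = std = 0.5 maps ToTensor's [0,1] output to [-1,1].
normalize = transforms.Normalize(mean=[0.5, 0.5, 0.5], std=[0.5, 0.5, 0.5])

train_transform = transforms.Compose([transforms.ToTensor(), normalize])
val_transform = transforms.Compose([transforms.ToTensor(), normalize])  # same normalization at eval
```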

When you train from scratch, do you train the encoder first and then the decoder, or do you train the full network in one pass without pretraining the encoder?


mbcel commented on June 2, 2024

I only train the network once with encoder and decoder together.


Eromera commented on June 2, 2024

Then that should explain the difference. I get 69% by first training the encoder on downsampled Cityscapes labels (128x64) for some epochs, and then attaching the decoder and training the full network for some more epochs. If you train the full network without pretraining the encoder, the encoder filters take much longer to take shape from random initialization. You could try pretraining the encoder, or, if you want to train the full network in one pass, you could a) use an auxiliary loss at the encoder's output, or b) train for more epochs (maybe 300) or with a different LR schedule (maybe a higher LR at the beginning).
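To make the two-stage idea concrete, here is a hypothetical sketch (the `encoder`/`decoder` modules, `loader`, the 128 feature channels, the ignore index 255, and the 5e-4 learning rate are assumptions on my side, not the exact erfnet_pytorch code):

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

def pretrain_encoder(encoder, loader, epochs, num_classes, feat_channels=128):
    # Stage 1: train the encoder alone against labels downsampled to the
    # encoder's output resolution (e.g. 128x64), using a temporary 1x1 classifier.
    head = nn.Conv2d(feat_channels, num_classes, kernel_size=1)
    opt = torch.optim.Adam(list(encoder.parameters()) + list(head.parameters()), lr=5e-4)
    for _ in range(epochs):
        for images, labels in loader:
            logits = head(encoder(images))
            small = F.interpolate(labels.unsqueeze(1).float(), size=logits.shape[2:],
                                  mode="nearest").squeeze(1).long()
            loss = F.cross_entropy(logits, small, ignore_index=255)
            opt.zero_grad(); loss.backward(); opt.step()

def train_full(encoder, decoder, loader, epochs):
    # Stage 2: attach the decoder and train the full network end to end.
    model = nn.Sequential(encoder, decoder)
    opt = torch.optim.Adam(model.parameters(), lr=5e-4)
    for _ in range(epochs):
        for images, labels in loader:
            loss = F.cross_entropy(model(images), labels, ignore_index=255)
            opt.zero_grad(); loss.backward(); opt.step()
```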


mbcel commented on June 2, 2024

Ah okay, I am going to try that out and see if it gives me the higher val set accuracy. Thank you very much!


mbcel commented on June 2, 2024

Okay, I now achieve 69% mIoU from scratch.

The reason I could not reproduce your results was the learning rate schedule. I was not able to reach that accuracy with the exponential schedule used in your code, but I did reach it with a step LR schedule similar to the one described in your paper. Reducing the LR by a factor of 0.5 is important here; I first used a factor of 1/10, which did not reach this performance.
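For reference, a minimal sketch of such a step schedule (the initial learning rate, weight decay, step size, and the `model`/`train_one_epoch` names are assumptions, not taken from the repo; the relevant part is the 0.5 decay factor):

```python
import torch

# Assumed setup: Adam with an initial LR of 5e-4, halving the LR every 50 epochs.
optimizer = torch.optim.Adam(model.parameters(), lr=5e-4, weight_decay=1e-4)
scheduler = torch.optim.lr_scheduler.StepLR(optimizer, step_size=50, gamma=0.5)

for epoch in range(150):
    train_one_epoch(model, optimizer)  # placeholder for the actual training loop
    scheduler.step()                   # decay the LR by gamma every step_size epochs
```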

Thank you for your help!

