
Comments (7)

Eromera commented on June 2, 2024

Hi! Sorry for the late reply.

It's hard to say where the difference comes from. All hyperparameters seem quite similar. Batch size has a large effect, but going from 6 to 5 should not make that big a difference. Where are you getting the 62.3%? In the output of main.py during training, or with the Cityscapes scripts after saving the outputs?

The only thing that sounds problematic is that you mention input normalization to [-1,1]. If you normalize the RGB images during training, then you should also normalize them during evaluation, or else the model will not be "used" to those pixel values, and this could easily explain the 7% loss in IoU. Are you normalizing when you evaluate on the val set?
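For illustration, a minimal sketch of what a consistent evaluation could look like (`model` and `val_loader` are placeholders, not the repo's code, and the [-1,1] scaling is the one mentioned above):

```python
import torch

# Placeholder sketch: apply the *same* normalization at evaluation that was used in training.
model.eval()                                 # disable dropout, use running BatchNorm statistics
with torch.no_grad():
    for images, labels in val_loader:        # images assumed in [0, 1]
        images = images * 2.0 - 1.0          # same [0,1] -> [-1,1] scaling as at training time
        preds = model(images).argmax(dim=1)  # per-pixel class predictions
```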


mbcel commented on June 2, 2024

I am getting the 62.3% on the 500 validation images of the Cityscapes dataset. I set the model to evaluation mode and use my own script for evaluation. I also normalize the val set images.

How do you normalize the input images? Did you select the model from the best-performing epoch or simply the last epoch?

I will try out your evaluation and training scripts and see if I can find a difference. Thank you for the help!


Eromera commented on June 2, 2024

Hi,
Normalization is not used in the main code, but you can normalize with torchvision.transforms.Normalize.
I normally use the model with the best accuracy on the val set.
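For reference, a minimal sketch of such a transform (the mean/std values below are placeholders that map [0,1] to [-1,1], not values taken from the repo); whatever you choose, the same normalization has to be applied to both the training and validation images:

```python
from torchvision import transforms

# Placeholder values: mean = std = 0.5 maps ToTensor's [0,1] output to [-1,1].
normalize = transforms.Normalize(mean=[0.5, 0.5, 0.5], std=[0.5, 0.5, 0.5])

train_transform = transforms.Compose([transforms.ToTensor(), normalize])
val_transform = transforms.Compose([transforms.ToTensor(), normalize])  # same normalization at eval
```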

When you train from scratch, do you train the encoder first and then the decoder, or do you train the full network in one pass without pretraining the encoder?


mbcel commented on June 2, 2024

I only train the network once with encoder and decoder together.


Eromera commented on June 2, 2024

Then that should explain the difference. I get 69% by first training the encoder on downsampled Cityscapes labels (128x64) for some epochs, and then attaching the decoder and training the full network for some more epochs. If you train the full network without pretraining the encoder, the encoder filters take much longer to take shape from random initialization. You could try pretraining the encoder, or, if you want to train the full network in one pass, you could a) use an auxiliary loss at the encoder's output, or b) train for more epochs (maybe 300) or with a different LR schedule (maybe a higher LR at the beginning).
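To make the two-stage idea concrete, here is a hypothetical sketch (the `encoder`/`decoder` modules, `loader`, the 128 feature channels, the ignore index 255, and the 5e-4 learning rate are assumptions on my side, not the exact erfnet_pytorch code):

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

def pretrain_encoder(encoder, loader, epochs, num_classes, feat_channels=128):
    # Stage 1: train the encoder alone against labels downsampled to the
    # encoder's output resolution (e.g. 128x64), using a temporary 1x1 classifier.
    head = nn.Conv2d(feat_channels, num_classes, kernel_size=1)
    opt = torch.optim.Adam(list(encoder.parameters()) + list(head.parameters()), lr=5e-4)
    for _ in range(epochs):
        for images, labels in loader:
            logits = head(encoder(images))
            small = F.interpolate(labels.unsqueeze(1).float(), size=logits.shape[2:],
                                  mode="nearest").squeeze(1).long()
            loss = F.cross_entropy(logits, small, ignore_index=255)
            opt.zero_grad(); loss.backward(); opt.step()

def train_full(encoder, decoder, loader, epochs):
    # Stage 2: attach the decoder and train the full network end to end.
    model = nn.Sequential(encoder, decoder)
    opt = torch.optim.Adam(model.parameters(), lr=5e-4)
    for _ in range(epochs):
        for images, labels in loader:
            loss = F.cross_entropy(model(images), labels, ignore_index=255)
            opt.zero_grad(); loss.backward(); opt.step()
```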


mbcel commented on June 2, 2024

Ah okay, I am going to try that out and see if it gives me the higher val set accuracy. Thank you very much!


mbcel commented on June 2, 2024

Okay, I now achieve 69% mIoU from scratch.

The reason I could not reproduce your results was the learning rate schedule. I was not able to reach that accuracy with the exponential schedule used in your code, but I did reach it with a step LR schedule similar to the one described in your paper. Reducing the LR by a factor of 0.5 is important here; I first used a factor of 1/10, which did not reach this performance.
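For reference, a minimal sketch of such a step schedule (the initial learning rate, weight decay, step size, and the `model`/`train_one_epoch` names are assumptions, not taken from the repo; the relevant part is the 0.5 decay factor):

```python
import torch

# Assumed setup: Adam with an initial LR of 5e-4, halving the LR every 50 epochs.
optimizer = torch.optim.Adam(model.parameters(), lr=5e-4, weight_decay=1e-4)
scheduler = torch.optim.lr_scheduler.StepLR(optimizer, step_size=50, gamma=0.5)

for epoch in range(150):
    train_one_epoch(model, optimizer)  # placeholder for the actual training loop
    scheduler.step()                   # decay the LR by gamma every step_size epochs
```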

Thank you for your help!

