
Comments (5)

JUGGHM avatar JUGGHM commented on June 11, 2024

Thanks for your interest! It takes about 2~3 days for ENet to converge (at around epoch 20) in stage 1, and a few hours in stage 2. Training in stage 3 is time-consuming and takes around a week. So if you are equipped with more devices, we suggest trying higher-resolution inputs in stage 3 and moving the learning-rate decay nodes to earlier epochs.
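To make "moving the learning-rate decay nodes to earlier epochs" concrete, here is a minimal PyTorch-style sketch. The model, optimizer, and milestone epochs below are illustrative assumptions, not the repo's actual configuration.

```python
import torch

model = torch.nn.Linear(8, 1)  # stand-in for the actual ENet/PENet model
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)

# "Decay nodes" = the epochs at which the learning rate is stepped down.
# Shifting them earlier (e.g. [15, 20] -> [10, 15]) makes the schedule
# decay sooner; the values here are hypothetical.
scheduler = torch.optim.lr_scheduler.MultiStepLR(
    optimizer, milestones=[10, 15], gamma=0.5)

for epoch in range(20):
    # ... one training epoch over the depth-completion data ...
    scheduler.step()  # lr is multiplied by gamma at each milestone epoch
```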


chaytonmin avatar chaytonmin commented on June 11, 2024

Thanks for your quick reply! That is too long for me to train the model right now, but I may try it later.


graycrown avatar graycrown commented on June 11, 2024

@JUGGHM Thanks for your excellent work. I am trying to re-implement your training process, but a few questions confuse me:

  1. Does a larger batch size lead to better performance, e.g. 16 or 32?
  2. Is a higher input resolution in stage 3 better?
     I used batch size 12 for stage 2 training, but the result does not seem to improve compared to stage 1.

Looking forward to your reply.


JUGGHM avatar JUGGHM commented on June 11, 2024

Thanks for your interest!

(1) Generally, larger batch sizes lead to performance that is at least no worse. However, I once tried a batch size of 12 and it failed because I had not adjusted the learning-rate decay nodes. I did have success with experiments on B4 Small (half channels), where the performance was similar with batch sizes of 10 and 20 once the decay nodes were adjusted accordingly. So the conclusion is that the hyper-parameters need further tuning when you use larger batch sizes (see the sketch after this reply).

(2) I do think so, but I had only 2x11G GPUs when doing this project.

(3) That is expected. We don't expect results after stage 2 to be better than stage 1; training in stage 2 can be regarded as an initialization step for stage 3.
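One common way to re-tune for a larger batch size is the linear learning-rate scaling heuristic: scale the base learning rate by the batch-size ratio and keep the decay milestones in epochs. This is a general heuristic, not something the thread confirms PENet uses; all base values below are hypothetical.

```python
# Hypothetical base configuration (batch size 10, as in the B4 Small runs).
BASE_BATCH_SIZE = 10
BASE_LR = 1e-3
BASE_MILESTONES = [10, 15]  # illustrative decay-node epochs

def scaled_hparams(batch_size: int) -> tuple[float, list[int]]:
    """Linear LR scaling: lr grows with batch size; milestones stay in epochs."""
    scale = batch_size / BASE_BATCH_SIZE
    return BASE_LR * scale, list(BASE_MILESTONES)

for bs in (10, 20, 32):
    lr, milestones = scaled_hparams(bs)
    print(f"batch_size={bs}: lr={lr:.1e}, decay milestones={milestones}")
```

Whether to also shift the milestones earlier (as suggested above for stage 3) is an empirical choice; the thread only reports that training failed at batch size 12 when the decay nodes were left untouched.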


graycrown avatar graycrown commented on June 11, 2024

Got it, thanks for your reply. I will try different batch sizes and adjust the hyper-parameters accordingly.

