
Comments (2)

titu1994 commented on June 14, 2024

Are you sure you are converging within 1-2 epochs with Adam?

In any case, this technique should work with Adam as well, since it just scales Adam's final gradient updates. The improvements won't be as dramatic as with plain SGD, but there will still be small gains from the ensemble effect.

If you are converging that fast, then you can set the number of epochs to 10 or 20 instead of 300, and take a snapshot every 2-4 epochs.
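As a rough illustration of what a shorter run looks like, here is a minimal sketch of the cyclic cosine-annealed learning rate from the Snapshot Ensembles paper, applied per epoch rather than per iteration for simplicity. The function names `snapshot_lr` and `snapshot_epochs` are hypothetical and are not taken from the snapshot-ensembles repository; the returned value would be set as the optimizer's learning rate (SGD or Adam) at the start of each epoch.

```python
import math


def snapshot_lr(epoch, total_epochs, n_snapshots, lr_max):
    """Cyclic cosine-annealed learning rate for a 0-indexed epoch.

    Assumption: the schedule is stepped once per epoch; the paper
    steps it once per training iteration.
    """
    cycle_len = math.ceil(total_epochs / n_snapshots)
    pos = epoch % cycle_len  # position inside the current cycle
    return lr_max / 2 * (math.cos(math.pi * pos / cycle_len) + 1)


def snapshot_epochs(total_epochs, n_snapshots):
    """0-indexed epochs that end a cycle, i.e. where a snapshot would be saved."""
    cycle_len = math.ceil(total_epochs / n_snapshots)
    return [e for e in range(total_epochs) if (e + 1) % cycle_len == 0]


# 20 epochs with 5 snapshots gives one snapshot every 4 epochs.
print(snapshot_epochs(20, 5))
print([round(snapshot_lr(e, 20, 5, 0.1), 4) for e in range(8)])
```

The learning rate restarts at `lr_max` at the beginning of every cycle and decays toward zero just before each snapshot, which is what lets each snapshot settle into a different local minimum.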

I personally have stopped using this technique since the relative performance improvement is not worth the linear increase in evaluation time.

Since DenseNet FCN already has a long inference time, I would not suggest using this technique for production systems. For research, it is fine.

To answer your questions in the order you stated above:

  1. Yes, it can be used with Adam.
  2. Of course. It was originally meant to help SGD in the first place.
  3. I've only tried it on CIFAR-10. There, the default settings from the paper worked best.
  4. Yes, you can initialize the model with pretrained weights. However, the improvements won't be large, even with ensembled predictions.
  5. It does not affect training time. However, it increases prediction time linearly (by a factor equal to the number of snapshots) if you use ensemble averaging over all snapshots.
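To make point 5 concrete, here is a minimal sketch of ensemble averaging, assuming each snapshot is a callable that returns a list of class probabilities. The name `ensemble_predict` and the stand-in models are hypothetical; a real setup would load each saved snapshot's weights and call the framework's own predict function.

```python
def ensemble_predict(snapshots, x):
    """Average class probabilities over all snapshots.

    One forward pass is made per snapshot, so the cost of a prediction
    grows linearly with len(snapshots).
    """
    outputs = [model(x) for model in snapshots]
    n = len(outputs)
    # zip(*outputs) groups the probabilities of each class across snapshots.
    return [sum(class_probs) / n for class_probs in zip(*outputs)]


# Hypothetical stand-ins for two trained snapshots.
snapshot_a = lambda x: [0.2, 0.8]
snapshot_b = lambda x: [0.6, 0.4]

print(ensemble_predict([snapshot_a, snapshot_b], x=None))
```

Averaging over only the last few snapshots instead of all of them would reduce the prediction cost proportionally.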

from snapshot-ensembles.

titu1994 commented on June 14, 2024

Note that for point 4, weights pretrained on the smaller dataset can be a bad initialization for training on the full dataset when the full dataset's distribution differs from the smaller dataset's.

To partially avoid this, try to sample your smaller dataset in the same proportions as the large dataset. With semantic segmentation and such large quantities of data, though, I think the effect of poor initialization on the full set is going to be very small.
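One way to sample in the same proportions is a stratified subset. The sketch below assumes each sample has a single label to stratify on; for semantic segmentation that label is itself an assumption (for example the dominant class or the scene type of an image), since the thread does not say how the proportions would be measured. The function name `stratified_subset` is hypothetical.

```python
import random
from collections import defaultdict


def stratified_subset(labels, fraction, seed=0):
    """Return sorted indices of a subset that keeps each label's share
    roughly equal to its share in the full dataset."""
    rng = random.Random(seed)
    by_label = defaultdict(list)
    for idx, label in enumerate(labels):
        by_label[label].append(idx)

    chosen = []
    for indices in by_label.values():
        # Keep at least one sample of every label that appears.
        k = max(1, round(len(indices) * fraction))
        chosen.extend(rng.sample(indices, k))
    return sorted(chosen)


# Hypothetical labels: 80% "road" scenes, 20% "indoor" scenes.
labels = ["road"] * 80 + ["indoor"] * 20
subset = stratified_subset(labels, fraction=0.1)
print(len(subset), sum(labels[i] == "road" for i in subset))
```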

