Comments (7)

ayp-google commented on May 17, 2024

For retraining, we re-initialize from scratch for simplicity and find that it does not matter. You can try an experiment that reuses the weights from structure learning. Just be careful to copy the tensors correctly, as the TF graph will be different (if you resize the convolutions).

from morph-net.

pkch commented on May 17, 2024

The weights trained during the MorphNet structure learning phase are not intended for use in the final inference. The retraining is (at present) a required step to achieve good performance from the pruned model.

Of course, you can still examine the weights trained during the structure learning phase (for example, to analyze them, or to come up with your own extensions to MorphNet). They are available as checkpoints in the TensorFlow training directory, where the trainer (usually) saves them. This is no different from how you'd save/restore weights without MorphNet.

shishichang commented on May 17, 2024

@pkch Thank you for your reply. The weights trained during the MorphNet structure learning phase have also been well optimized, so they might be usable as an initialization for retraining the new structure.

eladeban commented on May 17, 2024

You could try to reuse the weights. We did not have a very positive experience with that, but it could work for you. In addition, there is some research that suggests that training from scratch is actually more useful.

I would also like to point out that, depending on the model architecture, it could be a bit tedious to map the weights to a new architecture.

ayp-google commented on May 17, 2024

Just to add to the previous comments, reusing the previous weights can be tricky because the network structure has changed. Thus, the shapes of the weight tensors need to change as well. For example, if you remove a channel from a convolution, then that filter needs to be removed from the weights, AND the convolutions consuming the output of the first convolution need to have weights removed as well because one of the inputs has been removed. In theory this is possible, but it is hard to implement.
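To make the shape bookkeeping concrete, here is a minimal NumPy sketch of the two-step slicing described above. It is not MorphNet code: the function name `prune_conv_pair`, the `keep` mask, and the tensor sizes are all hypothetical, and it assumes kernels in the TensorFlow layout `[height, width, in_channels, out_channels]` with the two convolutions directly connected.

```python
import numpy as np

def prune_conv_pair(w1, b1, w2, keep):
    """Drop output channels of conv1 and the matching input channels of conv2.

    Kernels are assumed to use the layout
    [height, width, in_channels, out_channels].
    """
    keep = np.asarray(keep, dtype=bool)
    w1_pruned = w1[:, :, :, keep]  # remove the filters of the first convolution
    b1_pruned = b1[keep]           # one bias per output channel
    w2_pruned = w2[:, :, keep, :]  # remove the inputs that no longer exist
    return w1_pruned, b1_pruned, w2_pruned

# Hypothetical shapes: conv1 maps 16 -> 8 channels, conv2 maps 8 -> 32.
rng = np.random.default_rng(0)
w1 = rng.normal(size=(3, 3, 16, 8))
b1 = np.zeros(8)
w2 = rng.normal(size=(3, 3, 8, 32))

# Hypothetical mask: channels 2 and 6 of conv1 were pruned.
keep = [True, True, False, True, True, True, False, True]

w1_p, b1_p, w2_p = prune_conv_pair(w1, b1, w2, keep)
print(w1_p.shape, b1_p.shape, w2_p.shape)
```

In a real network the same mask would also have to be applied to anything else indexed by that channel (for example batch-norm parameters), and layers joined by skip connections have to agree on which channels survive, which is a large part of why this is hard to implement in general.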

monkeyhippies commented on May 17, 2024

When retraining, are you supposed to use the same initialization, or does it not matter?

smohan10 commented on May 17, 2024

I have a general question about ResNet V1 50 on the ImageNet dataset.

After stage 1, let's say I take the alive_1000 JSON file written at training step 1000, work out how many activation channels each layer needs, and update model.py accordingly.
This discussion suggests retraining from scratch. What hyper-parameters should be used in this case? Has anyone tried to retrain a slimmed version of a ResNet model using the TF-Slim library?

I ran a few experiments with different hyper-parameters, but it seems difficult to get the model to converge. I am hoping someone has achieved convergence for this model.
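For the step of turning the exported file into per-layer channel counts, a small sketch along these lines may help. It assumes the file is a flat JSON object mapping op names to the number of alive channels; check that against your own export before relying on it. The helper names, op names, and counts below are made up for illustration, and the script writes its own stand-in file so that it runs without a real training directory.

```python
import json
import os
import tempfile

def load_alive_counts(path):
    """Read a JSON file assumed to map op names to alive channel counts."""
    with open(path) as f:
        counts = json.load(f)
    return {name: int(n) for name, n in counts.items()}

def fully_pruned(counts):
    """Names of ops with zero alive channels, i.e. layers that would vanish."""
    return sorted(name for name, n in counts.items() if n == 0)

# Stand-in for a real export; these op names and counts are hypothetical.
example = {
    "resnet_v1_50/conv1/Conv2D": 48,
    "resnet_v1_50/block1/unit_1/bottleneck_v1/conv1/Conv2D": 0,
    "resnet_v1_50/block1/unit_1/bottleneck_v1/conv2/Conv2D": 40,
}
path = os.path.join(tempfile.mkdtemp(), "alive_1000")
with open(path, "w") as f:
    json.dump(example, f)

counts = load_alive_counts(path)
for name, n in sorted(counts.items()):
    print(f"{name}: {n}")
print("fully pruned:", fully_pruned(counts))
```

Listing the ops with zero alive channels separately is useful because those layers cannot simply be narrowed in model.py; the block containing them has to be restructured or removed.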
