
Comments (9)

w4nderlust commented on May 18, 2024

Yes, it saves the best weights whenever performance on the validation set improves. I can think about adding a --skip_save_best_model parameter, but then the model will never be saved. There could be some use cases where that is useful (hyperparameter search, for instance). I may also try to find a way to keep a copy of the best model weights in memory and save them only at the end. Will add this to the list of enhancements.
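For illustration only, here is a minimal sketch of the "keep the best weights in memory and save once at the end" idea. It is not Ludwig's implementation: `train_keep_best_in_memory`, `train_step`, `validate`, and `save_path` are hypothetical names, the "model" is a plain dict, and a higher validation score is assumed to be better.

```python
import copy
import pickle
import tempfile
from pathlib import Path


def train_keep_best_in_memory(weights, train_step, validate, epochs, save_path):
    """Track the best weights in memory and write them to disk once, at the end.

    `train_step`, `validate`, and `save_path` are hypothetical stand-ins,
    not Ludwig functions or parameters.
    """
    best_score = float("-inf")
    best_weights = copy.deepcopy(weights)
    for _ in range(epochs):
        weights = train_step(weights)
        score = validate(weights)  # assumption: higher is better
        if score > best_score:  # validation improved
            best_score = score
            best_weights = copy.deepcopy(weights)  # in-memory copy, no disk I/O
    with open(save_path, "wb") as f:  # single write after training finishes
        pickle.dump(best_weights, f)
    return best_weights, best_score


# Toy usage: the "model" is one number that drifts past its optimum at 3.0.
def toy_step(w):
    return {"w": w["w"] + 1.0}


def toy_validate(w):
    return -abs(w["w"] - 3.0)


out_path = Path(tempfile.mkdtemp()) / "best_weights.pkl"
best, score = train_keep_best_in_memory(
    {"w": 0.0}, toy_step, toy_validate, epochs=6, save_path=out_path
)
```

The trade-off is one extra copy of the weights held in memory in exchange for a single disk write, which matters most when epochs are short and the model is small.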

from ludwig.

w4nderlust commented on May 18, 2024

Ludwig saves both the latest weights and the best weights. You can turn off saving the latest weights with skip_save_progress_weights, but you can't turn off saving the best weights, because, well, that's the whole purpose of training. Why would you not want to save the best weights?


athlonshi commented on May 18, 2024

I meant saving the best weights to disk. Isn't that what it does? @w4nderlust


athlonshi commented on May 18, 2024

Thanks, this will improve the performance for small training problems. Also, I've sent you a Jupyter notebook and a test data set for a regression problem showing the performance issue. What matters more is that the regression results are so different between Ludwig and Keras with the same parameter settings, and the Keras results make more sense. If you want me to open a new issue to describe it, I can. But I suspect it could be due to the way biases are initialized. @w4nderlust


w4nderlust commented on May 18, 2024

Those are two separate issues (speed and predictions), but I will look at both of them, so there is no need to open another issue. Thanks so much for the effort.


athlonshi commented on May 18, 2024

Thanks a lot. Just a bit more information: I have also tried a synthetic linear regression problem, Y = a * X + b + c * normal(0, 1), and trained a single-layer, single-neuron network on it. The results are very sensitive to how I initialize the network.
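A rough way to reproduce this kind of experiment outside any framework is sketched below with NumPy. It is not the notebook mentioned in this thread: the values of a, b, and c, the input range, the learning rate, the step count, and the two starting points are all assumptions chosen for illustration. For a single linear neuron trained with mean squared error the loss is convex, so with a small enough learning rate and enough steps, gradient descent should reach the same least-squares fit from any start.

```python
import numpy as np


def make_data(a, b, c, n, rng):
    """Sample Y = a * X + b + c * normal(0, 1) with X uniform in [-1, 1]."""
    x = rng.uniform(-1.0, 1.0, size=n)
    y = a * x + b + c * rng.normal(0.0, 1.0, size=n)
    return x, y


def fit_single_neuron(x, y, w0, b0, lr=0.1, steps=2000):
    """Full-batch gradient descent on mean squared error for y_hat = w * x + b."""
    w, b = float(w0), float(b0)
    for _ in range(steps):
        err = w * x + b - y
        w -= lr * 2.0 * np.mean(err * x)  # d(MSE)/dw
        b -= lr * 2.0 * np.mean(err)      # d(MSE)/db
    return w, b


rng = np.random.default_rng(0)
# Assumed ground truth for illustration: a=2, b=5, noise scale c=0.1.
x, y = make_data(a=2.0, b=5.0, c=0.1, n=500, rng=rng)

# Closed-form least-squares line as a reference (slope, intercept).
w_ref, b_ref = np.polyfit(x, y, deg=1)

# Train from two very different (weight, bias) initializations.
results = {}
for w0, b0 in [(0.0, 0.0), (10.0, -10.0)]:
    results[(w0, b0)] = fit_single_neuron(x, y, w0, b0)
```

With a small `steps` value the two starts have not yet converged and end at different parameters, so comparing against `w_ref` and `b_ref` helps separate "not trained long enough" from a genuine difference between frameworks.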


w4nderlust commented on May 18, 2024

With the pushed fix, the speed issue is in large part solved. There are now parameters to turn off saving of intermediate models, progress, and logs. On small models the speed is now comparable to native TF (still slightly slower because of keeping track of statistics and generating placeholders, but pretty close), while for bigger models, bigger datasets, and bigger batch sizes the difference is negligible.

Still working on the prediction issue.


athlonshi commented on May 18, 2024


w4nderlust commented on May 18, 2024

The regression instability issue should now be solved. Please reopen if it is still an issue.
