
Comments (7)

lerrytang commented on August 23, 2024

Hi, thanks for raising this issue.
Before you start to write code, let us have a discussion to see if the PR is necessary.
The following are my thoughts for discussion:

  1. Your argument makes sense. How about this solution? In training, we pass the training task and the validation task to the trainer. In tests, we pass the test task as both the train and the test tasks and also set demo_mode=True (ref). This will tell the trainer to only evaluate the policy on the test task and no training is done.
  2. If my proposed solution in 1 seems reasonable to you, in tests the best model is equivalent to the last model and the problem is solved.

I personally don't like the idea of early stopping, and it isn't needed to recover the best model: the trainer already saves the model snapshot with the best validation score. The mapping between this model and the training iteration can be traced in the training log.

from evojax.

danielgafni commented on August 23, 2024

Hey! Thanks for the quick response.

I agree with your points; it's definitely possible to keep the current interface. This usage pattern should be documented somewhere, though.

re: early stopping - it can be necessary in some scenarios. The model is indeed being saved after every iteration, but the trainer.run method doesn't return the information about the best model. You can find it in the logs, sure, but there has to be a programmatic way to do it. Otherwise it's impossible to load the best model automatically. Maybe a solution here is to log the model based on val_score, not on train_score. Then the models/best.npz model would mean "model with best validation score". What do you think about it?
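To make "a programmatic way" concrete, here is a minimal sketch of the kind of handle I mean. It assumes nothing about EvoJAX internals: `BestModelTracker` and its fields are hypothetical names, and the scores are made-up numbers used only to exercise the code.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class BestModelTracker:
    """Hypothetical helper: remembers which iteration had the best validation score."""

    best_score: float = float("-inf")
    best_iter: Optional[int] = None

    def update(self, iteration: int, val_score: float) -> bool:
        """Record a validation score; return True if it is a new best."""
        if val_score > self.best_score:
            self.best_score = val_score
            self.best_iter = iteration
            return True
        return False


tracker = BestModelTracker()
# Illustrative (iteration, validation score) pairs, not real results.
for iteration, val_score in [(10, 0.2), (20, 0.5), (30, 0.4)]:
    tracker.update(iteration, val_score)
```

If something like `tracker.best_iter` were returned by the training call, the caller could load the matching snapshot without parsing the logs.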


lerrytang commented on August 23, 2024

> Maybe a solution here is to log the model based on val_score, not on train_score. Then the models/best.npz model would mean "model with best validation score". What do you think about it?

I made sure the best model (and its log) was based on the validation score (related source code). Can you double-check this part?


danielgafni commented on August 23, 2024

Oh, you are right, sorry for the confusion.


danielgafni commented on August 23, 2024

@lerrytang So what about early stopping? Can we add early stopping patience and threshold parameters to the trainer? E.g. stop the training loop if the test score doesn't improve by threshold over the last patience iterations.
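A rough sketch of the rule I have in mind, just to make the proposal concrete. `EarlyStopper` and `stopping_index` are hypothetical names, not existing EvoJAX code, and the scores are invented for illustration.

```python
from typing import Optional, Sequence


class EarlyStopper:
    """Hypothetical patience/threshold rule; not part of EvoJAX."""

    def __init__(self, patience: int, threshold: float = 0.0) -> None:
        self.patience = patience
        self.threshold = threshold
        self.best = float("-inf")
        self.stale = 0  # evaluations since the last sufficient improvement

    def should_stop(self, score: float) -> bool:
        """Feed one test score; return True once patience is exhausted."""
        if score > self.best + self.threshold:
            self.best = score
            self.stale = 0
        else:
            self.stale += 1
        return self.stale >= self.patience


def stopping_index(scores: Sequence[float], patience: int,
                   threshold: float = 0.0) -> Optional[int]:
    """Index of the evaluation at which training would stop, or None."""
    stopper = EarlyStopper(patience, threshold)
    for i, score in enumerate(scores):
        if stopper.should_stop(score):
            return i
    return None


# Invented test scores: a plateau followed by a late jump.
scores = [1.0, 1.05, 1.02, 1.01, 3.0]
short_patience = stopping_index(scores, patience=2, threshold=0.1)
long_patience = stopping_index(scores, patience=5, threshold=0.1)
```

With these made-up scores, a patience of 2 stops during the plateau and never sees the late jump, while a patience of 5 runs to the end, so the outcome depends heavily on how patience is chosen.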


lerrytang commented on August 23, 2024

While it is common to use early stopping in supervised learning problems, in my experience early stopping is misleading when solving tasks with neuroevolution. For example, when training a locomotion controller, one often sees the learning curve (test scores) dip for quite a long time before rising again. You may think we can add a knob for how many iterations to tolerate without progress, but I don't think this extra hyper-parameter is worth the trouble.


danielgafni commented on August 23, 2024

So basically you are saying it's always better to run the ES for a lot of iterations? I'll do some reward/iter plotting for my problem that involves timeseries (which means my train and validation data might come from different distributions). It's probably very data-dependent. Will tell you the results once we add the custom log_fn :)

