Comments (7)
Hi, thanks for raising this issue.
Before you start to write code, let us have a discussion to see if the PR is necessary.
The following are my thoughts for discussion:
1. Your argument makes sense. How about this solution? In training, we pass the training task and the validation task to the trainer. In tests, we pass the test task as both the train and the test tasks and also set demo_mode=True (ref). This tells the trainer to only evaluate the policy on the test task, with no training done.
2. If my proposed solution in 1 seems reasonable to you, then in tests the best model is equivalent to the last model and the problem is solved.
I personally don't like the idea of early stopping, and the trainer saves the model snapshot with the best validation score. The mapping between this model and the training iteration can be traced in the training log.
from evojax.
Hey! Thanks for the quick response.
I agree with your points, it's definitely possible to keep the current interface. This usage pattern should be documented somewhere tho.
re: early stopping - it can be necessary in some scenarios. The model is indeed being saved after every iteration, but the trainer.run method doesn't return the information about the best model. You can find it in the logs, sure, but there has to be a programmatic way to do it. Otherwise it's impossible to load the best model automatically. Maybe a solution here is to log the model based on val_score, not on train_score. Then the models/best.npz model would mean "model with best validation score". What do you think about it?
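A minimal sketch of what a programmatic record of the best model could look like, assuming a validation score is available each time the model is evaluated. `BestModelTracker` and its fields are hypothetical names used for illustration only; they are not part of the evojax API, and the scores below are made up.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class BestModelTracker:
    """Hypothetical helper: remembers which iteration had the best validation score."""

    best_score: float = float("-inf")
    best_iter: Optional[int] = None

    def update(self, iteration: int, val_score: float) -> bool:
        """Record a score; return True when it is a new best (a snapshot should be saved)."""
        if val_score > self.best_score:
            self.best_score = val_score
            self.best_iter = iteration
            return True
        return False


tracker = BestModelTracker()
# Illustrative validation scores, not real results.
for it, score in enumerate([0.1, 0.4, 0.3, 0.5, 0.45], start=1):
    tracker.update(it, score)
```

With something like this returned from the training loop, the caller would know which iteration produced the saved best snapshot without parsing the log.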
from evojax.
> Maybe a solution here is to log the model based on val_score, not on train_score. Then the models/best.npz model would mean "model with best validation score". What do you think about it?
I made sure the best model (and its log too) was based on the validation score (related source code). Can you double-check this part?
from evojax.
Oh, you are right, sorry for the confusion.
from evojax.
@lerrytang so what about early stopping? Can we add early stopping patience and threshold parameters to the trainer? E.g. stop the training loop if the test score doesn't improve by threshold over the last patience iterations.
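To make the proposal concrete, here is a rough sketch of a patience/threshold rule, assuming the test scores are collected in a plain list and that higher is better. `should_stop` is a hypothetical helper written for this discussion, not an existing evojax function, and the scores are invented.

```python
from typing import List


def should_stop(scores: List[float], patience: int, threshold: float) -> bool:
    """Return True if the best of the last `patience` scores does not beat
    the best earlier score by more than `threshold`."""
    if len(scores) <= patience:
        # Not enough history yet to compare against.
        return False
    best_before = max(scores[:-patience])
    best_recent = max(scores[-patience:])
    return best_recent - best_before <= threshold


# Steadily improving scores: keep training.
improving = should_stop([1.0, 1.2, 1.4, 1.6], patience=3, threshold=0.0)

# Scores that dip for three evaluations: the rule fires.
dipping = should_stop([1.0, 0.8, 0.7, 0.6], patience=3, threshold=0.0)
```

Note that this rule fires on any dip that lasts `patience` evaluations, even if the score would have recovered later, so the choice of `patience` matters a lot.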
from evojax.
While it is common to use early stopping in supervised learning problems, early stopping can be misleading when solving tasks with neuroevolution (from my experience). For example, when training a locomotion controller, one often sees the learning curve (test scores) dip for quite a long time before rising again. You may think we can put a knob on how many iterations we should tolerate before we see any progress, but I don't think this extra hyper-parameter is worth the trouble.
from evojax.
So basically you are saying it's always better to run the ES for a lot of iterations? I'll do some reward/iter plotting for my problem that involves timeseries (which means my train and validation data might come from different distributions). It's probably very data-dependent. Will tell you the results once we add the custom log_fn :)
from evojax.