
Comments (5)

tlbtlbtlb commented on May 13, 2024

I think that's a very interesting kind of environment.

David Duvenaud has done interesting work in this area. His paper Gradient-based Hyperparameter Optimization with Reversible Learning [arXiv, slides] describes one approach.

Check out this graph from the above paper, plotting loss vs the learning rate hyperparameter:

[figure from the paper: training loss plotted against the learning rate hyperparameter]

I think that sort of curve is common. If you start on the left, a simple convex optimizer might find its way down to the minimum. But if you start on the right, chaos. That's where prior knowledge is helpful, and where your idea of transfer learning from other domains might be a big win.
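That left/right asymmetry shows up even in a toy setting. Here is a hypothetical one-dimensional illustration (gradient descent on f(x) = x², not taken from the paper): below a critical step size the loss shrinks geometrically, above it the iterates oscillate and blow up.

```python
def gd_final_loss(lr, steps=100):
    """Run gradient descent on f(x) = x^2 from x = 1 and return the
    final loss.  The update is x <- x - lr * f'(x) = x * (1 - 2*lr),
    so it converges for lr < 1 and diverges for lr > 1."""
    x = 1.0
    for _ in range(steps):
        x -= lr * 2 * x
    return x * x
```

A small learning rate lands near the minimum; a large one produces the "chaos" regime on the right of the curve.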

One issue is that if the steps are very slow, it may be frustrating to test. So it'd be good to have some nontrivial networks that can converge in a fraction of a second for agent development purposes.

Another thought: the use case is different for model training in production vs research. In production, you care about improving convergence time and minimizing overfitting. In research many attempts don't converge at all, and you don't know if the model is wrong, the problem is unsolvable, or your hyperparameters need a slight tweak. I assume you're mainly thinking about the production case, but the research case would also be really interesting.

Anyway, we'd be delighted if you would contribute some hyperparameter choosing environments. I'm happy to answer questions about integrating into gym.

from gym

ilyasu123 commented on May 13, 2024

It's an interesting idea that's worth exploring. Several recommendations:

  1. It would be much more interesting if the datasets were not random; for example, using Penn Treebank, MNIST, CIFAR-10, and a few more like them would be much better.
  2. At the beginning of each episode, the first observation should represent the dataset, the architecture, and a depiction of all the different hyperparameters.
  3. It would be good to set things up so that each step would take no more than 10-20 minutes, at least for the early versions.
If all three points are done, it could become an interesting environment that would easily defeat pretty much all RL algorithms.
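As a rough illustration of what such an environment's interface might look like, here is a plain-Python sketch of the gym-style step/reset loop. Everything in it is hypothetical: the class name, the fixed observation size, and the toy stand-in for "train the model and measure validation loss".

```python
import numpy as np

class HyperparamEnv:
    """Hypothetical sketch: one episode = one sequence of training runs
    on a fixed dataset; each step picks a log10 learning rate and the
    reward is the negative validation loss."""

    def __init__(self, horizon=10):
        self.horizon = horizon
        self.t = 0

    def reset(self):
        self.t = 0
        # Per point 2 above, the first observation should describe the
        # dataset, architecture, and tunable hyperparameters; here it is
        # just a placeholder feature vector of fixed size.
        return np.zeros(4)

    def step(self, log_lr):
        self.t += 1
        # Toy stand-in for "train with this learning rate and measure
        # validation loss": a noisy convex bowl centered at log_lr = -2.
        loss = (log_lr + 2.0) ** 2 + 0.01 * np.random.rand()
        done = self.t >= self.horizon
        return np.array([loss]), -loss, done, {}
```

A real version would replace the toy loss with an actual training run and subclass `gym.Env` with proper `action_space` and `observation_space` declarations.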


iaroslav-ai commented on May 13, 2024

Thank you everyone for your informative responses! I will convert my existing code into an environment and test it. As soon as I have something interesting, I will let you know.

@tlbtlbtlb:
Thanks for the fascinating reference! Indeed, maybe some years from now the training of deep nets will be done in a completely new way to achieve better hyperparameter selection.

I agree that training steps should not take too much time. Therefore, for the first environments I think it makes sense to use small datasets (e.g. from the UCI ML repository) and "classical" ML models from sklearn (which usually do not take much time to train).
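For example, one cheap environment "step" could look like the following (a sketch using sklearn's built-in digits dataset as a stand-in for a small UCI one; the `evaluate` helper name is made up):

```python
from sklearn.datasets import load_digits
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

# Small built-in dataset; a "classical" sklearn model trains on it
# in well under a second.
X, y = load_digits(return_X_y=True)
X_train, X_val, y_train, y_val = train_test_split(X, y, random_state=0)

def evaluate(C):
    """One environment step: train with regularization strength C and
    return validation accuracy as the reward signal."""
    clf = LogisticRegression(C=C, max_iter=1000)
    clf.fit(X_train, y_train)
    return clf.score(X_val, y_val)
```

With steps this fast, an agent can be debugged interactively before moving to larger datasets.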

The different use cases of industry and academia are definitely something to consider. I think the objective of the environment should be engineered accordingly: for one environment the performance measure would be proportional to speed of convergence, model complexity, and achieved validation accuracy; for the other, the best achieved performance and the fraction of successfully converged trials. Also, for academic applications, control over the model should be more fine-grained.
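A minimal sketch of how those two objectives could be encoded as reward functions (the function names and weights here are purely illustrative, not tuned):

```python
def production_reward(val_accuracy, train_seconds, n_params,
                      acc_weight=1.0, time_weight=0.01, size_weight=1e-6):
    # Production case: reward accuracy, penalize slow convergence
    # and large models.
    return (acc_weight * val_accuracy
            - time_weight * train_seconds
            - size_weight * n_params)

def research_reward(best_accuracy, n_converged, n_trials):
    # Research case: best result achieved, plus the fraction of
    # trials that converged at all.
    return best_accuracy + n_converged / n_trials
```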

@ilyasu123:

  1. Indeed, in the end I think it only makes sense to use real datasets, as otherwise results might not transfer into practice. Artificial data makes sense only for a "Hello World" environment of this kind, as an example for more complicated ones.
  2. Right. For the initial environments, however, I will keep the set of optimized model parameters fixed per environment, to keep things simple.
  3. The most expensive step in my current code (on a modified MNIST dataset, with a linear SVM) usually takes less than 1 min. It might make sense to use a subset of MNIST to allow some more complicated models, or to use small datasets from the UCI ML repository.


ilyasu123 commented on May 13, 2024

@iaroslav-ai

  1. +1
  2. OK
  3. A linear SVM is a good starting point, but to make it interesting, even a neural net with 30 hidden units would most likely be a lot better than a linear SVM.
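A 30-hidden-unit net can still train in seconds on a small dataset while exposing more interesting hyperparameters (learning rate, regularization, etc.) than a linear SVM. A hypothetical setup with sklearn's `MLPClassifier`, again using the built-in digits data as a stand-in:

```python
from sklearn.datasets import load_digits
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier

X, y = load_digits(return_X_y=True)
X_train, X_val, y_train, y_val = train_test_split(X, y, random_state=0)

# One hidden layer of 30 units; learning_rate_init and alpha are the
# kind of knobs an agent would be asked to tune.
net = MLPClassifier(hidden_layer_sizes=(30,),
                    learning_rate_init=1e-3, alpha=1e-4,
                    max_iter=300, random_state=0)
net.fit(X_train, y_train)
acc = net.score(X_val, y_val)
```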


iaroslav-ai commented on May 13, 2024

I think this issue can be closed now.
As for the environments that I committed, I think the next step would be to make versions with different objectives, to represent the different use cases discussed with tlbtlbtlb.

