Comments (5)
I think that's a very interesting kind of environment.
David Duvenaud has done interesting work in this area. His paper Gradient-based Hyperparameter Optimization with Reversible Learning [arXiv, slides] describes one approach.
Check out this graph from the above paper, plotting loss vs the learning rate hyperparameter:
I think that sort of curve is common. If you start on the left, a simple convex optimizer might find its way down to the minimum. But if you start on the right, chaos. That's where prior knowledge is helpful, and where your idea of transfer learning from other domains might be a big win.
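That curve shape is easy to reproduce even in one dimension. As a toy sketch (a quadratic loss standing in for a real network; nothing here comes from the paper), small learning rates converge while rates past a stability threshold blow up:

```python
def final_loss(lr, steps=100):
    """Toy stand-in for "train a network with this learning rate":
    gradient descent on f(x) = x^2 starting from x = 1.0."""
    x = 1.0
    for _ in range(steps):
        x -= lr * 2.0 * x        # gradient of x^2 is 2x
        if abs(x) > 1e6:         # diverged -- the "chaos" regime
            return float("inf")
    return x * x

# Sweeping lr traces out the curve: flat and low on the left,
# blowing up past the stability threshold (here lr = 1.0).
losses = {lr: final_loss(lr) for lr in (0.01, 0.1, 0.5, 1.1, 2.0)}
```

Starting "on the left" every step shrinks the loss; starting "on the right" every step overshoots.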
One issue is that if the steps are very slow, it may be frustrating to test. So it'd be good to have some nontrivial networks that can converge in a fraction of a second for agent development purposes.
Another thought: the use case is different for model training in production vs research. In production, you care about improving convergence time and minimizing overfitting. In research many attempts don't converge at all, and you don't know if the model is wrong, the problem is unsolvable, or your hyperparameters need a slight tweak. I assume you're mainly thinking about the production case, but the research case would also be really interesting.
Anyway, we'd be delighted if you would contribute some hyperparameter choosing environments. I'm happy to answer questions about integrating into gym.
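To make the integration concrete, here is a minimal sketch of what such an environment's interface could look like. It only mirrors the gym `reset`/`step` contract; the class name is invented, and the reward is a made-up stand-in (distance of a proposed log-learning-rate from a hidden optimum) instead of actually training a model:

```python
import random

class HyperparamTuneEnvSketch:
    """Hypothetical hyperparameter-choosing environment following the
    gym reset/step convention.  A real version would train a model in
    step(); here the reward simply peaks at a hidden optimal setting."""

    def __init__(self, max_trials=10):
        self.max_trials = max_trials

    def reset(self):
        # Each episode hides a different optimal log10 learning rate.
        self.best_log_lr = random.uniform(-5.0, -1.0)
        self.trials = 0
        return 0.0  # first observation: no feedback yet

    def step(self, action):
        # action: the agent's proposed log10 learning rate.
        self.trials += 1
        # Stand-in for "train and evaluate with this setting".
        reward = max(0.0, 1.0 - abs(action - self.best_log_lr))
        done = self.trials >= self.max_trials
        return reward, reward, done, {}  # obs, reward, done, info
```

An agent would then run the usual loop: `obs = env.reset()`, followed by repeated `obs, r, done, info = env.step(a)` until `done`.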
from gym.
It's an interesting idea that's worth exploring. Several recommendations:
- It would be much more interesting if the datasets were not random; for example, using Penn Treebank, MNIST, CIFAR-10, and a few others like them would be much better.
- At the beginning of each episode, the first observation should represent the dataset, the architecture, and a depiction of all the different hyperparameters.
- It would be good to set things up so that each step would take no more than 10-20 minutes, at least for the early versions.
If all three points are done, it could become an interesting environment that would easily defeat pretty much all RL algorithms.
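The second point (an initial observation describing the task) could be encoded, for instance, as a flat feature vector. This is purely illustrative: the dataset statistics, architecture list, and hyperparameter names below are invented for the sketch, not part of any existing environment:

```python
import math

ARCHS = ["linear_svm", "mlp", "conv_net"]                       # illustrative
HPARAMS = ["learning_rate", "l2", "batch_size", "hidden_units"]  # illustrative

def encode_task(n_samples, n_features, n_classes, arch, tunable):
    """First observation of an episode: log-scaled dataset statistics,
    a one-hot architecture id, and a mask of tunable hyperparameters."""
    stats = [math.log10(n_samples), math.log10(n_features), float(n_classes)]
    arch_onehot = [1.0 if a == arch else 0.0 for a in ARCHS]
    mask = [1.0 if h in tunable else 0.0 for h in HPARAMS]
    return stats + arch_onehot + mask
```

For example, `encode_task(60000, 784, 10, "mlp", {"learning_rate", "hidden_units"})` yields a 10-dimensional vector the agent can condition on.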
Thank you everyone for your informative responses! I will convert my existing code into an environment and test it. Once I have something interesting, I will let you know.
@tlbtlbtlb:
Thanks for the fascinating reference! Indeed, maybe some years from now the training of deep nets will be done in a completely new way to accomplish better hyperparameter selection.
I agree that training steps should not take too much time. Therefore, for the first environments I think it makes sense to use small datasets (e.g. from the UCI ML repository) and "classical" ML models from sklearn (which usually do not take much time to train).
The different use cases of industry and academia are definitely something to consider. I think the objective of the environment should be engineered accordingly: for one environment, the performance measure would be proportional to the speed of convergence, the complexity of the model, and the achieved validation accuracy; for the other, the best achieved performance and the fraction of successfully converged trials. Also, for academic applications, control over the model should be more fine-grained.
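Those two objectives could be sketched as separate reward functions. The weights below are placeholders, not tuned values, and the signatures are invented for illustration:

```python
def production_reward(val_acc, train_seconds, n_params,
                      time_weight=0.01, size_weight=1e-7):
    """Production-style objective: reward validation accuracy,
    penalize slow training and large models."""
    return val_acc - time_weight * train_seconds - size_weight * n_params

def research_reward(best_val_acc, n_trials, n_converged):
    """Research-style objective: best result found, plus the
    fraction of trials that converged at all."""
    return best_val_acc + n_converged / max(n_trials, 1)
```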
- Indeed, in the end I think it only makes sense to use real datasets, as otherwise the results might not transfer into practice. Artificial data makes sense only for a "Hello World" environment of this kind, to serve as an example for more complicated ones.
- Right. For the initial environments, however, I will keep the set of optimized model parameters fixed per environment, to keep things simple.
- The most expensive step in my current code (a linear SVM on a modified MNIST dataset) usually takes less than 1 min. It might make sense to use a subset of MNIST to allow some more complicated models, or to use small datasets from the UCI ML repository.
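As a rough illustration of those timings, here is a sketch using sklearn's small bundled digits dataset as a stand-in for a MNIST subset (the subset size is arbitrary):

```python
import time
from sklearn.datasets import load_digits
from sklearn.svm import LinearSVC

X, y = load_digits(return_X_y=True)   # 1797 8x8 images, MNIST-like
X_small, y_small = X[:500], y[:500]   # subsetting keeps each step fast

start = time.time()
clf = LinearSVC(dual=False).fit(X_small, y_small)
elapsed = time.time() - start         # typically well under a second
```

At this scale, even a full hyperparameter sweep stays in the seconds range, which matters for agent development.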
- +1
- OK
- A linear SVM is a good starting point, but to make it interesting, even a neural net with 30 hidden units would most likely be a lot better than a linear SVM.
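A quick comparison on a synthetic nonlinear problem (sklearn's two-moons data, chosen here only because a linear model cannot separate it) illustrates the gap a 30-unit net can open up:

```python
from sklearn.datasets import make_moons
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier
from sklearn.svm import LinearSVC

X, y = make_moons(n_samples=2000, noise=0.2, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

svm_acc = LinearSVC(dual=False).fit(X_tr, y_tr).score(X_te, y_te)
mlp = MLPClassifier(hidden_layer_sizes=(30,), max_iter=2000, random_state=0)
mlp_acc = mlp.fit(X_tr, y_tr).score(X_te, y_te)
# The 30-unit net fits the curved decision boundary; the linear SVM cannot.
```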
I think this issue can be closed now.
As for the environments that I committed, I think the next step would be to make versions with different objectives, to represent different use cases as discussed with tlbtlbtlb.