rwg_benchmarking's People

Contributors

declanoller

rwg_benchmarking's Issues

Simple variations: softmax outputs, no nonlinearities, etc.

There are several slight variations that would make sense to test. For example, for discrete action spaces, I currently just take an argmax across the outputs. It's possible that a softmax would be more effective for some envs.

Similarly, we use a nonlinearity right now, but it may not be necessary for some envs (see the winning agents for LunarLander-v2 and CartPole-v0 here: https://www.declanoller.com/2019/01/25/beating-openai-games-with-neuroevolution-agents-pretty-neat/ ; they are completely linear).

More broadly: make it possible to test many such variations for each env.
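
To make the variations above concrete, here is a minimal sketch of argmax vs. softmax action selection, with the nonlinearity made optional. It is not the repo's code: the function names (`forward`, `argmax_action`, `softmax_action`) and the single-layer, list-of-rows weight layout are assumptions chosen for illustration. It also assumes "softmax" means sampling an action from the softmax probabilities, since taking the argmax of a softmax picks the same action as a plain argmax.

```python
import math
import random


def forward(weights, obs, nonlinearity=None):
    """Hypothetical single-layer policy: one weight row per action.

    Pass nonlinearity=None for a completely linear policy,
    or e.g. math.tanh to squash each output.
    """
    outputs = [sum(w * x for w, x in zip(row, obs)) for row in weights]
    if nonlinearity is not None:
        outputs = [nonlinearity(o) for o in outputs]
    return outputs


def argmax_action(outputs):
    """Deterministic choice: index of the largest output."""
    return max(range(len(outputs)), key=lambda i: outputs[i])


def softmax(outputs):
    """Convert outputs to probabilities (max subtracted for numerical stability)."""
    m = max(outputs)
    exps = [math.exp(o - m) for o in outputs]
    total = sum(exps)
    return [e / total for e in exps]


def softmax_action(outputs, rng=random):
    """Stochastic choice: sample an action index with softmax probabilities."""
    return rng.choices(range(len(outputs)), weights=softmax(outputs), k=1)[0]
```

Keeping the nonlinearity and the action-selection rule as separate, swappable pieces is one way to let every combination be tested per env without duplicating the rollout loop.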

Add distributional stats

A single solve time or average score isn't very informative, because of randomness. For benchmarking, it would be better to run an ensemble of agents and form a distribution. Even 10 of them would give a sense of the spread.
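
As a rough sketch of what gathering these stats could look like, the snippet below runs an ensemble and summarizes the results. `run_once` is a hypothetical callable (not something from the repo) that performs one independent run and returns a single number, such as a solve time or an average score.

```python
import statistics


def summarize(scores):
    """Basic distributional stats for a list of per-run results."""
    ordered = sorted(scores)
    return {
        "n": len(ordered),
        "mean": statistics.mean(ordered),
        # Sample standard deviation needs at least two points.
        "std": statistics.stdev(ordered) if len(ordered) > 1 else 0.0,
        "min": ordered[0],
        "median": statistics.median(ordered),
        "max": ordered[-1],
    }


def run_ensemble(run_once, n_runs=10):
    """Call the hypothetical run_once(run_index) n_runs times and summarize."""
    return summarize([run_once(i) for i in range(n_runs)])
```

Storing the raw per-run list alongside the summary would also allow histograms or percentiles to be computed later.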

Make benchmark happen less frequently

I'm currently testing whether an agent has reached "benchmark level" as follows. Every agent produced is tested for N (usually 3) episodes, and the mean score is taken. If that mean score is better than the best mean score found so far (from previous agents), the agent is then tested for 100 episodes (typically) to see if it achieves the "benchmark 100 episode" score.

However, this slows things down too much: if the 3-episode average is the best found so far but still far below the benchmark score, it doesn't make much sense to test the agent for that long. Instead, I should only run the 100-episode test if an agent's 3-episode mean is >= 80% of the benchmark score. That threshold can be fine-tuned, but it should speed things up.
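
The gating idea above might be sketched as follows. All names here (`passes_gate`, `evaluate_agent`, `run_episode`) are hypothetical. One assumption beyond the issue text: for envs whose benchmark score is negative, "80% of the benchmark" would be a stricter bar than the benchmark itself, so the sketch instead allows a shortfall of 20% of the benchmark's magnitude. For positive benchmarks this is the same as 80% of the score.

```python
def passes_gate(quick_mean, best_mean, benchmark_score, frac=0.8):
    """True if the short-run mean is a new best AND near the benchmark."""
    # Assumption: tolerate a shortfall of (1 - frac) * |benchmark_score|,
    # which equals frac * benchmark_score when the benchmark is positive.
    threshold = benchmark_score - (1.0 - frac) * abs(benchmark_score)
    return quick_mean > best_mean and quick_mean >= threshold


def mean_score(run_episode, n):
    """Average score over n calls to the hypothetical run_episode()."""
    return sum(run_episode() for _ in range(n)) / n


def evaluate_agent(run_episode, best_mean, benchmark_score,
                   n_quick=3, n_benchmark=100, frac=0.8):
    """Return (quick_mean, benchmark_mean); benchmark_mean is None if skipped."""
    quick_mean = mean_score(run_episode, n_quick)
    benchmark_mean = None
    if passes_gate(quick_mean, best_mean, benchmark_score, frac):
        # Only now pay for the long evaluation.
        benchmark_mean = mean_score(run_episode, n_benchmark)
    return quick_mean, benchmark_mean
```

The caller would still update its best-so-far mean from `quick_mean` regardless of whether the long evaluation ran.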

Add option of using static random seed

There's randomness in the initial conditions of many envs that affects the outcomes. To be more systematic, it sometimes makes sense to specify a random seed value so that runs at different times can be compared.

Hopefully this would be overcome by gathering enough statistics (trials), but it should be done anyway.
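
One possible way to implement such an option is to derive a reproducible list of per-episode seeds from a single master seed, as sketched below. `episode_seeds` is a hypothetical helper, not part of the repo. How each seed is then handed to the env depends on the gym version (older releases expose `env.seed(seed)`, newer ones take `env.reset(seed=seed)`), so that step is left as a comment.

```python
import random


def episode_seeds(master_seed, n_episodes):
    """Derive per-episode seeds from one master seed.

    With master_seed=None, Python seeds the generator from system
    entropy, which gives the usual non-reproducible behavior.
    """
    rng = random.Random(master_seed)
    return [rng.randrange(2**31) for _ in range(n_episodes)]


# Example: the same master seed always yields the same episode seeds.
seeds = episode_seeds(master_seed=42, n_episodes=3)
for seed in seeds:
    # Pass `seed` to the env here, e.g. env.seed(seed) on older gym
    # versions or env.reset(seed=seed) on newer ones (version-dependent).
    pass
```

Using a separate seed per episode keeps the initial conditions varied across episodes while still making the whole run repeatable.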

Save solved episode recording

It would be good to add a recording of an episode with the best weights found, for each env.

However, last I checked, there's a very annoying bug in gym when recording multiple episodes or monitoring multiple envs. Solve this one. It's possible it may need to be done in a hacky way, after the main optimization runs.

Additionally, maybe add "grid"-style recordings like the ones here: https://www.declanoller.com/2019/01/25/beating-openai-games-with-neuroevolution-agents-pretty-neat/
