Comments (11)
- 3200% seems to be the maximum allowed value by fceux.
- Not sure how to skip the intro screens faster. Right now I start fceux, force a value into the world and level memory addresses, skip frames until the timer starts to decrease, and then establish the pipe with python.
Maximum efficiency could be achieved by coding directly in fceux (lua), but that would make it incompatible with gym / python.
Another option is probably to run iterations in parallel and update weights on a central server.
from gym-super-mario.
Hi Philip, we're still having some trouble achieving enough training iterations in a reasonable amount of time. The RL methods we're using are pretty standard (Q-learning, SARSA, approximate Q learning), and they would be difficult to parallelize, since they require iterative updates that depend on previous calculations done in serial. The training time issue is very big for us, since we need to test out several different algorithm variations and hyperparameter configurations in order to write our final paper for our course at Harvard.
If the emulator speed is already maxed out, we should look into ways we can decrease the amount of time spent on the intro screens. Is there any way to skip all frames before the timer starts? Another approach to consider would be to keep the emulator open for the entire duration of the training, and manually reset the number of lives to 3x after every life. That way, you would only have to establish the pipe with python once during the whole training sequence. What do you think?
from gym-super-mario.
I'll try to see if I can skip the intro by saving the memory state.
What kind of % improvement do you need vs the current speed?
Should I only optimize the tiles version?
from gym-super-mario.
Currently, it takes approx. 4500s = 75 mins to train 100 iterations on World 1-3. That particular level has a cliff right at the beginning, so Mario usually dies very quickly, which means that the training speed we achieved of 45s / iteration on that level is probably a best case scenario. In order to make it serviceable, we'd ideally like to see a 10x increase in training speed, which would allow us to get close to 1000 iterations per hour. We would need that kind of speed in order to test out different combinations of hyperparameters of our model.
We're only using the tiles version, so from our perspective, it's fine if you'd like to focus on optimizing that. Thank you again for your efforts.
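As a rough check on the figures above, the arithmetic can be worked through in a few lines of Python. The function name is only illustrative, and the inputs are the numbers quoted in this comment (4500 s for 100 iterations, and a requested 10x speedup).

```python
def iterations_per_hour(total_seconds, iterations):
    """Convert a measured training run into an hourly iteration rate."""
    seconds_per_iteration = total_seconds / iterations
    return 3600.0 / seconds_per_iteration


current = iterations_per_hour(4500, 100)  # 45 s per iteration on World 1-3
target = current * 10                     # the 10x speedup asked for

print(f"current: {current:.0f} iterations/hour")
print(f"target:  {target:.0f} iterations/hour")
```

Under these numbers a 10x speedup gives about 800 iterations per hour, so "close to 1000" is on the optimistic side of the estimate.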
from gym-super-mario.
I should have something ready by Tuesday or Wednesday.
from gym-super-mario.
OpenAI released 'Universe' today, a way to convert any game to a gym env through a docker container (communication is done through VNC).
I'll do a quick patch for you, but I'll probably need to make this env compatible with Universe in the future.
Universe also has an A3C (asynchronous advantage actor-critic) learning algorithm available that can be run across a cluster (see https://github.com/openai/universe-starter-agent).
from gym-super-mario.
Pushed the fix to the 'gabegrand' branch. Mario is on steroids.
- The info var now returns an 'iteration' key, which is incremented each time the level is restarted.
- You don't need to call reset(), except once to initialize the env.
- To check whether Mario has completed the level, check the value of the 'distance' key when the 'iteration' key increases. The flag pole is 40 'meters' before the castle. The castle distances are here. (e.g. if Mario reaches 2474 (2514 - 40) in level 1-3, he has successfully completed the level).
Here is a quick python script that works for me:
import gym
import ppaquette_gym_super_mario  # importing registers the Mario envs with gym

env = gym.make('ppaquette/SuperMarioBros-1-3-Tiles-v0')
env.reset()  # only needed once, to initialize the env

curr_iter = 1
max_iter = 2
while curr_iter <= max_iter:
    action = env.action_space.sample()  # random action
    obs, rew, done, info = env.step(action)
    # 'iteration' increases whenever the level is restarted
    if info['iteration'] > curr_iter:
        print('Max Distance Achieved', info['distance'])
        curr_iter = info['iteration']
env.close()
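The completion check described above (flag pole 40 'meters' before the castle) could be sketched as a small helper. The names `FLAG_OFFSET`, `CASTLE_DISTANCE` and `completed_level` are hypothetical and not part of the package; only the 1-3 castle distance is given in this thread, so the table has a single entry.

```python
FLAG_OFFSET = 40  # flag pole is 40 'meters' before the castle

# Only the value for level 1-3 appears in this thread; other levels
# would have to be filled in from the castle distance list.
CASTLE_DISTANCE = {"1-3": 2514}


def completed_level(level, distance):
    """Return True if the reported distance reaches the flag pole."""
    return distance >= CASTLE_DISTANCE[level] - FLAG_OFFSET


print(completed_level("1-3", 2474))
print(completed_level("1-3", 1200))
```

In the script above, this would be called with info['distance'] at the moment info['iteration'] increases.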
from gym-super-mario.
Hi Philip, thanks for the fix. I see that the number of lives now starts at 9x, and that the info var / iteration key is behaving as expected. However, I'm still not really seeing an increase in the game speed - it seems to be running at roughly the same speed as before. Are you seeing significant speedup in the framerate on your end?
from gym-super-mario.
I just ran 100 episodes (random actions) on level 1-3, and it took 391.89 seconds (so ~ 900 episodes / hour).
Try running it in a cloud VM and compare it to my benchmark using random actions.
- The game saves an initial state when first loaded, and reloads that state when Mario dies (much faster than killing and restarting fceux at every iteration), which should give roughly a 2x increase.
- The game repeats every action for 6 frames (1 processed, 6 repeated; it used to be 1 processed, 1 repeated), which should give roughly a 5-6x increase.
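To see how the benchmark compares with these two estimates, here is a quick back-of-the-envelope calculation. It assumes the two gains simply multiply, and it uses the 45 s / iteration figure reported earlier in the thread as the baseline, even though that number came from a learning agent on a different machine, so treat the result as indicative only.

```python
baseline_s = 4500 / 100    # 45 s per iteration before the patch (reported earlier)
patched_s = 391.89 / 100   # benchmark above: random actions on level 1-3

measured = baseline_s / patched_s

# Assumption: state reload (~2x) and frame repeat (~5-6x) combine multiplicatively.
expected_low, expected_high = 2 * 5, 2 * 6

print(f"measured speedup: {measured:.1f}x")
print(f"expected range:   {expected_low}x to {expected_high}x")
print(f"episodes/hour:    {3600 / patched_s:.0f}")
```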
from gym-super-mario.
On World 1-3, running on my machine, the script you provided took 910.974s for 100 episodes. That's not quite up to what you recorded, but there is definitely some speedup from the previous version. Also, we no longer have to close and re-open the emulator every time, which is nice.
I have a couple questions / comments about the new code:
- Previously, we had written our code to duplicate actions for a certain number of frames (otherwise, Mario's behavior is too frantic/jumpy since he is constantly taking actions). However, you mentioned that the game now repeats every action for 6 frames. I'm wondering whether we should now remove this behavior from our code, to avoid repeating actions for too many frames?
- Does the `done` variable in `obs, rew, done, info = env.step(action)` ever return `True`? Or do we need to just replace all `done` conditions with `if (info['iteration'] > curr_iter)`?
- `env.close()` seems to be not working. The emulator just beachballs and doesn't close.
from gym-super-mario.
- Yes, you should remove the skip actions from your code. If you want to adjust the value, just edit this line: https://github.com/ppaquette/gym-super-mario/blob/gabegrand/ppaquette_gym_super_mario/lua/super-mario-bros.lua#L48
- `done` will always be false, since reset() doesn't need to be called. You need to replace all `done` checks with `info['iteration'] > curr_iter`.
- Just kill the fceux process, or press Ctrl+C and close it manually.
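For anyone adapting an existing training loop, the switch from `done` to the 'iteration' key might look roughly like the sketch below. `FakeMarioEnv` is a hypothetical stand-in that only mimics the behavior described in this thread (done always False, 'iteration' incremented on restart) so the loop can run without fceux; `run` is likewise an illustrative name.

```python
class FakeMarioEnv:
    """Hypothetical stand-in for the real env; Mario 'dies' every few steps."""

    def __init__(self, steps_per_life=5):
        self.steps_per_life = steps_per_life
        self.steps_in_life = 0
        self.iteration = 1

    def step(self, action):
        self.steps_in_life += 1
        distance = 10 * self.steps_in_life
        if self.steps_in_life == self.steps_per_life:
            self.iteration += 1       # level restarted
            self.steps_in_life = 0
        info = {"iteration": self.iteration, "distance": distance}
        return None, 0.0, False, info  # done is always False


def run(env, max_iter, choose_action):
    """Collect one distance reading per finished iteration."""
    distances = []
    curr_iter = 1
    while curr_iter <= max_iter:
        obs, rew, done, info = env.step(choose_action())
        if info["iteration"] > curr_iter:  # replaces the old `if done:` check
            distances.append(info["distance"])
            curr_iter = info["iteration"]
    return distances


print(run(FakeMarioEnv(), max_iter=3, choose_action=lambda: 0))
```

With the real env, `choose_action` would be the agent's policy, and any per-episode bookkeeping (Q-value updates at episode end, logging) would go inside the `if` branch.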
from gym-super-mario.