
Comments (11)

ppaquette avatar ppaquette commented on August 11, 2024
  • 3200% seems to be the maximum speed allowed by fceux.
  • Not sure how to skip the intro screens faster. Right now I start fceux, force values into the world and level memory addresses, skip frames until the timer starts to decrease, and then establish the pipe with Python.

Maximum efficiency could be achieved by coding directly in fceux (Lua), but that would make it incompatible with gym / Python.

Another option is probably to run iterations in parallel and update weights on a central server.

from gym-super-mario.

gabegrand avatar gabegrand commented on August 11, 2024

Hi Philip, we're still having some trouble achieving enough training iterations in a reasonable amount of time. The RL methods we're using are pretty standard (Q-learning, SARSA, approximate Q-learning), and they would be difficult to parallelize, since they rely on iterative updates where each step depends on the previous one. Training time is a major bottleneck for us, since we need to test several different algorithm variations and hyperparameter configurations in order to write our final paper for our course at Harvard.

If the emulator speed is already maxed out, we should look into ways to decrease the amount of time spent on the intro screens. Is there any way to skip all frames before the timer starts? Another approach to consider would be to keep the emulator open for the entire duration of training and manually reset the number of lives to 3 after every death. That way, the pipe with Python would only have to be established once for the whole training sequence. What do you think?

ppaquette avatar ppaquette commented on August 11, 2024

I'll try to see if I can skip the intro by saving the memory state.

What kind of % improvement do you need vs the current speed?
Should I only optimize the tiles version?

gabegrand avatar gabegrand commented on August 11, 2024

Currently, it takes approx. 4500s = 75 mins to train 100 iterations on World 1-3. That particular level has a cliff right at the beginning, so Mario usually dies very quickly, which means that the training speed we achieved of 45s / iteration on that level is probably a best case scenario. In order to make it serviceable, we'd ideally like to see a 10x increase in training speed, which would allow us to get close to 1000 iterations per hour. We would need that kind of speed in order to test out different combinations of hyperparameters of our model.
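A quick back-of-the-envelope sketch of the arithmetic above (variable names are illustrative):

```python
# Speedup target implied by the numbers above.
base_s_per_iter = 4500 / 100               # 45 s per iteration on World 1-3
target_s_per_iter = base_s_per_iter / 10   # a 10x speedup -> 4.5 s per iteration
iters_per_hour = 3600 / target_s_per_iter
print(iters_per_hour)  # -> 800.0, i.e. on the order of 1000 iterations per hour
```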

We're only using the tiles version, so from our perspective, it's fine if you'd like to focus on optimizing that. Thank you again for your efforts.

ppaquette avatar ppaquette commented on August 11, 2024

I should have something ready by Tuesday or Wednesday.

ppaquette avatar ppaquette commented on August 11, 2024

OpenAI released 'Universe' today, a way to convert any game to a gym env through a docker container (communication is done through VNC).

I'll do a quick patch for you, but I'll probably need to make this env compatible with Universe in the future.

Universe also has an A3C (asynchronous advantage actor-critic) learning algorithm available that can be run across a cluster (see https://github.com/openai/universe-starter-agent).

ppaquette avatar ppaquette commented on August 11, 2024

Pushed the fix to the 'gabegrand' branch. Mario is on steroids.

  • The info var now returns an 'iteration' key, which is incremented when the level is restarted.
  • You don't need to call reset(), except to initialize the env the first time.
  • To check whether Mario has completed the level, check the value of the distance key when the iteration key is incremented. The flag pole is 40 'meters' before the castle. The castle distances are listed here (e.g. if Mario reaches 2474 (2514 - 40) in level 1-3, he has successfully completed the level).

Here is a quick python script that works for me:

import gym
import ppaquette_gym_super_mario

env = gym.make('ppaquette/SuperMarioBros-1-3-Tiles-v0')
env.reset()

curr_iter = 1
max_iter = 2
while curr_iter <= max_iter:
    action = env.action_space.sample()  # random action
    obs, rew, done, info = env.step(action)
    if info['iteration'] > curr_iter:  # the level was restarted
        print('Max Distance Achieved', info['distance'])
        curr_iter = info['iteration']

env.close()
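The completion rule above can be factored into a small helper for the post-episode check. This is a sketch of my own: the function name and constant are illustrative, and the castle distance 2514 for World 1-3 is taken from the example above.

```python
# Completion rule from the comment above: the flag pole sits 40 'meters'
# before the castle, so the level is complete if the final distance
# reached (castle_distance - 40).
CASTLE_DISTANCE_1_3 = 2514  # castle position for World 1-3 (per the example above)

def completed_level(final_distance, castle_distance=CASTLE_DISTANCE_1_3):
    """Return True if the distance implies Mario reached the flag pole."""
    return final_distance >= castle_distance - 40

print(completed_level(2474))  # True  (2514 - 40 = 2474)
print(completed_level(2000))  # False
```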

gabegrand avatar gabegrand commented on August 11, 2024

Hi Philip, thanks for the fix. I see that the number of lives now starts at 9x, and that the info var / iteration key is behaving as expected. However, I'm still not really seeing an increase in the game speed - it seems to be running at roughly the same speed as before. Are you seeing significant speedup in the framerate on your end?

ppaquette avatar ppaquette commented on August 11, 2024

I just ran 100 episodes (random actions) on level 1-3, and it took 391.89 seconds (so ~ 900 episodes / hour).

Try running it in a cloud VM and compare it to my benchmark using random actions.

  • The game saves an initial state when first loaded, and reloads that state when Mario dies (much faster than killing and restarting fceux at every iteration), which should give roughly a 2x increase.
  • The game repeats every action for 6 frames (1 processed, 6 repeated; it used to be 1 processed, 1 repeated), which should give roughly a 5-6x increase.
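The benchmark above (100 episodes in 391.89 s, roughly 900 episodes/hour) can be reproduced with a small timing harness. This is a sketch: benchmark and run_one_episode are names of my own, and run_one_episode stands in for a real episode loop against the env.

```python
import time

def benchmark(run_one_episode, n_episodes=100):
    """Time n_episodes calls and return (elapsed seconds, episodes per hour)."""
    start = time.perf_counter()
    for _ in range(n_episodes):
        run_one_episode()
    elapsed = time.perf_counter() - start
    return elapsed, n_episodes * 3600.0 / elapsed

# Sanity check on the reported figure: 100 episodes in 391.89 s.
print(round(100 * 3600.0 / 391.89))  # -> 919, i.e. ~900 episodes / hour
```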

gabegrand avatar gabegrand commented on August 11, 2024

On World 1-3, running on my machine, the script you provided took 910.974s for 100 episodes. That's not quite up to what you recorded, but there is definitely some speedup from the previous version. Also, we no longer have to close and re-open the emulator every time, which is nice.

I have a couple questions / comments about the new code:

  • Previously, we had written our code to duplicate actions for a certain number of frames (otherwise, Mario's behavior is too frantic/jumpy, since he takes a new action on every frame). However, you mentioned that the game now repeats every action for 6 frames. Should we now remove this behavior from our code, to avoid repeating actions for too many frames?

  • Does the done variable in obs, rew, done, info = env.step(action) ever return True? Or do we need to just replace all done conditions with if (info['iteration'] > curr_iter)?

  • env.close() doesn't seem to work. The emulator just beachballs and never closes.

from gym-super-mario.

ppaquette avatar ppaquette commented on August 11, 2024
  1. Yes, you should remove the skip-actions logic from your code. If you want to adjust the value, just edit this line: https://github.com/ppaquette/gym-super-mario/blob/gabegrand/ppaquette_gym_super_mario/lua/super-mario-bros.lua#L48
  2. done will always be False, since reset() no longer needs to be called. You need to replace all done checks with info['iteration'] > curr_iter.
  3. Just kill the fceux process, or press Ctrl-C and close it manually.
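To make point 2 concrete, here is a sketch of an episode loop keyed on info['iteration'] instead of done. A minimal stand-in env (FakeMarioEnv, my own invention) is used so the control flow runs without fceux; with the real env you would call env.step(action) exactly as in the earlier script.

```python
class FakeMarioEnv:
    """Stand-in (not the real env): bumps 'iteration' every 5 steps."""
    def __init__(self):
        self.steps = 0

    def step(self, action):
        self.steps += 1
        info = {'iteration': 1 + self.steps // 5,
                'distance': 40 * self.steps}
        return None, 0.0, False, info  # done stays False, as noted above

def run_episodes(env, max_iter):
    """Run until 'iteration' exceeds max_iter; return distance reached per episode."""
    distances, curr_iter, last_distance = [], 1, 0
    while True:
        _, _, done, info = env.step(0)
        if info['iteration'] > curr_iter:    # episode boundary, not `done`
            distances.append(last_distance)  # distance reached before the restart
            curr_iter = info['iteration']
            if curr_iter > max_iter:
                return distances
        last_distance = info['distance']

print(run_episodes(FakeMarioEnv(), 2))  # -> [160, 360]
```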
