Greetings.
Thank you very much for your work on porting the original TF2 implementation to PyTorch, as well as for open-sourcing it.
I have recently been looking into projects that port Dreamer-v1 / v2 to PyTorch for some experiments in my research, hence my interest in your code.
I was wondering if you had any insight on the following problems I ran into when trying to run the code from this repository.
1. 16-bit precision error
As a first attempt, I tried to run the default Dreamer agent on Atari Pong, using the same command as in the original implementation: `python3 dreamer.py --logdir ~/logdir/atari_pong/dreamerv2/1 --configs defaults atari --task atari_pong`.
However, after the experience buffer was pre-filled and training was about to start, the script returned the following error, which seems to come from Torch's AMP (mixed-precision) machinery:
```
(dreamer-torch) dreamer-torch$ python dreamer.py --logdir ./logdir/atari_pong/dreamerv2/2 --configs defaults atari --task atari_pong
Logdir logdir/atari_pong/dreamerv2/2
Create envs.
Prefill dataset (0 steps).
Eval episode has 934 steps and return -20.0.
[201372] eval_return -20.0 / eval_length 934.0 / eval_episodes 1.0
Invalid MIT-MAGIC-COOKIE-1 keySimulate agent.
[201372]
Start evaluation.
Eval episode has 832 steps and return -20.0.
[201372] eval_return -20.0 / eval_length 832.0 / eval_episodes 1.0
Start training.
[201372] fps 0.0
Traceback (most recent call last):
File "dreamer.py", line 317, in <module>
main(parser.parse_args(remaining))
File "dreamer.py", line 295, in main
state = tools.simulate(agent, train_envs, config.eval_every, state=state)
File "/home/d055/random/rl/dreamer-torch/tools.py", line 136, in simulate
action, agent_state = agent(obs, done, agent_state, reward)
File "dreamer.py", line 75, in __call__
self._train(next(self._dataset))
File "dreamer.py", line 144, in _train
metrics.update(self._task_behavior._train(start, reward)[-1])
File "/home/d055/random/rl/dreamer-torch/models.py", line 207, in _train
metrics.update(self._actor_opt(actor_loss, self.actor.parameters()))
File "/home/d055/random/rl/dreamer-torch/tools.py", line 481, in __call__
self._scaler.step(self._opt)
File "/home/d055/anaconda3/envs/dreamer-torch/lib/python3.8/site-packages/torch/cuda/amp/grad_scaler.py", line 337, in step
assert len(optimizer_state["found_inf_per_device"]) > 0, "No inf checks were recorded for this optimizer."
AssertionError: No inf checks were recorded for this optimizer.
```
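For reference, here is my understanding of the canonical `torch.cuda.amp` pattern (a minimal, self-contained sketch with a generic model and optimizer, not this repository's actual training code). As far as I can tell, the assertion fires inside `scaler.step(opt)` when no inf checks were recorded for that optimizer, e.g. when `scaler.scale(loss).backward()` did not produce gradients for the parameters that optimizer manages:

```python
# Minimal sketch of the expected torch.cuda.amp usage (generic model/optimizer,
# not the repository's code). scaler.step(opt) raises "No inf checks were
# recorded for this optimizer" if scaler.scale(loss).backward() did not record
# gradients for opt's parameters beforehand.
import torch

device = "cuda" if torch.cuda.is_available() else "cpu"
model = torch.nn.Linear(8, 1).to(device)
opt = torch.optim.Adam(model.parameters())
scaler = torch.cuda.amp.GradScaler(enabled=(device == "cuda"))

x = torch.randn(16, 8, device=device)
with torch.cuda.amp.autocast(enabled=(device == "cuda")):
    loss = model(x).square().mean()

opt.zero_grad()
scaler.scale(loss).backward()  # records scaled gradients (and inf checks)
scaler.step(opt)               # asserts if no inf checks exist for `opt`
scaler.update()
```

My (possibly wrong) guess is that, in this configuration, the actor loss does not produce gradients for the actor's parameters before `scaler.step` is called, but I may well be missing something.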
Do you happen to have any idea why this might be?
I am using Python 3.8.12, and installed the dependencies with `pip install -r requirements.txt` using the requirements file provided in the repository.
2. Slightly off-topic: using 32-bit precision
I tried changing the precision the scripts use by setting `--precision 32` (to my understanding, this is PyTorch's default precision), and the code does run without the previous error.
However, even after training the agent for around 16 million steps, there is no improvement in performance, unlike the curves you have provided for Atari Pong.
I was thus wondering whether the change of precision might be the reason for this, and would be interested in your opinion about it, as well as any recommendations for running the code in this repository so as to at least reproduce the original paper's results.
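(For what it's worth, I double-checked that 32-bit is indeed PyTorch's default with a one-line check:)

```python
import torch
print(torch.get_default_dtype())  # torch.float32, i.e. PyTorch defaults to 32-bit
```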
Thank you very much for your time.
Looking forward to hearing back from you.
Best regards.