dxyang / DQN_pytorch
Vanilla DQN, Double DQN, and Dueling DQN implemented in PyTorch
Thanks for your wonderful code. I only reuse the logic of your training part in my code, but I found that my model gets worse as training proceeds. When I edit
Line 225 in 43fe371
clipped_error = 1.0 * bellman_error
the model works well.

Hi, if I run the code for Breakout, I get the following error:
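The symptom described above (training degrades unless the error-clipping line is replaced by a plain copy) is consistent with a sign or clipping bug in a hand-rolled Huber-style loss. A minimal sketch of the usual TD-error clipping, written with NumPy rather than the repo's PyTorch code; `bellman_error` and the [-1, 1] range are the only details taken from the snippet above:

```python
import numpy as np

def clip_bellman_error(bellman_error, limit=1.0):
    """Clamp the TD error to [-limit, limit] before backpropagation.

    Clipping the error like this is equivalent in gradient to a Huber
    loss: quadratic near zero, linear for large errors.
    """
    return np.clip(bellman_error, -limit, limit)

errors = np.array([-3.0, -0.5, 0.2, 2.5])
print(clip_bellman_error(errors))  # [-1.  -0.5  0.2  1. ]
```

A common pitfall in hand-rolled versions is negating the clipped error inconsistently with how it is passed to `backward()`; note that replacing the clip with `1.0 * bellman_error` removes the clipping entirely rather than fixing any sign issue.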
Traceback (most recent call last):
File "main.py", line 120, in
main()
File "main.py", line 117, in main
atari_learn(env, task.env_id, num_timesteps=task.max_timesteps, double_dqn=double_dqn, dueling_dqn=dueling_dqn)
File "main.py", line 72, in atari_learn
dueling_dqn=dueling_dqn
File "/home/ashutosh/repos/DQN_pytorch/learn.py", line 229, in dqn_learning
q_s_a.backward(clipped_error.data.unsqueeze(1))
File "/usr/local/lib/python2.7/dist-packages/torch/autograd/variable.py", line 167, in backward
torch.autograd.backward(self, gradient, retain_graph, create_graph, retain_variables)
File "/usr/local/lib/python2.7/dist-packages/torch/autograd/init.py", line 99, in backward
variables, grad_variables, retain_graph)
RuntimeError: invalid argument 3: Index tensor must have same dimensions as input tensor at /pytorch/torch/lib/THC/generic/THCTensorScatterGather.cu:199
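The RuntimeError above comes from `torch.gather`/`scatter`: the index tensor must have the same number of dimensions as the input tensor, which is why the code calls `unsqueeze(1)` before `backward`. The same dimension rule can be demonstrated with NumPy's `take_along_axis`, which enforces an identical contract (the shapes here are illustrative, not taken from the repo):

```python
import numpy as np

# A batch of Q-values: 4 states x 3 actions.
q_values = np.arange(12, dtype=float).reshape(4, 3)
actions = np.array([0, 2, 1, 2])  # shape (4,): one chosen action per state

# Like torch.gather, take_along_axis requires the index array to have the
# same number of dimensions as the input, so add a trailing axis first.
q_s_a = np.take_along_axis(q_values, actions[:, None], axis=1)  # shape (4, 1)
print(q_s_a.squeeze(1))  # [ 0.  5.  7. 11.]
```

Passing `actions` without the extra axis raises the dimension-mismatch error; in the PyTorch code the fix is the analogous `unsqueeze` on whichever tensor has one dimension fewer than the other.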
Traceback (most recent call last):
File "main.py", line 120, in
main()
File "main.py", line 117, in main
atari_learn(env, task.env_id, num_timesteps=task.max_timesteps, double_dqn=double_dqn, dueling_dqn=dueling_dqn)
File "main.py", line 72, in atari_learn
dueling_dqn=dueling_dqn
File "/home/sotirisnik/other/DQN_pytorch/learn.py", line 149, in dqn_learning
obs, reward, done, info = env.step(action)
File "/home/sotirisnik/anaconda3/envs/my_env/lib/python3.6/site-packages/gym/core.py", line 99, in step
return self._step(action)
File "/home/sotirisnik/other/DQN_pytorch/utils/atari_wrappers.py", line 132, in _step
obs, reward, done, info = self.env.step(action)
File "/home/sotirisnik/anaconda3/envs/my_env/lib/python3.6/site-packages/gym/core.py", line 99, in step
return self._step(action)
File "/home/sotirisnik/other/DQN_pytorch/utils/atari_wrappers.py", line 124, in _step
obs, reward, done, info = self.env.step(action)
File "/home/sotirisnik/anaconda3/envs/my_env/lib/python3.6/site-packages/gym/core.py", line 99, in step
return self._step(action)
File "/home/sotirisnik/anaconda3/envs/my_env/lib/python3.6/site-packages/gym/core.py", line 279, in _step
return self.env.step(action)
File "/home/sotirisnik/anaconda3/envs/my_env/lib/python3.6/site-packages/gym/core.py", line 99, in step
return self._step(action)
File "/home/sotirisnik/other/DQN_pytorch/utils/atari_wrappers.py", line 93, in _step
obs, reward, done, info = self.env.step(action)
File "/home/sotirisnik/anaconda3/envs/my_env/lib/python3.6/site-packages/gym/core.py", line 99, in step
return self._step(action)
File "/home/sotirisnik/anaconda3/envs/my_env/lib/python3.6/site-packages/gym/core.py", line 279, in _step
return self.env.step(action)
File "/home/sotirisnik/anaconda3/envs/my_env/lib/python3.6/site-packages/gym/core.py", line 99, in step
return self._step(action)
File "/home/sotirisnik/other/DQN_pytorch/utils/atari_wrappers.py", line 53, in _step
obs, reward, done, info = self.env.step(action)
File "/home/sotirisnik/anaconda3/envs/my_env/lib/python3.6/site-packages/gym/core.py", line 99, in step
return self._step(action)
File "/home/sotirisnik/anaconda3/envs/my_env/lib/python3.6/site-packages/gym/wrappers/monitoring.py", line 33, in _step
observation, reward, done, info = self.env.step(action)
File "/home/sotirisnik/anaconda3/envs/my_env/lib/python3.6/site-packages/gym/core.py", line 99, in step
return self._step(action)
File "/home/sotirisnik/anaconda3/envs/my_env/lib/python3.6/site-packages/gym/wrappers/time_limit.py", line 36, in _step
observation, reward, done, info = self.env.step(action)
File "/home/sotirisnik/anaconda3/envs/my_env/lib/python3.6/site-packages/gym/core.py", line 99, in step
return self._step(action)
File "/home/sotirisnik/anaconda3/envs/my_env/lib/python3.6/site-packages/gym/envs/atari/atari_env.py", line 73, in _step
action = self._action_set[a]
IndexError: only integers, slices (`:`), ellipsis (`...`), numpy.newaxis (`None`) and integer or boolean arrays are valid indices
[2018-06-27 02:58:15,904] Finished writing results. You can upload them to the scoreboard via gym.upload('/home/sotirisnik/other/DQN_pytorch/tmp/BreakoutNoFrameskip-v4')
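The IndexError above arises because `env.step` is handed a NumPy array (or a one-element tensor) rather than a plain Python int, so the `self._action_set[a]` lookup in `atari_env.py` fails. A minimal sketch of the usual fix, converting the sampled action with `.item()` before stepping; the fake env below only stands in for the real Atari env to make the example self-contained:

```python
import numpy as np

class FakeAtariEnv:
    """Stand-in for an Atari env: indexes an action set like atari_env.py."""
    def __init__(self):
        self._action_set = [0, 1, 3, 4]  # e.g. NOOP, FIRE, RIGHT, LEFT

    def step(self, a):
        action = self._action_set[a]  # fails if `a` is a 1-element array
        return "obs", 0.0, False, {"action": action}

env = FakeAtariEnv()
raw_action = np.array([2])  # what an argmax over a batched Q-output returns
obs, reward, done, info = env.step(raw_action.item())  # .item() -> plain int
print(info["action"])  # 3
```

Passing `raw_action` directly reproduces the IndexError; `.item()` (or `int(...)` on a scalar) is the standard remedy.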
I know Gym has moved to Gymnasium, but some functions have changed. I tried to make the transition myself but could not get it working. Would it be possible for you to update the code so we can test your amazing work? I would be very pleased if so. Thanks!
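For the Gymnasium transition, the main breaking changes are that `reset()` now returns `(obs, info)` and `step()` returns a 5-tuple `(obs, reward, terminated, truncated, info)` instead of the old 4-tuple. A sketch of a compatibility adapter that presents the legacy Gym interface expected by `learn.py` on top of a Gymnasium-style env; the fake env below only stands in for `gymnasium.make(...)` so the example is self-contained:

```python
class GymnasiumStyleEnv:
    """Stand-in for a Gymnasium env: step() returns a 5-tuple."""
    def reset(self, seed=None):
        return "obs0", {}  # (observation, info)

    def step(self, action):
        # (obs, reward, terminated, truncated, info)
        return "obs1", 1.0, False, True, {}

class OldGymCompat:
    """Adapter exposing the legacy Gym 4-tuple API expected by old DQN code."""
    def __init__(self, env):
        self.env = env

    def reset(self):
        obs, _info = self.env.reset()
        return obs

    def step(self, action):
        obs, reward, terminated, truncated, info = self.env.step(action)
        done = terminated or truncated  # old `done` folds both flags together
        return obs, reward, done, info

env = OldGymCompat(GymnasiumStyleEnv())
obs, reward, done, info = env.step(0)
print(done)  # True
```

Folding `terminated or truncated` into `done` is the simplest port, though for correct bootstrapping a DQN should ideally not treat time-limit truncation as a true terminal state.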
Thanks for offering this wonderful code, but I have a question.