I have been trying to get your A3C implementation to work for Atari games using the NIPS/Nature network.
Using the network initialization from the DQN example and a few modifications to the a3c.py file, I keep getting an error on line 282:
Exception in thread Thread-1:
Traceback (most recent call last):
File "/usr/lib/python2.7/threading.py", line 810, in __bootstrap_inner
self.run()
File "/usr/lib/python2.7/threading.py", line 763, in run
self.__target(*self.__args, **self.__kwargs)
File "a3c_atari.py", line 139, in run
agent.fit(gym.make(ENV_NAME), nb_steps=1750000, visualize=False, verbose=verbose,callbacks = callbacks)
File "/home/pavitrakumar/Desktop/keras-rl-master-a3c-files/core.py", line 125, in fit
metrics = self.backward(reward, terminal=done)
File "/home/pavitrakumar/Desktop/keras-rl-master-a3c-files/a3c.py", line 309, in backward
means, variances = self.actor_train_fn(inputs)
File "/usr/local/lib/python2.7/dist-packages/keras/backend/theano_backend.py", line 811, in __call__
return self.function(*inputs)
File "/usr/local/lib/python2.7/dist-packages/theano/compile/function_module.py", line 871, in __call__
storage_map=getattr(self.fn, 'storage_map', None))
File "/usr/local/lib/python2.7/dist-packages/theano/gof/link.py", line 314, in raise_with_op
reraise(exc_type, exc_value, exc_trace)
File "/usr/local/lib/python2.7/dist-packages/theano/compile/function_module.py", line 859, in __call__
outputs = self.fn()
ValueError: GpuElemwise. Input dimension mis-match. Input 2 (indices start at 0) has shape[3] == 32, but the output's size on that axis is 12288.
Apply node that caused the error: GpuElemwise{Composite{(i0 + (i1 * (i2 / i3)))}}[(0, 2)](GpuElemwise{Composite{(i0 * ((i1 * i2) + (i3 * i4)))}}[(0, 1)].0, CudaNdarrayConstant{[[[[ 9.99999975e-05]]]]}, Rebroadcast{?,?,0}.0, GpuFromHost.0)
Toposort index: 321
Inputs types: [CudaNdarrayType(float32, (True, True, True, False)), CudaNdarrayType(float32, (True, True, True, True)), CudaNdarrayType(float32, 4D), CudaNdarrayType(float32, (True, True, True, True))]
Inputs shapes: [(1, 1, 1, 12288), (1, 1, 1, 1), (8, 8, 1, 32), (1, 1, 1, 1)]
Inputs strides: [(0, 0, 0, 1), (0, 0, 0, 0), (256, 32, 0, 1), (0, 0, 0, 0)]
Inputs values: ['not shown', CudaNdarray([[[[ 9.99999975e-05]]]]), 'not shown', CudaNdarray([[[[ 12.]]]])]
Outputs clients: [[GpuElemwise{Composite{((i0 * i1) + (i2 * sqr((-i3))))}}[(0, 1)](GpuDimShuffle{x,x,x,x}.0, <CudaNdarrayType(float32, 4D)>, GpuDimShuffle{x,x,x,x}.0, GpuElemwise{Composite{(i0 + (i1 * (i2 / i3)))}}[(0, 2)].0), GpuElemwise{Composite{((i0 * i1) + (i2 * i3 * i4))}}[(0, 1)](GpuDimShuffle{x,x,x,x}.0, <CudaNdarrayType(float32, 4D)>, CudaNdarrayConstant{[[[[-1.]]]]}, GpuDimShuffle{x,x,x,x}.0, GpuElemwise{Composite{(i0 + (i1 * (i2 / i3)))}}[(0, 2)].0), GpuElemwise{Mul}[(0, 2)](CudaNdarrayConstant{[[[[-1.]]]]}, GpuDimShuffle{x,x,x,x}.0, GpuElemwise{Composite{(i0 + (i1 * (i2 / i3)))}}[(0, 2)].0)]]
HINT: Re-running with most Theano optimization disabled could give you a back-trace of when this node was created. This can be done with by setting the Theano flag 'optimizer=fast_compile'. If that does not work, Theano optimizations can be disabled with 'optimizer=None'.
HINT: Use the Theano flag 'exception_verbosity=high' for a debugprint and storage map footprint of this apply node.
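The two hints above can be applied without touching the code. A minimal `.theanorc` fragment (or the equivalent `THEANO_FLAGS` environment variable) enabling both would look like this; it should print a back-trace of where the failing node was created:

```ini
[global]
optimizer = fast_compile
exception_verbosity = high
```

This is the network setup I am using: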
WINDOW_LENGTH = 1
INPUT_SHAPE = (84, 84)
input_shape = (WINDOW_LENGTH,) + INPUT_SHAPE
#actor_input = Input(shape=env.observation_space.shape)
if K.image_dim_ordering() == 'tf':
    # (width, height, channels)
    actor_input = Input(shape=(input_shape[1], input_shape[2], input_shape[0]))
    critic_input = Input(shape=(input_shape[1], input_shape[2], input_shape[0]))
elif K.image_dim_ordering() == 'th':
    # (channels, width, height)
    actor_input = Input(shape=(input_shape[0], input_shape[1], input_shape[2]))
    critic_input = Input(shape=(input_shape[0], input_shape[1], input_shape[2]))
else:
    raise RuntimeError('Unknown image_dim_ordering.')
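For reference, here is a minimal standalone sketch of how the `(WINDOW_LENGTH,) + INPUT_SHAPE` tuple above is reordered for each backend ordering (pure Python, no Keras needed):

```python
WINDOW_LENGTH = 1
INPUT_SHAPE = (84, 84)
input_shape = (WINDOW_LENGTH,) + INPUT_SHAPE  # (1, 84, 84)

# 'tf' ordering expects (rows, cols, channels); 'th' expects (channels, rows, cols)
tf_shape = (input_shape[1], input_shape[2], input_shape[0])
th_shape = (input_shape[0], input_shape[1], input_shape[2])
print(tf_shape)  # (84, 84, 1)
print(th_shape)  # (1, 84, 84)
```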
x = None
x = Convolution2D(32, 8, 8, subsample=(4, 4))(actor_input)
x = Activation('relu')(x)
x = Convolution2D(64, 4, 4, subsample=(2, 2))(x)
x = Activation('relu')(x)
x = Convolution2D(64, 3, 3, subsample=(1, 1))(x)
x = Activation('relu')(x)
x = Flatten()(x)
x = Dense(256)(x)
x = Activation('relu')(x)
actor_mean_output = Dense(nb_actions)(x)
actor_mean_output = Activation('linear')(actor_mean_output)
actor_variance_output = Dense(nb_actions)(x)
actor_variance_output = Activation('softplus')(actor_variance_output)
actor = Model(input=actor_input, output=[actor_mean_output, actor_variance_output])
print(actor.summary())
#critic_input = Input(shape=env.observation_space.shape)
x = None
x = Convolution2D(32, 8, 8, subsample=(4, 4))(critic_input)
x = Activation('relu')(x)
x = Convolution2D(64, 4, 4, subsample=(2, 2))(x)
x = Activation('relu')(x)
x = Convolution2D(64, 3, 3, subsample=(1, 1))(x)
x = Activation('relu')(x)
x = Flatten()(x)
x = Dense(512)(x)
x = Activation('relu')(x)
x = Dense(1)(x)
x = Activation('linear')(x)
critic = Model(input=critic_input, output=x)
print(critic.summary())
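As a sanity check on the convolution arithmetic (independent of Keras/Theano), the per-layer output sizes for an 84x84 input with 'valid' (border_mode='valid') convolutions can be computed by hand, and they agree with the pasted summaries:

```python
def conv_out(size, kernel, stride):
    # Output size along one axis for a 'valid' (no padding) convolution
    return (size - kernel) // stride + 1

size = 84
for kernel, stride, channels in [(8, 4, 32), (4, 2, 64), (3, 1, 64)]:
    size = conv_out(size, kernel, stride)
    print((size, size, channels))  # (20, 20, 32), then (9, 9, 64), then (7, 7, 64)
print(size * size * 64)  # 3136 units after Flatten
```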
I understand that the error comes from the actor network, which is compiled as a Theano function (I am using the Theano backend) on line 165 in a3c.py. I have checked all the input dimensions, but I still cannot find out why I am getting the dimension mismatch error.
#obsv, Rs, Vs, actions (Environment: Breakout-v0 (nb_actions: 6))
[(1, 84, 84, 1), (1,), (1,), (1, 6)]
This also seems consistent with the network input (correct me if I'm wrong).
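For completeness, a minimal numpy sketch of that batch (dummy zeros; only the shapes matter here) reproduces exactly the shape list printed above:

```python
import numpy as np

# Dummy single-step batch for Breakout-v0 (nb_actions = 6)
obsv = np.zeros((1, 84, 84, 1), dtype='float32')  # observation
Rs = np.zeros((1,), dtype='float32')              # discounted return
Vs = np.zeros((1,), dtype='float32')              # value estimate
actions = np.zeros((1, 6), dtype='float32')       # per-action vector
print([a.shape for a in (obsv, Rs, Vs, actions)])
# [(1, 84, 84, 1), (1,), (1,), (1, 6)]
```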
____________________________________________________________________________________________________
Layer (type) Output Shape Param # Connected to
====================================================================================================
input_1 (InputLayer) (None, 84, 84, 1) 0
____________________________________________________________________________________________________
convolution2d_1 (Convolution2D) (None, 20, 20, 32) 2080 input_1[0][0]
____________________________________________________________________________________________________
activation_1 (Activation) (None, 20, 20, 32) 0 convolution2d_1[0][0]
____________________________________________________________________________________________________
convolution2d_2 (Convolution2D) (None, 9, 9, 64) 32832 activation_1[0][0]
____________________________________________________________________________________________________
activation_2 (Activation) (None, 9, 9, 64) 0 convolution2d_2[0][0]
____________________________________________________________________________________________________
convolution2d_3 (Convolution2D) (None, 7, 7, 64) 36928 activation_2[0][0]
____________________________________________________________________________________________________
activation_3 (Activation) (None, 7, 7, 64) 0 convolution2d_3[0][0]
____________________________________________________________________________________________________
flatten_1 (Flatten) (None, 3136) 0 activation_3[0][0]
____________________________________________________________________________________________________
dense_1 (Dense) (None, 256) 803072 flatten_1[0][0]
____________________________________________________________________________________________________
activation_4 (Activation) (None, 256) 0 dense_1[0][0]
____________________________________________________________________________________________________
dense_2 (Dense) (None, 6) 1542 activation_4[0][0]
____________________________________________________________________________________________________
dense_3 (Dense) (None, 6) 1542 activation_4[0][0]
____________________________________________________________________________________________________
activation_5 (Activation) (None, 6) 0 dense_2[0][0]
____________________________________________________________________________________________________
activation_6 (Activation) (None, 6) 0 dense_3[0][0]
====================================================================================================
Total params: 877996
____________________________________________________________________________________________________
None
____________________________________________________________________________________________________
Layer (type) Output Shape Param # Connected to
====================================================================================================
input_2 (InputLayer) (None, 84, 84, 1) 0
____________________________________________________________________________________________________
convolution2d_4 (Convolution2D) (None, 20, 20, 32) 2080 input_2[0][0]
____________________________________________________________________________________________________
activation_7 (Activation) (None, 20, 20, 32) 0 convolution2d_4[0][0]
____________________________________________________________________________________________________
convolution2d_5 (Convolution2D) (None, 9, 9, 64) 32832 activation_7[0][0]
____________________________________________________________________________________________________
activation_8 (Activation) (None, 9, 9, 64) 0 convolution2d_5[0][0]
____________________________________________________________________________________________________
convolution2d_6 (Convolution2D) (None, 7, 7, 64) 36928 activation_8[0][0]
____________________________________________________________________________________________________
activation_9 (Activation) (None, 7, 7, 64) 0 convolution2d_6[0][0]
____________________________________________________________________________________________________
flatten_2 (Flatten) (None, 3136) 0 activation_9[0][0]
____________________________________________________________________________________________________
dense_4 (Dense) (None, 512) 1606144 flatten_2[0][0]
____________________________________________________________________________________________________
activation_10 (Activation) (None, 512) 0 dense_4[0][0]
____________________________________________________________________________________________________
dense_5 (Dense) (None, 1) 513 activation_10[0][0]
____________________________________________________________________________________________________
activation_11 (Activation) (None, 1) 0 dense_5[0][0]
====================================================================================================
Total params: 1678497
____________________________________________________________________________________________________
None