sebtheiler / tutorials

134.0 134.0 107.0 605 KB

All of the code for my Medium articles

Home Page: https://medium.com/@sebastiankt9

License: MIT License

Jupyter Notebook 86.96% Python 13.04%

education educational educational-project learning learning-by-doing tutorials

tutorials's Issues

Agent learns, but not very well

Hello,
First of all, thank you for this tutorial.
I managed to get train_dqn running, but the training did not lead to a good result.

Game number: 005100 Frame number: 01981609 Average reward: 22.1 Time taken: 38.0s

Is this normal?
Do you have any suggestions?

Thanks

Problem with DCGANS tutorial

Hi,

I was trying to follow the tutorial for the DCGAN, but I have hit a problem.

When I run the line gan_output = discriminator(fake_image),
I get the error: Dimensions must be equal, but are 3 and 3072 for 'sequential_5/dense_14/MatMul' (op: 'MatMul') with input shapes: [?,3], [3072,1024].
I have checked the numbers, and they match what is defined earlier in the tutorial.

Can you help?

Cheers
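
One possible reading of this error, offered as a sketch rather than a confirmed diagnosis: the kernel shape [3072,1024] suggests the discriminator's first dense layer expects a flat vector of 3072 values, while the input it received has a last axis of size 3. That would be consistent with an unflattened 32x32 RGB image (32 * 32 * 3 = 3072) being passed in. The image size, the batch size, and the dense helper below are assumptions for illustration; NumPy stands in for the Keras layer.

```python
import numpy as np

batch = 4
# Assumed shapes: 32x32 RGB images, which is where 3072 = 32 * 32 * 3 would come from.
fake_image = np.zeros((batch, 32, 32, 3), dtype=np.float32)
# Stand-in for the weights of the discriminator's first dense layer.
dense_kernel = np.zeros((3072, 1024), dtype=np.float32)


def dense(inputs, kernel):
    """Multiply along the last axis of the input, as a dense layer does."""
    if inputs.shape[-1] != kernel.shape[0]:
        raise ValueError(
            f"Dimensions must be equal, but are {inputs.shape[-1]} and {kernel.shape[0]}"
        )
    return inputs @ kernel


# Passing the unflattened image reproduces the same kind of mismatch.
try:
    dense(fake_image, dense_kernel)
    mismatch = None
except ValueError as err:
    mismatch = str(err)

# Flattening each image to 3072 values makes the shapes line up.
flat_image = fake_image.reshape(batch, -1)
output = dense(flat_image, dense_kernel)
```

If this is what is happening, the usual remedies are to make the generator emit flat vectors or to flatten the image before the discriminator's first dense layer.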

More informative console output DQN tutorial

Dear Sebastian Theiler,

Thanks for your great DQN tutorial example. I was wondering whether it is possible to suppress the console output that looks something like this:
1/1 [==============================] - 0s 26ms/step
1/1 [==============================] - 0s 25ms/step
1/1 [==============================] - 0s 22ms/step
1/1 [==============================] - 0s 25ms/step
1/1 [==============================] - 0s 23ms/step
1/1 [==============================] - 0s 21ms/step
1/1 [==============================] - 0s 21ms/step
1/1 [==============================] - 0s 22ms/step
1/1 [==============================] - 0s 22ms/step
1/1 [==============================] - 0s 24ms/step
1/1 [==============================] - 0s 23ms/step
1/1 [==============================] - 0s 27ms/step
1/1 [==============================] - 0s 24ms/step
1/1 [==============================] - 0s 24ms/step
1/1 [==============================] - 0s 28ms/step
1/1 [==============================] - 0s 26ms/step
1/1 [==============================] - 0s 25ms/step
1/1 [==============================] - 0s 26ms/step
1/1 [==============================] - 0s 32ms/step
1/1 [==============================] - 0s 25ms/step
1/1 [==============================] - 0s 25ms/step
1/1 [==============================] - 0s 31ms/step
1/1 [==============================] - 0s 28ms/step
1/1 [==============================] - 0s 31ms/step
1/1 [==============================] - 0s 27ms/step
1/1 [==============================] - 0s 34ms/step
1/1 [==============================] - 0s 30ms/step
1/1 [==============================] - 0s 24ms/step
1/1 [==============================] - 0s 25ms/step
1/1 [==============================] - 0s 23ms/step
1/1 [==============================] - 0s 25ms/step
1/1 [==============================] - 0s 24ms/step
1/1 [==============================] - 0s 23ms/step
1/1 [==============================] - 0s 24ms/step
1/1 [==============================] - 0s 22ms/step
1/1 [==============================] - 0s 22ms/step
1/1 [==============================] -

Or could each line be made more informative? The output seems to come from Keras.

Every once in a while I see a line like

1/1 [==============================] - 0s 24ms/step
1/1 [==============================] - 0s 25ms/step
1/1 [==============================] - 0s 28ms/step
Game number: 000580 Frame number: 00104874 Average reward: 1.3 Time taken: 12.0s
1/1 [==============================] - 0s 29ms/step
1/1 [==============================] - 0s 27ms/step
1/1 [==============================] - 0s 25ms/step
1/1 [==============================] - 0s 30ms/

coming by.

Or is something going horribly wrong in my environment, and the console output during training should not look like this?
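
The "1/1 [====] - 0s 26ms/step" lines match the progress bar that Keras prints from Model.predict when its verbose argument is left at the default. Assuming the training loop calls predict once per frame (an assumption about train_dqn.py, not something confirmed here), passing verbose=0 should silence them. The helper name quiet_q_values is hypothetical.

```python
def quiet_q_values(model, state_batch):
    """Return the model's predictions without printing the Keras progress bar.

    `model` is assumed to be a Keras model; `verbose=0` is the documented
    way to turn off the per-call "1/1 [=====] - 0s ..." output.
    """
    return model.predict(state_batch, verbose=0)
```

Another option for single-batch inference is to call the model directly, as in model(state_batch, training=False), which returns a tensor and prints nothing. The "Game number" lines would be unaffected either way, since they look like the script's own logging.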

RGB image

One question regarding the replay buffer in the DQN Atari project

In the "replay buffer" class, I found the line "states = np.transpose(np.asarray(states), axes=(0, 2, 3, 1))" in get_minibatch.
My question is: why is the transpose needed here? The original axis order already looks correct, so the transpose operation seems unnecessary.

Many thanks!
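
A small sketch of what that transpose does to the axes, under the assumption that each sampled state is a stack of consecutive frames with the frame index first. The sizes (4 frames of 84x84) are illustrative and not taken from the repository.

```python
import numpy as np

batch_size, history_length, height, width = 2, 4, 84, 84  # assumed sizes

# Each sampled state: a stack of consecutive frames, frame index first.
states = [np.random.rand(history_length, height, width) for _ in range(batch_size)]

# Stacking the list gives (batch, history, height, width).
stacked = np.asarray(states)

# axes=(0, 2, 3, 1) moves the frame axis to the end:
# (batch, height, width, history), i.e. a channels-last layout.
channels_last = np.transpose(stacked, axes=(0, 2, 3, 1))
```

Under this assumption the transpose is what turns the frame stack into the channels-last input that convolutional layers expect by default. If the states were already stored as (height, width, history), the transpose would scramble the axes instead, so the answer depends on how the buffer assembles each state.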

Question regarding the dueling network architecture part

Hi, I found the code below in the network part of train_dqn.py:

###########################################################
# Split into value and advantage streams
val_stream, adv_stream = Lambda(lambda w: tf.split(w, 2, 3))(x)  # custom splitting layer
###########################################################

It looks like the output of the hidden layers is divided into two partial halves, with one half fed to the state-value stream and the other to the advantage stream. I have also checked other implementations and the paper, and it looks like each stream should receive a complete copy of the hidden layer's output rather than part of it. May I ask why you split it rather than feeding the same full output to both the state-value and advantage streams?

Many thanks!
Edward
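
To make the two variants concrete, here is a NumPy sketch with hypothetical sizes and random weights. tf.split(w, 2, 3) cuts the tensor into two equal parts along axis 3, so each stream sees half of the channels. The alternative described above passes the full tensor to both streams. The function dueling_q and all shapes are assumptions for illustration. The aggregation step is the standard dueling form Q = V + (A - mean(A)).

```python
import numpy as np

rng = np.random.default_rng(0)
batch, channels, n_actions = 2, 8, 4  # hypothetical sizes
features = rng.normal(size=(batch, 1, 1, channels))  # final conv output, channels last

# Variant A: split the channel axis in half, one half per stream.
val_in, adv_in = np.split(features, 2, axis=3)

# Variant B: give the full feature tensor to both streams.
val_in_full, adv_in_full = features, features


def dueling_q(val_features, adv_features, w_val, w_adv):
    """Combine a value stream and an advantage stream into Q-values."""
    v = val_features.reshape(len(val_features), -1) @ w_val  # (batch, 1)
    a = adv_features.reshape(len(adv_features), -1) @ w_adv  # (batch, n_actions)
    return v + (a - a.mean(axis=1, keepdims=True))


q_split = dueling_q(
    val_in, adv_in,
    rng.normal(size=(channels // 2, 1)),
    rng.normal(size=(channels // 2, n_actions)),
)
q_full = dueling_q(
    val_in_full, adv_in_full,
    rng.normal(size=(channels, 1)),
    rng.normal(size=(channels, n_actions)),
)
```

Both variants produce one Q-value per action. The difference is only in how many features each stream can draw on, and in the number of weights each stream needs.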
