openai-cartpole's Introduction

Random search, hill climbing, policy gradient for CartPole

Simple reinforcement learning algorithms implemented for CartPole on OpenAI Gym.

This code goes along with my post about learning CartPole, which is inspired by an OpenAI request for research.

## Algorithms implemented

Random search: Repeatedly sample random weights in [-1, 1] and greedily keep the best-scoring set.
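A minimal sketch of random search, not the repository's actual code. It assumes a linear policy over the four CartPole observation values and a hypothetical `evaluate(weights)` callable that runs one episode and returns its total reward; the names `linear_action`, `random_search`, and the default `target` of 200 are illustrative assumptions.

```python
import numpy as np


def linear_action(weights, observation):
    """Assumed linear policy: push right (1) if the weighted sum is positive, else left (0)."""
    return 1 if np.dot(weights, observation) > 0 else 0


def random_search(evaluate, n_params=4, n_trials=1000, target=200.0, seed=0):
    """Sample weights uniformly from [-1, 1] and keep the best-scoring set.

    `evaluate` is a hypothetical callable mapping a weight vector to a total reward.
    """
    rng = np.random.default_rng(seed)
    best_weights, best_reward = None, -np.inf
    for _ in range(n_trials):
        weights = rng.uniform(-1.0, 1.0, size=n_params)
        reward = evaluate(weights)
        if reward > best_reward:
            # Greedy step: remember the best weights seen so far.
            best_weights, best_reward = weights, reward
        if best_reward >= target:
            break  # stop early once the assumed target score is reached
    return best_weights, best_reward
```

In practice `evaluate` would step a CartPole environment with `linear_action` until the episode ends and sum the rewards.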

Hill climbing: Start from a random initialization, add a little noise every iteration, and keep the new set of weights if it improved the reward.
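Hill climbing can be sketched along the same lines. This is an illustration under assumptions, not the repository's code: `evaluate(weights)` is a hypothetical callable returning an episode's total reward, and the uniform noise with `noise_scale=0.1` is an arbitrary choice.

```python
import numpy as np


def hill_climb(evaluate, n_params=4, n_iters=1000, noise_scale=0.1, seed=0):
    """Perturb the current weights with small noise; keep the change only if reward improves.

    `evaluate` is a hypothetical callable mapping a weight vector to a total reward.
    """
    rng = np.random.default_rng(seed)
    weights = rng.uniform(-1.0, 1.0, size=n_params)  # random initialization
    best_reward = evaluate(weights)
    for _ in range(n_iters):
        # Propose a nearby candidate by adding a little noise (assumed uniform here).
        candidate = weights + noise_scale * rng.uniform(-1.0, 1.0, size=n_params)
        reward = evaluate(candidate)
        if reward > best_reward:
            weights, best_reward = candidate, reward
    return weights, best_reward
```

Because CartPole episodes are noisy, a real `evaluate` might average several episodes before comparing rewards.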

Policy gradient: Use a softmax policy and estimate a value function from discounted Monte Carlo returns. Update the policy to favor state-action pairs whose return is higher than the average return from that state. Read my post about learning CartPole for a better explanation of this.
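The update rule can be illustrated with a small NumPy sketch. It assumes a linear softmax policy with parameters `theta` of shape `(n_features, n_actions)` and takes precomputed advantages (return minus the value estimate for that state) as input; the function names and learning rate are hypothetical, and the repository's own implementation may be structured differently.

```python
import numpy as np


def softmax(logits):
    """Numerically stable softmax over the last axis."""
    shifted = logits - logits.max(axis=-1, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=-1, keepdims=True)


def policy_gradient_step(theta, states, actions, advantages, lr=0.01):
    """One gradient-ascent step on sum_t advantage_t * log pi(a_t | s_t).

    states:     (T, n_features) observations from an episode
    actions:    (T,) integer actions that were taken
    advantages: (T,) discounted return minus the value estimate for that state
    """
    probs = softmax(states @ theta)                # (T, n_actions)
    one_hot = np.eye(theta.shape[1])[actions]      # (T, n_actions)
    # For a softmax policy, d log pi(a|s) / d logits = one_hot(a) - probs.
    grad = states.T @ ((one_hot - probs) * advantages[:, None])
    return theta + lr * grad
```

Actions with a positive advantage become more likely after the step, and actions with a negative advantage become less likely.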

openai-cartpole's Issues

NoSuchDisplayException

I ran the code but got this error:
raise NoSuchDisplayException('Cannot connect to "%s"' % name)
pyglet.canvas.xlib.NoSuchDisplayException: Cannot connect to "None"
[2017-03-07 11:41:49,394] Finished writing results. You can upload them to the scoreboard via gym.upload('/home/sushuting/workspace/openai-cartpole/cartpole-hill')

discounted monte carlo calculation

Here's a cleaner way to calculate discounted rewards:

np.array([sum([gamma**t*r for t, r in enumerate(rewards[i:])]) for i in range(len(rewards))])
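The one-liner above recomputes the tail sum for every index, so its cost grows quadratically with episode length. As a possible alternative, here is a sketch that accumulates the same values in a single backward pass; the function name `discounted_returns` is an assumption for illustration.

```python
import numpy as np


def discounted_returns(rewards, gamma):
    """Return G_i = sum_t gamma**t * rewards[i + t] for every index i, in one backward pass."""
    returns = np.zeros(len(rewards), dtype=float)
    running = 0.0
    for i in reversed(range(len(rewards))):
        # G_i = r_i + gamma * G_{i+1}
        running = rewards[i] + gamma * running
        returns[i] = running
    return returns
```

For example, `discounted_returns([1, 1, 1], 0.5)` gives `[1.75, 1.5, 1.0]`, matching the list-comprehension version.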

Nice tutorial!

Hi Kevin,

Very nice tutorial and clean code. One very small nit to do with as you please: in the cartpole-policygradient.py at the bottom you use a variable called reward to store the cumulative return for an episode. In the run_episode you use the more correct variable name totalreward. I suggest changing reward to return for the sake of accuracy. Corresponding issues exist for the random policy and hill-climbing. This is a minor issue though. Great job!

Cheers,
Zack

kvfrans / openai-cartpole
