An experimentation framework for Reinforcement Learning using OpenAI Gym, TensorFlow, and Keras.

License: MIT License

Python 94.31% JavaScript 3.21% Shell 2.48%

openai_lab's Introduction

OpenAI Lab

NOTICE: Please use the next version, SLM-Lab.

OpenAI Lab was created to do Reinforcement Learning (RL) like science: theorize, then experiment. It provides an easy interface to OpenAI Gym and Keras, with an automated experimentation and evaluation framework.

Features

Unified RL environment and agent interface using OpenAI Gym, TensorFlow, and Keras, so you can focus on developing the algorithms.
Implementations of core RL algorithms, with reusable modular components for developing deep RL algorithms.
An experimentation framework for running hundreds of hyperparameter-optimization trials, with logs, plots, and analytics for testing new RL algorithms. Experimental settings are stored in standardized JSONs for reproducibility and comparisons.
Automated analytics of the experiments for evaluating the RL agents and environments, and for helping pick the best solution.
The Fitness Matrix, a table of the best scores of RL algorithms vs. the environments; useful for research.
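
As a rough illustration of the "settings stored in JSON, many trials per experiment" idea above, here is a minimal sketch in Python. The spec layout and every key in it (experiment, problem, param, param_range) are hypothetical and chosen only for this example; they are not taken from the Lab's actual spec files, and expand_trials is not a Lab function.

```python
import itertools
import json

# Hypothetical experiment spec: fixed settings plus ranges to search over.
# All keys and values here are illustrative assumptions.
spec = {
    "experiment": "dqn_example",
    "problem": "CartPole-v0",
    "param": {"gamma": 0.99},
    "param_range": {"lr": [0.001, 0.01], "hidden_size": [16, 32, 64]},
}


def expand_trials(spec):
    """Expand a spec into one settings dict per grid-search trial."""
    names = sorted(spec["param_range"])
    grids = [spec["param_range"][name] for name in names]
    trials = []
    for trial_id, combo in enumerate(itertools.product(*grids)):
        params = dict(spec["param"])      # start from the fixed settings
        params.update(zip(names, combo))  # overlay this trial's grid point
        trials.append(
            {"trial_id": trial_id, "problem": spec["problem"], "param": params}
        )
    return trials


trials = expand_trials(spec)
# Each trial serializes to JSON, so a run can be stored and repeated later.
print(json.dumps(trials[0], indent=2, sort_keys=True))
```

Because every trial is a plain dict, the exact settings behind any result can be written next to its logs and compared across runs.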

With OpenAI Lab, we can focus on researching the essential elements of reinforcement learning such as the algorithm, policy, memory, and parameter tuning. It allows us to build agents efficiently by combining existing components with implementations of new research ideas. We can then test research hypotheses systematically by running experiments.

Read more about the research problems the Lab addresses in Motivations. Ultimately, the Lab is a generalized framework for doing reinforcement learning, agnostic of OpenAI Gym and Keras. For example, PyTorch-based implementations are on the roadmap.

Implemented Algorithms

A list of the core RL algorithms implemented or planned.

To see their scores against OpenAI Gym environments, go to Fitness Matrix.

algorithm	implementation	eval score (pending)
DQN	DQN	-
Double DQN	DoubleDQN	-
Dueling DQN	-	-
Sarsa	DeepSarsa	-
Off-Policy Sarsa	OffPolicySarsa	-
PER (Prioritized Experience Replay)	PrioritizedExperienceReplay	-
CEM (Cross Entropy Method)	next	-
REINFORCE	-	-
DPG (Deterministic Policy Gradient) off-policy actor-critic	ActorCritic	-
DDPG (Deep-DPG) actor-critic with target networks	DDPG	-
A3C (asynchronous advantage actor-critic)	-	-
Dyna	next	-
TRPO	-	-
Q*(lambda)	-	-
Retrace(lambda)	-	-
Neural Episodic Control (NEC)	-	-
EWC (Elastic Weight Consolidation)	-	-

Run the Lab

Next, see Installation and jump to Quickstart.

Timelapse of OpenAI Lab, solving CartPole-v0.

openai_lab's Issues

matplotlib backend

When I run "python3 main.py -e dqn_epsilon" I can see the animation, but I don't get the graphs. I get the following error:

[2017-04-06 19:52:04,209] ERROR: Error in trial, terminating further session from dqn_epsilon-2017_04_06_195200_t0_s0
Traceback (most recent call last):
File "/Users/enzo/Desktop/openai_lab-master/rl/experiment.py", line 262, in run
self.run_episode()
File "/Users/enzo/Desktop/openai_lab-master/rl/experiment.py", line 243, in run_episode
self.update_history()
File "/Users/enzo/Desktop/openai_lab-master/rl/experiment.py", line 210, in update_history
self.grapher.plot()
File "/Users/enzo/Desktop/openai_lab-master/rl/analytics.py", line 119, in plot
p1.set_ydata(sys_vars['total_rewards_history'])
File "/usr/local/lib/python3.5/site-packages/matplotlib/lines.py", line 1251, in set_ydata
self.stale = True
File "/usr/local/lib/python3.5/site-packages/matplotlib/artist.py", line 279, in stale
self.stale_callback(self, val)
File "/usr/local/lib/python3.5/site-packages/matplotlib/artist.py", line 76, in _stale_axes_callback
self.axes.stale = val
File "/usr/local/lib/python3.5/site-packages/matplotlib/artist.py", line 279, in stale
self.stale_callback(self, val)
File "/usr/local/lib/python3.5/site-packages/matplotlib/figure.py", line 56, in _stale_figure_callback
self.figure.stale = val
File "/usr/local/lib/python3.5/site-packages/matplotlib/artist.py", line 279, in stale
self.stale_callback(self, val)
File "/usr/local/lib/python3.5/site-packages/matplotlib/pyplot.py", line 576, in _auto_draw_if_interactive
fig.canvas.draw_idle()
File "/usr/local/lib/python3.5/site-packages/matplotlib/backend_bases.py", line 2032, in draw_idle
self.draw(*args, **kwargs)
File "/usr/local/Cellar/python3/3.5.2_3/Frameworks/Python.framework/Versions/3.5/lib/python3.5/contextlib.py", line 77, in __exit__
self.gen.throw(type, value, traceback)
File "/usr/local/lib/python3.5/site-packages/matplotlib/backend_bases.py", line 1706, in _idle_draw_cntx
yield
File "/usr/local/lib/python3.5/site-packages/matplotlib/backend_bases.py", line 2032, in draw_idle
self.draw(*args, **kwargs)
File "/usr/local/lib/python3.5/site-packages/matplotlib/backends/backend_agg.py", line 464, in draw
self.figure.draw(self.renderer)
File "/usr/local/lib/python3.5/site-packages/matplotlib/artist.py", line 63, in draw_wrapper
draw(artist, renderer, *args, **kwargs)
File "/usr/local/lib/python3.5/site-packages/matplotlib/figure.py", line 1143, in draw
renderer, self, dsu, self.suppressComposite)
File "/usr/local/lib/python3.5/site-packages/matplotlib/image.py", line 139, in _draw_list_compositing_images
a.draw(renderer)
File "/usr/local/lib/python3.5/site-packages/matplotlib/artist.py", line 63, in draw_wrapper
draw(artist, renderer, *args, **kwargs)
File "/usr/local/lib/python3.5/site-packages/matplotlib/axes/_base.py", line 2409, in draw
mimage._draw_list_compositing_images(renderer, self, dsu)
File "/usr/local/lib/python3.5/site-packages/matplotlib/image.py", line 139, in _draw_list_compositing_images
a.draw(renderer)
File "/usr/local/lib/python3.5/site-packages/matplotlib/artist.py", line 63, in draw_wrapper
draw(artist, renderer, *args, **kwargs)
File "/usr/local/lib/python3.5/site-packages/matplotlib/lines.py", line 776, in draw
self.recache()
File "/usr/local/lib/python3.5/site-packages/matplotlib/lines.py", line 696, in recache
raise RuntimeError('xdata and ydata must be the same length')
RuntimeError: xdata and ydata must be the same length
[2017-04-06 19:52:04,216] INFO:

End Session #0/1 of Trial #0/1 on PID 15881:
dqn_epsilon-2017_04_06_195200_t0_s0

[2017-04-06 19:52:04,348] INFO: Session complete, data saved to ./data/dqn_epsilon-2017_04_06_195200/dqn_epsilon-2017_04_06_195200_t0.json
[2017-04-06 19:52:04,414] INFO:

End Trial #0/1 on PID 15881:
dqn_epsilon-2017_04_06_195200_t0

[2017-04-06 19:52:04,416] INFO: Session complete, data saved to ./data/dqn_epsilon-2017_04_06_195200/dqn_epsilon-2017_04_06_195200_t0.json
Traceback (most recent call last):
File "main.py", line 5, in <module>
run(args.experiment, **vars(args))
File "/Users/enzo/Desktop/openai_lab-master/rl/experiment.py", line 472, in run
return analyze_data(experiment_data)
File "/Users/enzo/Desktop/openai_lab-master/rl/analytics.py", line 447, in analyze_data
stats_df = raw_stats_df[STATS_COLS]
File "/usr/local/lib/python3.5/site-packages/pandas/core/frame.py", line 2053, in __getitem__
return self._getitem_array(key)
File "/usr/local/lib/python3.5/site-packages/pandas/core/frame.py", line 2097, in _getitem_array
indexer = self.ix._convert_to_indexer(key, axis=1)
File "/usr/local/lib/python3.5/site-packages/pandas/core/indexing.py", line 1230, in _convert_to_indexer
raise KeyError('%s not in index' % objarr[mask])
KeyError: "['best_session_epi' 'best_session_id' 'best_session_mean_rewards'\n 'best_session_stability' 'fitness_score' 'mean_rewards_per_epi_stats_mean'\n 'mean_rewards_stats_mean' 'mean_rewards_stats_max' 'epi_stats_mean'\n 'epi_stats_min' 'solved_ratio_of_sessions' 'max_total_rewards_stats_mean'\n 'trial_id'] not in index"
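
For context on the first error: matplotlib raises "xdata and ydata must be the same length" when a line is redrawn while its x and y arrays differ in length. In the traceback, the redraw is triggered right after p1.set_ydata(...), so the x data may still have the old length at that point. A possible workaround, sketched below, is to update both arrays in a single set_data call. The helper name update_line is hypothetical and not part of the Lab; it assumes the x axis is just the episode index.

```python
def update_line(line, ys):
    """Set x and y on a matplotlib Line2D together so their lengths match.

    `line` is expected to offer set_data(x, y), as matplotlib's Line2D does.
    The x values are assumed to be 0..len(ys)-1 (an episode index).
    """
    ys = list(ys)
    xs = list(range(len(ys)))
    # One call updates both arrays, so a redraw never sees mismatched lengths.
    line.set_data(xs, ys)
    return xs, ys
```

Under these assumptions, the call p1.set_ydata(sys_vars['total_rewards_history']) in rl/analytics.py would become update_line(p1, sys_vars['total_rewards_history']). The later KeyError looks like a consequence of the session terminating early, so it may go away once the plotting error is resolved.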

Install issue using anaconda

Hi,

I tried installing openai_lab in an Anaconda environment using bin/setup.py, but it did not work because the setup script runs sudo python3, which invokes the system Python rather than the Anaconda Python.

In the end I was able to run openai_lab by installing all the dependencies by hand, but I think it would be better if the setup script did not require sudo, so that a user without superuser privileges can install openai_lab and the installation does not interfere with the system.
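
One way a setup script could avoid this, sketched here as a suggestion rather than as what the Lab's script does, is to invoke pip through the interpreter that is currently running. Inside an activated Anaconda environment, sys.executable points at that environment's Python, so no sudo is needed. The helper name pip_install_command and the package names are illustrative assumptions.

```python
import sys


def pip_install_command(packages, user=False):
    """Build a pip command bound to the currently running interpreter."""
    cmd = [sys.executable, "-m", "pip", "install"]
    if user:
        # Per-user install; usually unnecessary inside a conda environment.
        cmd.append("--user")
    return cmd + list(packages)


# Example package names only; the real dependency list lives in the repo.
print(pip_install_command(["gym", "keras"]))
```

The resulting list can be passed to subprocess.check_call to perform the installation with whichever Python launched the script.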
