Tonic



Welcome to the Tonic RL library!

Please take a look at the paper for details and results.

The main design principles are:

  • Modularity: Building blocks for creating RL agents, such as models, replays, or exploration strategies, are implemented as configurable modules.

  • Readability: Agents are written in a simple way with an identical API and logs are nicely displayed on the terminal with a progress bar.

  • Fair comparison: The training pipeline is unique and compatible with all Tonic agents and environments. Agents are defined by their core ideas while general tricks/improvements like non-terminal timeouts, observation normalization and action scaling are shared.

  • Benchmarking: Benchmark data of the provided agents trained on 70 continuous control environments are provided for direct comparison.

  • Wrapped popular environments: Environments from OpenAI Gym, PyBullet and DeepMind Control Suite are made compatible with non-terminal timeouts and synchronous distributed training.

  • Compatibility with different ML frameworks: Both TensorFlow 2 and PyTorch are currently supported. Simply import tonic.tensorflow or tonic.torch.

  • Experimenting from the console: While launch scripts can be used, iterating over various configurations from a console is made possible using snippets of Python code directly.

  • Visualization of trained agents: Experiment configurations and checkpoints can be loaded to play with trained agents.

  • Collection of trained models: To keep the main Tonic repository light, the full logs and trained models from the benchmark are available in the tonic_data repository.

Instructions

Install from source

Download and install Tonic:

git clone https://github.com/fabiopardo/tonic.git
pip install -e tonic/

Install TensorFlow or PyTorch, for example using:

pip install tensorflow torch

Launch experiments

Use TensorFlow or PyTorch to train an agent, for example using:

python3 -m tonic.train \
--header 'import tonic.torch' \
--agent 'tonic.torch.agents.PPO()' \
--environment 'tonic.environments.Gym("BipedalWalker-v3")' \
--name PPO-X \
--seed 0

Snippets of Python code are used to directly configure the experiment. This is a powerful feature: agents and environments can be configured with various arguments, and custom modules can even be loaded without adding them to the library. For example:

python3 -m tonic.train \
--header "import sys; sys.path.append('path/to/custom'); from custom import CustomAgent" \
--agent "CustomAgent()" \
--environment "tonic.environments.Bullet('AntBulletEnv-v0')" \
--seed 0
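
The configuration mechanism described above can be pictured with a small sketch. This shows only the general idea (run a header, then evaluate the agent and environment expressions in the resulting namespace) and is not Tonic's actual implementation; `build_from_snippets` and `DummyAgent` are hypothetical names used purely for illustration.

```python
def build_from_snippets(header, agent, environment):
    """Run `header`, then evaluate both expressions in its namespace."""
    namespace = {}
    # The header typically imports modules or defines helpers.
    exec(header, namespace)
    # Each snippet is an ordinary Python expression that builds an object.
    return eval(agent, namespace), eval(environment, namespace)


# Hypothetical header defining a stand-in agent class.
header = (
    "class DummyAgent:\n"
    "    def __init__(self, learning_rate=1e-3):\n"
    "        self.learning_rate = learning_rate\n"
)
agent, environment = build_from_snippets(
    header,
    agent="DummyAgent(learning_rate=3e-4)",
    environment="dict(name='AntBulletEnv-v0')",
)
print(agent.learning_rate, environment["name"])
```

Because the snippets are executed as code, this style of configuration should only be used with strings you trust.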

By default, environments use non-terminal timeouts, which is particularly important for locomotion tasks. Alternatively, timeouts can be made terminal and a time feature added to the observations to keep the MDP Markovian. See the Time Limits in RL paper for more details. For example:

python3 -m tonic.train \
--header 'import tonic.tensorflow' \
--agent 'tonic.tensorflow.agents.PPO()' \
--environment 'tonic.environments.Gym("Reacher-v2", terminal_timeouts=True, time_feature=True)' \
--seed 0
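
The difference between the two timeout modes can be sketched in a few lines. This is a simplified illustration under stated assumptions, not Tonic's code: `td_target` and `add_time_feature` are hypothetical helpers, and the time feature is assumed here to be the elapsed fraction of the episode, which may differ from the encoding Tonic uses.

```python
def td_target(reward, next_value, discount, terminated):
    """One-step target; bootstrapping stops only on true termination."""
    return reward + (0.0 if terminated else discount * next_value)


def add_time_feature(observation, step, max_steps):
    """Append the elapsed fraction of the episode to the observation."""
    return list(observation) + [step / max_steps]


# Non-terminal timeout: the episode was cut by the time limit, so the
# target still bootstraps from the value of the next state.
print(td_target(1.0, next_value=10.0, discount=0.5, terminated=False))

# Terminal timeout: the time limit is treated as a real termination, and
# the time feature lets the agent see how close the limit is.
print(td_target(1.0, next_value=10.0, discount=0.5, terminated=True))
print(add_time_feature([0.5, -0.25], step=50, max_steps=200))
```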

Distributed training can be used to accelerate learning. In Tonic, groups of sequential workers can be launched in parallel processes using for example:

python3 -m tonic.train \
--header "import tonic.tensorflow" \
--agent "tonic.tensorflow.agents.PPO()" \
--environment "tonic.environments.Gym('HalfCheetah-v3')" \
--parallel 10 --sequential 100 \
--seed 0
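
To make the arithmetic concrete, the sketch below lays out the workers for the command above. It assumes that `--parallel` is the number of processes and `--sequential` is the number of workers stepped one after another inside each process; `worker_layout` is a hypothetical helper, not part of Tonic.

```python
def worker_layout(parallel, sequential):
    """Assign worker indices to processes, `sequential` per process."""
    return [
        list(range(p * sequential, (p + 1) * sequential))
        for p in range(parallel)
    ]


layout = worker_layout(parallel=10, sequential=100)
# Number of processes, workers per process, total workers.
print(len(layout), len(layout[0]), sum(len(group) for group in layout))
```

Under this reading, the example runs 10 processes of 100 workers each, for 1000 workers in total.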

Plot results

During training, the experiment configuration, logs and checkpoints are saved in environment/agent/seed/.

Results can be plotted with:

python3 -m tonic.plot --path BipedalWalker-v3/ --baselines all

Regular expressions like BipedalWalker-v3/PPO-X/0, BipedalWalker-v3/{PPO*,DDPG*} or *Bullet* can be used to point to different sets of logs. The --baselines argument can be used to load logs from the benchmark. For example --baselines all uses all agents while --baselines A2C PPO TRPO will use logs from A2C, PPO and TRPO.
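
As an illustration of how such patterns select log directories, here is a small sketch using wildcard matching from the Python standard library. The directory names are made up for the example, `select` and `split_log_dir` are hypothetical helpers, and how `tonic.plot` resolves paths internally is not shown here. Note that in shells such as bash, braces like `{PPO*,DDPG*}` are typically expanded by the shell before the program receives its arguments.

```python
from fnmatch import fnmatchcase

# Hypothetical log directories following environment/agent/seed.
log_dirs = [
    "BipedalWalker-v3/PPO-X/0",
    "BipedalWalker-v3/DDPG/0",
    "AntBulletEnv-v0/PPO/0",
    "HalfCheetah-v3/SAC/1",
]


def select(pattern):
    """Return the log directories matching a wildcard pattern."""
    return [d for d in log_dirs if fnmatchcase(d, pattern)]


def split_log_dir(path):
    """Split environment/agent/seed into its three components."""
    environment, agent, seed = path.split("/")
    return environment, agent, int(seed)


print(select("*Bullet*"))
print(select("BipedalWalker-v3/PPO*/*"))
print(split_log_dir("BipedalWalker-v3/PPO-X/0"))
```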

Different headers can be used for the x and y axes. For example, to compare the gain in wall clock time of using distributed training, replace --parallel 10 with --parallel 5 in the last training example and plot the result with:

python3 -m tonic.plot --path HalfCheetah-v3/ --x_axis train/seconds --x_label Seconds

Play with trained models

After some training time, checkpoints are generated and can be used to play with the trained agent:

python3 -m tonic.play --path BipedalWalker-v3/PPO-X/0

Environments are rendered using the appropriate framework. For example, when playing with DeepMind Control Suite environments, policies are loaded in a dm_control.viewer where Space is used to start the interaction, Backspace is used to start a new episode, [ and ] are used to switch cameras and double click on a body part followed by Ctrl + mouse clicks is used to add perturbations.

Play with models from tonic_data

The tonic_data repository can be downloaded with:

git clone https://github.com/fabiopardo/tonic_data.git

The best seed for each agent is stored in environment/agent and can be reloaded using for example:

python3 -m tonic.play --path tonic_data/tensorflow/humanoid-stand/TD3


The full benchmark plots are available here.

They can be generated with:

python3 -m tonic.plot \
--baselines all \
--backend agg --columns 7 --font_size 17 --legend_font_size 30 --legend_marker_size 20 \
--name benchmark

Or:

python3 -m tonic.plot \
--path tonic_data/tensorflow \
--backend agg --columns 7 --font_size 17 --legend_font_size 30 --legend_marker_size 20 \
--name benchmark

And a selection can be generated with:

python3 -m tonic.plot \
--path tonic_data/tensorflow/{AntBulletEnv-v0,BipedalWalker-v3,finger-turn_hard,fish-swim,HalfCheetah-v3,HopperBulletEnv-v0,Humanoid-v3,quadruped-walk,swimmer-swimmer15,Walker2d-v3} \
--backend agg --columns 5 --font_size 20 --legend_font_size 30 --legend_marker_size 20 \
--name selection


Credit

Other code bases

Tonic was inspired by a number of other deep RL code bases, in particular OpenAI Baselines, Spinning Up in Deep RL and Acme.

Citing Tonic

If you use Tonic in your research, please cite the paper:

@article{pardo2020tonic,
    title={Tonic: A Deep Reinforcement Learning Library for Fast Prototyping and Benchmarking},
    author={Pardo, Fabio},
    journal={arXiv preprint arXiv:2011.07537},
    year={2020}
}


tonic's Issues

Win 10 compatibility issue with gym box (Comment Not an ISSUE)

Unfortunately, for some reason, on Windows 10 there is an assertion error for all the environments, even the provided ones. This seems to be a Windows 10 compatibility problem or a version issue with NumPy or another package, because simply commenting out the following line in gym\spaces\box.py resolves the problem.

assert np.isscalar(low) and np.isscalar(high)

For example, without removing that line, the following command does not run. After commenting it out, it works.

python -m tonic.train --header "import tonic.torch" --agent "tonic.torch.agents.PPO()" --environment "tonic.environments.Gym('BipedalWalker-v2')" --name PPO-X --seed 0

Hopefully, this helps others with the same problem.

Training using reference data

Hi

Is it possible to use pre-recorded demonstration data for the training? If yes, which algorithms support this feature?

Thanks

training custom envs

Hello, thanks for the very nice library with baseline RL implementations that doesn't require loading TensorFlow :)
I have a question: how do I load a custom environment? Is a wrapper required?

Minimal plotting example returning error

I ran the minimal working example for BipedalWalker and when I ran the plot script, I got this trace:

(base) samlerman@Marvin vcpkg % python3 -m tonic.plot --path BipedalWalker-v3/ --baselines all
Loading data...
BipedalWalker-v3 PPO-X
Loading TensorFlow baselines...
Path: /Users/samlerman/Code/Libraries/tonic/data/logs/tensorflow_logs.pkl
BipedalWalker-v3 A2C
BipedalWalker-v3 DDPG
BipedalWalker-v3 MPO
BipedalWalker-v3 PPO
BipedalWalker-v3 SAC
BipedalWalker-v3 TD3
BipedalWalker-v3 TRPO
Plotting...
Traceback (most recent call last):
  File "/Users/samlerman/code/miniforge3/lib/python3.9/runpy.py", line 197, in _run_module_as_main
    return _run_code(code, main_globals, None,
  File "/Users/samlerman/code/miniforge3/lib/python3.9/runpy.py", line 87, in _run_code
    exec(code, run_globals)
  File "/Users/samlerman/Code/Libraries/tonic/tonic/plot.py", line 469, in <module>
    fig = plot(**vars(args), fig=None)
  File "/Users/samlerman/Code/Libraries/tonic/tonic/plot.py", line 399, in plot
    target_width = h_packer.get_extent(renderer)[0]
  File "/Users/samlerman/code/miniforge3/lib/python3.9/site-packages/matplotlib/offsetbox.py", line 341, in get_extent
    w, h, xd, yd, offsets = self.get_extent_offsets(renderer)
  File "/Users/samlerman/code/miniforge3/lib/python3.9/site-packages/matplotlib/offsetbox.py", line 540, in get_extent_offsets
    dpicor = renderer.points_to_pixels(1.)
AttributeError: 'NoneType' object has no attribute 'points_to_pixels'

[Feedback] Please consider using Hydra

Hello @fabiopardo :)

Thank you for open-sourcing tonic. It looks pretty interesting and useful. I have one feedback:

While I can understand the motivation for using snippets of Python code to configure the experiment, specifying these things via cmd makes the user experience much less pleasant. May I suggest you consider using Hydra? I think it will improve the user experience and make the development workflow easier. This blog post is a pretty good intro to Hydra.

No support/benchmark results for Atari games

It seems the current version doesn't support games like Breakout. The following command raises an error:

python -m tonic.train --header "import tonic.torch" --agent "tonic.torch.agents.PPO()" --environment "tonic.environments.Gym('Breakout-v0')" --name PPO-X --seed 0

AssertionError: assert isinstance(env.action_space, gym.spaces.Box)

Is that supposed to happen, or is the code ready to use for environments other than the ones for which you've provided benchmarks? BTW, I think the benchmark idea for fast comparison is really cool. Thanks for sharing!

Plotting bug

Hi, renderer from plot.py kept returning None and as a result I kept getting this error:

AttributeError: 'NoneType' object has no attribute 'points_to_pixels'

To fix it, I just commented out these lines:

        renderer = legend_ax.get_renderer_cache()
        h_packer = legend.get_children()[0].get_children()[1]
        target_width = h_packer.get_extent(renderer)[0]
        current_width = sum(
            [ch.get_extent(renderer)[0] for ch in h_packer.get_children()])
        if target_width > 1.3 * current_width:
            break

Then plotting seemed to work fine, but may I ask, what do those lines do? The figure looks fine without them.
