Giter Club home page Giter Club logo

flare's Introduction

Reinforcement Learning with Latent Flow

Official codebase for Reinforcement Learning with Latent Flow (Flare). This codebase was originally forked from Reinforcement Learning with Augmented Data (RAD).

BibTex

If you use our implementations, please cite:

@unpublished{shang_wang2020flare,
  title={Reinforcement Learning with Latent Flow},
  author={Shang, Wenling and Wang, Xiaofei and Srinivas, Aravind and Rajeswaran, Aravind and Gao, Yang and Abbeel, Pieter and Laskin, Michael},
  booktitle={Neurips Deep Reinforcement Learning Workshop 2020}
}

Installation

All of the dependencies are in the conda_env.yml file. They can be installed manually or with the following command:

conda env create -f conda_env.yml

Main differences between RAD repo and Flare repo

  • For image-based observations, the new Flare architecture is added. For algorithm details, see Flare paper.
  • Training from state-based observations is enabled.
  • The replay buffer for image-based observations is improved, allowing a larger buffer capacity.
  • Training can be performed on multiple GPUs, allowing larger batch size and faster training.

Examples

  • To train a RAD agent on the quadruped walk task from image-based observations with translation augmentation on a single GPU, run bash script/1_quadruped_translate.sh from the root of this directory.
  • To train a Flare-RAD agent on the quadruped walk task from image-based observations with translation augmentation on a single GPU, run bash script/2_quadruped_translate_flare.sh from the root of this directory.
  • To train a SAC agent on the humanoid stand task from state-based observations on a single GPU, run bash script/3_humanoid_state.sh from the root of this directory.
  • To train a RAD agent on the cheetah run task from image-based observations on multiple GPUs, run bash script/4_cheetah_translate_ddp.sh from the root of this directory.

Logging

In your console, you should see printouts that look like this:

| train | E: 13 | S: 2000 | D: 9.1 s | R: 48.3056 | BR: 0.8279 | A_LOSS: -3.6559 | CR_LOSS: 2.7563
| train | E: 17 | S: 2500 | D: 9.1 s | R: 146.5945 | BR: 0.9066 | A_LOSS: -5.8576 | CR_LOSS: 6.0176
| train | E: 21 | S: 3000 | D: 7.7 s | R: 138.7537 | BR: 1.0354 | A_LOSS: -7.8795 | CR_LOSS: 7.3928
| train | E: 25 | S: 3500 | D: 9.0 s | R: 181.5103 | BR: 1.0764 | A_LOSS: -10.9712 | CR_LOSS: 8.8753
| train | E: 29 | S: 4000 | D: 8.9 s | R: 240.6485 | BR: 1.2042 | A_LOSS: -13.8537 | CR_LOSS: 9.4001

The above output decodes as:

train - training episode
E - total number of episodes 
S - total number of environment steps
D - duration in seconds to train 1 episode
R - episode reward
BR - average reward of sampled batch
A_LOSS - average loss of actor
CR_LOSS - average loss of critic

All data related to the run is stored in the specified working_dir. To enable model or video saving, use the --save_model or --save_video flags. For all available flags, inspect train.py. To visualize progress with tensorboard run:

tensorboard --logdir log --port 6006

and go to localhost:6006 in your browser. If you're running headlessly, try port forwarding with ssh.

flare's People

Contributors

wendyshang avatar

Watchers

 avatar  avatar

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    ๐Ÿ–– Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. ๐Ÿ“Š๐Ÿ“ˆ๐ŸŽ‰

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google โค๏ธ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.